OpenAI’s ChatGPT has been under scrutiny for its legal research capabilities, and a recent test by The Verge highlights both its strengths and limitations. Senior tech and policy editor Adi Robertson put the chatbot’s new “deep research” feature to the test, tasking it with summarizing recent court rulings related to Section 230 of the Communications Decency Act. The goal was to evaluate how well AI could handle one of the most frequently misunderstood laws on the internet.
The results were mixed. ChatGPT correctly identified and summarized real legal cases, avoiding the common AI pitfall of fabricating citations. The output was well formatted, with footnotes and explanations that clarified the research process. However, the chatbot failed to capture the full scope of legal developments from the past five years, omitting crucial rulings from 2024. Legal expert Eric Goldman, who reviewed the report, noted that while the chatbot's case selection was reasonable, its omissions skewed the overall picture.
Goldman also pointed out that ChatGPT struggled to account for broader trends influencing Section 230 litigation, such as shifting judicial attitudes and political pressure on tech companies. While the bot delivered technically accurate summaries, it lacked the deeper contextual analysis that human experts provide.
Interestingly, other users at The Verge encountered similar gaps in their reports but were able to correct them by explicitly requesting data from 2024. This raises questions about how ChatGPT prioritizes recent information, given that OpenAI promotes deep research as a tool capable of accessing up-to-date web sources. The inconsistencies suggest that while the system is advancing, it still requires careful oversight and refinement.
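The workaround those users found amounts to explicit prompt scoping: stating the time window outright rather than trusting the model to surface the most recent material on its own. As a rough illustration only, here is a minimal sketch of that kind of date-scoped prompt written against the OpenAI Python SDK; the model name and prompt wording are assumptions, not details from The Verge's test, which ran inside the ChatGPT deep research interface itself.

```python
# Minimal sketch of the date-scoping workaround described above, shown via the
# OpenAI Python SDK rather than the ChatGPT interface used in the test. The
# model name and prompt text are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Spell out the time window explicitly, since the test suggests the model
# may otherwise under-weight the most recent rulings.
prompt = (
    "Summarize notable U.S. court rulings on Section 230 of the "
    "Communications Decency Act from 2020 through 2024. "
    "Explicitly include decisions issued in 2024, and cite each case "
    "by name, court, and year so the citations can be verified."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model choice for this sketch
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```

The design point is simply that an explicit cutoff ("through 2024, explicitly including 2024") leaves the model no room to quietly truncate its coverage window, which is what appears to have happened in the original report.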
For legal professionals and researchers, AI-driven tools like ChatGPT’s deep research feature may offer valuable starting points, but they aren’t yet reliable substitutes for expert analysis. OpenAI acknowledges that hallucination issues persist, and as this test shows, even when the bot avoids outright misinformation, its selective omissions can be just as misleading.
As AI research tools continue to evolve, their ability to balance accuracy, completeness, and contextual awareness will be critical. For now, users should approach them with caution: useful for preliminary research, but not a replacement for human expertise.