We all know that downloading books without paying for them is illegal. But what if an AI model would simply “answer” you with nearly half of a very popular story? Is that still illegal? Well, this is what is happening as you’re reading this. The AI world is facing criticism and backlash after researchers found that Meta’s Llama 3.1 70B model can provide you with over 40% of “Harry Potter and the Sorcerer’s Stone” nearly word for word! And now Meta might be in trouble. Keep on reading to discover more about it.
This level of memorization is far beyond what experts had assumed large language models were capable of. Earlier models typically retained only fragments of copyrighted text. But Llama 3.1 can output full pages nearly verbatim, raising urgent questions about how it was trained and whether that training process crossed legal boundaries.
Pattern Learning or Plain Copying?
For years, AI companies have claimed that their models don’t “remember” exact text, but rather learn abstract patterns from vast datasets. This study throws that narrative into serious doubt. The researchers fed the Llama model carefully crafted prompts and were able to retrieve huge chunks of Harry Potter, sometimes with only a few words missing. The model’s recall rate, especially compared to earlier LLMs like GPT-3 or PaLM, is staggering, and potentially damning.
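To see what this kind of test looks like in practice, here is a minimal sketch of how memorization probing is commonly done, not the researchers’ exact method: give the model a short prefix from the book and check whether greedy decoding reproduces the true continuation. The model name, prefix length, and continuation length below are assumptions for illustration.

```python
# Minimal memorization probe: does greedy decoding of a book prefix
# recover the book's real continuation? (Illustrative sketch only.)
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-3.1-70B"  # assumed; any causal LM works here

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

def reproduces_continuation(passage: str,
                            prefix_tokens: int = 50,
                            continuation_tokens: int = 50) -> bool:
    """Return True if the model's greedy continuation of a prefix
    matches the passage's actual next tokens exactly."""
    ids = tokenizer(passage, return_tensors="pt").input_ids[0]
    prefix = ids[:prefix_tokens].unsqueeze(0).to(model.device)
    target = ids[prefix_tokens:prefix_tokens + continuation_tokens]

    output = model.generate(prefix,
                            max_new_tokens=continuation_tokens,
                            do_sample=False)  # greedy decoding, no sampling
    generated = output[0][prefix_tokens:prefix_tokens + continuation_tokens]
    return generated.tolist() == target.tolist()
```

Running a check like this over many overlapping passages of a book gives a rough recall rate; the higher the share of passages the model completes verbatim, the stronger the evidence of memorization rather than abstract pattern learning.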

If courts determine this kind of recall qualifies as copyright infringement, it could force AI companies to radically rethink how open-source models are trained, stored, and shared.
Meta AI Might Face Legal Trouble!
The implications go beyond Meta. If a model can memorize and reproduce copyrighted books in bulk, it may be seen as a derivative work, which would make distribution and usage a legal minefield. That poses a major risk to the open-source AI movement, which depends on transparent model weights and reproducibility.
- Feature image: Manuel Orbegozo/Reuters