“Meta has treated the so-called ‘public availability’ of shadow datasets as a get-out-of-jail-free card, notwithstanding that internal Meta records show every relevant decision-maker at Meta, up to and including its CEO, Mark Zuckerberg, knew LibGen was ‘a dataset we know to be pirated,’” the plaintiffs allege in this motion. (Originally filed in late 2024, the motion is a request to file a third amended complaint.)
In addition to the plaintiffs’ briefs, another filing was unredacted in response to Chhabria’s order: Meta’s opposition to the motion to file an amended complaint. It argues that the authors’ attempts to add additional claims to the case are an “eleventh-hour gambit based on a false and inflammatory premise” and denies that Meta waited to reveal crucial information in discovery. Instead, Meta argues it first revealed to the plaintiffs that it used a LibGen dataset in July 2024. (Because many of the discovery materials remain confidential, it is difficult for WIRED to confirm that claim.)
Meta’s argument hinges on its claim that the plaintiffs already knew about the LibGen use and shouldn’t be granted additional time to file a third amended complaint when they had ample time to do so before discovery ended in December 2024. “Plaintiffs knew of Meta’s downloading and use of LibGen and other alleged ‘shadow libraries’ since at least mid-July 2024,” the tech giant’s lawyers argue.
In November 2023, Chhabria granted Meta’s motion to dismiss some of the lawsuit’s claims, including the claim that Meta’s alleged use of the authors’ work to train AI violated the Digital Millennium Copyright Act, a US law introduced in 1998 to stop people from selling or duplicating copyrighted works on the internet. At the time, the judge agreed with Meta’s stance that the plaintiffs had not provided sufficient evidence to prove that the company had removed what’s known as “copyright management information,” like the author’s name and title of the work.
The unredacted documents argue that the plaintiffs should be allowed to amend their complaint, alleging that the information Meta revealed is evidence that the DMCA claim was warranted. They also say the discovery process has unearthed reasons to add new allegations. “Meta, through a corporate representative who testified on November 20, 2024, has now admitted under oath to uploading (aka ‘seeding’) pirated files containing Plaintiffs’ works on ‘torrent’ sites,” the motion alleges. (Seeding is when torrented files are then shared with other peers after they have finished downloading.)
“This torrenting activity turned Meta itself into a distributor of the very same pirated copyrighted material that it was also downloading for use in its commercially available AI models,” one of the newly unredacted documents claims, alleging that Meta, in other words, had not just used copyrighted material without permission but also disseminated it.
LibGen, an archive of books uploaded to the internet that originated in Russia around 2008, is one of the largest and most controversial “shadow libraries” in the world. In 2015, a New York judge ordered a preliminary injunction against the site, a measure designed in theory to temporarily shut the archive down, but its anonymous administrators simply switched its domain. In September 2024, a different New York judge ordered LibGen to pay $30 million to the rights holders for infringing on their copyrights, despite not knowing who actually operates the piracy hub.
Meta’s discovery woes for this case aren’t over, either. In the same order, Chhabria warned the tech giant against any overly sweeping redaction requests in the future: “If Meta again submits an unreasonably broad sealing request, all materials will simply be unsealed,” he wrote.
