Lawsuits are never exactly a lovefest, but the copyright fight between The New York Times and both OpenAI and Microsoft is getting especially contentious. This week, the Times alleged that OpenAIās engineers inadvertently erased data the paperās team spent more than 150 hours extracting as potential evidence.
OpenAI was able to recover much of the data, but the Timesā legal team says itās still missing the original file names and folder structure. According to a declaration filed to the court Wednesday by Jennifer B. Maisel, a lawyer for the newspaper, this means the information ācannot be used to determine where the news plaintiffsā copied articlesā may have been incorporated into OpenAIās artificial intelligence models.
āWe disagree with the characterizations made and will file our response soon,ā OpenAI spokesperson Jason Deutrom told WIRED in a statement. The New York Times declined to comment.
The Times filed its copyright lawsuit against OpenAI and Microsoft last year, alleging that the companies had illegally used its articles to train artificial intelligence tools like ChatGPT. The case is one of many ongoing legal battles between AI companies and publishers, including a similar lawsuit filed by the Daily News being handled by some of the same lawyers.
The Timesā case is currently in discovery, which means both sides are turning over requested documents and information that could become evidence. As part of the process, OpenAI was required by the court to show the Times its training data, which is a big dealāOpenAI has never publicly revealed exactly what information was used to build its AI models. To disclose it, OpenAI created what the court is calling a āsandboxā of two āvirtual machinesā that the Timesā lawyers could sift through. In her declaration, Maisel said that OpenAI engineers had āerasedā data organized by the Timesā team on one of these machines.
According to Maiselās filing, OpenAI acknowledged that the information had been deleted, and attempted to address the issue shortly after it was alerted to it earlier this month. But when the paper’s lawyers looked at the ārestoredā data, it was too disorganized, forcing them āto recreate their work from scratch using significant person-hours and computer processing time,ā several other Times lawyers said in a letter filed to the judge the same day as Maiselās declaration.
The lawyers noted that they had āno reason to believeā that the deletion was āintentional.ā In emails submitted as an exhibit along with Maiselās letter, OpenAI counsel Tom Gorman referred to the data erasure as a āglitch.ā