Tech

‘Hypocritical babies’: OpenAI’s myriad lawsuits flagged after accusing DeepSeek of intellectual property theft

People are delighting in the irony

Nate Wolf

Online observers are finding it difficult to muster sympathy for ChatGPT creator OpenAI and its partner Microsoft after the companies launched an investigation into whether upstart Chinese competitor DeepSeek stole data to train artificial intelligence models.

DeepSeek sent the tech industry and financial markets spiraling this month with the release of its supposedly low-cost AI model called R1. But the Financial Times reported today that OpenAI has evidence a group linked to DeepSeek may have violated its terms of service preventing users from taking “output to develop models that compete with OpenAI.”

Readers were quick to point out OpenAI’s own lengthy history of using online content to build its AI products—a practice that led dozens of authors, artists, and news publishers to sue the company for copyright infringement. 

“I’m so sorry I can’t stop laughing,” one commentator wrote on X next to screenshots from the Financial Times piece. “OpenAI, the company built on stealing literally the entire internet, is crying because DeepSeek may have trained on the outputs from ChatGPT. They’re crying their eyes out. What a bunch of hypocritical little babies.”

“oooohhh improperly obtained data you say?” one post joked, racking up more than 10,000 views. 

“Pot, I’d like to introduce you to kettle,” another user responded. 

The schadenfreude was just as strong on Reddit’s r/technology forum. 

“AI company making billions by stealing other people’s work without compensation or credit complains about having work stolen,” one commenter wrote to the tune of 16,000 upvotes. 

In one ongoing case, the New York Times, the New York Daily News, and the Center for Investigative Reporting sued OpenAI and Microsoft in federal court, alleging that the companies lifted reams of copyrighted articles without the publishers' consent and used the data to train ChatGPT.

Attorneys for the two tech companies have argued that these practices are perfectly legal.

OpenAI now faces unexpected pressure from DeepSeek, too, but some observers in the tech world say America’s AI standard bearer shouldn’t be so surprised. 

“I’m a bit confused by this – didn’t DeepSeek openly say they used synthetic data (as in LLM generated data) in their training?” one user asked on the r/OpenAI subreddit, referring to DeepSeek’s self-confessed use of data produced by existing AI models. “I kind of assumed that some of that would have been generated by OpenAI models anyway.”

Whether that practice violates OpenAI's terms of service, and whether any such violation would be legally enforceable, appears to be an open question. But how OpenAI got to this position isn't much of a mystery to people online.

As one user on Reddit put it: “OpenAI trained its model using copyrighted material, and now their results are all over the internet.”


The Daily Dot