ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data — and the Irony Is Not Lost on the Internet

Feb 21, 2025

OpenAI suspects that China's DeepSeek AI models, which are significantly cheaper than their Western counterparts, were developed using OpenAI's data. The suspicion surfaced as DeepSeek's rapid rise in popularity triggered a sharp sell-off in major AI stocks. Nvidia, whose GPUs are crucial for training and running AI models, suffered the largest single-day loss of market value in Wall Street history, shedding nearly $600 billion in market capitalization. Other tech giants, including Microsoft, Meta, and Alphabet, also posted substantial losses.

DeepSeek's R1 model, built on the open-source DeepSeek-V3, reportedly cost far less to train than Western models, with an estimated price tag of $6 million. While this figure is disputed, it has fueled investor concerns about the massive sums American tech companies are pouring into AI. A surge in downloads of the DeepSeek app further underlines the impact of this cheaper alternative.

OpenAI and Microsoft are investigating whether DeepSeek violated OpenAI's terms of service by employing "distillation," a technique in which a smaller model is trained on the outputs of a larger one so that it learns to imitate the larger model's behavior. OpenAI says Chinese companies frequently attempt to replicate leading US AI models, and that it is working with the US government to protect its intellectual property. David Sacks, President Trump's AI czar, has backed OpenAI's claim, suggesting that DeepSeek distilled knowledge from OpenAI's models.
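
For readers unfamiliar with the term, the classic textbook form of distillation trains a small "student" model to match the output probabilities of a larger "teacher" model. The Python sketch below illustrates only that general idea. The function names, the temperature value, and the sample numbers are all hypothetical, and it does not describe what DeepSeek is alleged to have done or how any OpenAI system works.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw model scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and the student's softened predictions.

    Hypothetical helper for illustration only: the value is zero when the
    student reproduces the teacher exactly and grows as the two diverge.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )


# Made-up scores over three possible answers.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
poor_student = [0.5, 4.0, 1.0]

print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

The lower the number, the more closely the student mimics the teacher. Training repeatedly adjusts the student to push this value toward zero.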

The situation highlights the irony of OpenAI's position, given its own past controversies. OpenAI has previously acknowledged relying on copyrighted material to train ChatGPT, a stance that drew wide criticism. Its claim that creating AI models like ChatGPT is impossible without copyrighted material now sits awkwardly beside its accusations against DeepSeek. That has prompted charges of hypocrisy, particularly given ongoing lawsuits from the New York Times and 17 authors alleging copyright infringement. The legal landscape surrounding AI training data and copyright remains complex and highly contested.

DeepSeek is accused of using OpenAI’s model to train its competitor using distillation. Image credit: Andrey Rudakov/Bloomberg via Getty Images.
