Sarah Fishman
Contributing Writer
A new development in the field of artificial intelligence (AI) has left Silicon Valley reeling. The Chinese start-up DeepSeek recently released a new language model called R1, and not only does it outperform many existing U.S. models, but it was also drastically cheaper to create.
What is DeepSeek, and what exactly did it just come out with?
DeepSeek is a small, private company in Hangzhou, China. Its founder, Liang Wenfeng, a former hedge fund trader, takes a unique approach to AI development. He sees confidence as crucial, prioritizing “intellectual exploration over sheer grind.” Liang employs young people to maximize technological innovation.
These avant-garde tactics have clearly paid off. Having recently reached No. 1 on Apple’s App Store, R1 has taken the tech world by storm, surpassing many U.S. models such as Google’s Gemini 2.0 Flash, Meta’s Llama 3.3-70B, and OpenAI’s GPT-4o, according to the Artificial Analysis Quality Index.
R1 is a reasoning model, a type of large language model that can break down a problem into individual steps and therefore explain its own process, a significantly more “human-like” approach than past technology. IBM Fellow Dr. Kush Varshney asserts that the ability of these models to check their own correctness indicates a kind of “meta cognition,” or awareness of their so-called thought process, which marks a big development in the field of AI.
This makes DeepSeek’s model a competitor of the prominent technological firm OpenAI’s model o1, which was released in September 2024 and boasts the same chain-of-thought process. Although OpenAI’s model remains ahead of DeepSeek’s in the Artificial Analysis Quality Index, the two are comparable, which is significant for two reasons.
1. DeepSeek’s R1 was built at a notably lower cost than OpenAI’s o1.
DeepSeek says it used a little over 2,000 Nvidia H800 GPUs (graphics processing units that “incorporate an extraordinary amount of computational capability,” used in machine learning to perform tasks like image recognition) to train the new model. The Chinese firm cited a cost of only $5.6 million, a stark difference from the costs incurred by other companies, which have “reportedly deployed 10,000 or more GPUs, and spent upwards of $100 million or more to get similar results.”
Although impressive, this figure may not be entirely accurate: While DeepSeek published the cost of R1’s final successful training run, the expense of its entire development, including all the model’s unsuccessful previous trials, is not included in that number.
Nonetheless, the dramatic difference remains significant. DeepSeek’s feat has drastically reduced the computational requirements for a reasoning model, which, according to scholar Marina Zhang, shows a new kind of innovation: “it’s basically based on algorithm optimization, using software to break through the constraints of not enough computational power.”
2. R1’s success has important geopolitical implications.
The battle between the U.S. and China for who leads AI innovation reflects a battle for global technological hegemony, a struggle fraught with complex power dynamics between the American government and the Chinese Communist Party (CCP).
In 2018, the Trump administration banned the export of key components for semiconductors to a Chinese telecommunications and chip-making company on the grounds of security interests. Relatedly, in 2022, the Biden administration banned the sale of certain microchips to China. Nvidia, a semiconductor company based in the U.S. that provides high-end chips for most American AI firms, subsequently created the H800 GPU, a less powerful chip that allowed it to continue selling to China without violating the ban.
This gave DeepSeek a critical window to continue acquiring American chips until the Biden administration banned the export of those too.
Antonia Hmaidi, senior analyst at the Mercator Institute for China Studies in Berlin, explains how DeepSeek’s use of a surprisingly small number of these chips, which are not even the most cutting-edge technology, “has really shown to some degree how much the U.S. was living in a bubble.”
Is this a “Sputnik moment”?
That’s what Marc Andreessen, entrepreneur and co-creator of the pioneering web browser Mosaic, called R1, alluding to the 1957 launch of a satellite named Sputnik by the Soviet Union during the Cold War’s Space Race.
Sputnik’s launch forced the U.S. to realize that its aerospace capabilities were not as superior as previously thought. This led to the creation of the Apollo Program, as the Americans increased funding for education, research, and innovation in an effort to catch up.
Like Sputnik, DeepSeek’s release has led the U.S. to question its technological advancement in the face of a significant achievement by a rival nation. And just as Americans responded to Sputnik’s launch with renewed vigor, U.S. firms are resolved to rise to this challenge.
OpenAI CEO Sam Altman said on X: “We will obviously deliver much better models and also it’s legit invigorating to have a new competitor!”
Altman previously likened the AI development dynamics to “a race between democracy and authoritarianism … [warning] that the U.S. needs to maintain its position at the front of AI advances to prevent the technology from being misused.”
Insofar as this issue is a race between opposing forces, R1’s release is a “win” for China and a “loss” for the U.S. — or, at the very least, a moment of reckoning.
But AI development is not exactly a zero-sum game.
In 2023, Meta, a leading U.S. firm that owns Facebook, Instagram, and WhatsApp, made public the cutting-edge AI technology behind its model Llama — which it spent millions on.
This move was met with much criticism, with many claiming it “set a dangerous precedent because the chatbots could help spread disinformation, hate speech and other toxic content.”
Meta’s leaders, however, believed collaboration between experts would ultimately yield richer progress in this sector than each company keeping its innovations to itself. This software development method, called “open source,” played a role in DeepSeek’s development of R1, which used parts of the technology Meta made available.
Ragavan Srinivasan, Meta’s vice president, celebrates this result: “Our open source strategy was validated … The more people who have access to the technology needed to move things forward faster, the better.” The higher-ups at Meta see R1 as a win for small companies, which they believe can now hold their own against tech giants.
DeepSeek has followed suit, making R1 open source as well. This could yield progress for all: the company’s innovation appears to have brought down the extremely high price of AI development, and other developers may now be able to replicate its approach.
Chief AI scientist at Meta, Dr. Yann LeCun, argues that DeepSeek’s feat reflects open source models overtaking proprietary ones, rather than China overtaking America. “Because their work is published and open source, everyone can profit from it,” he stated on LinkedIn.
DeepSeek’s feat could still pose national security concerns.
Issues raised include cybersecurity, intellectual property theft, data privacy, and more.
AI technology can be used to carry out advanced cyberattacks: “security experts warn that DeepSeek could be leveraged by Chinese intelligence services to conduct large-scale espionage, exploit zero-day vulnerabilities, and enhance China’s cyber warfare capabilities.” DeepSeek could theoretically help the CCP infiltrate American infrastructure, including financial and governmental systems.
Furthermore, R1 may transmit user data to China Mobile, a state-controlled telecommunications company that the U.S. banned on national security grounds, raising serious privacy concerns.
DeepSeek’s model also appears to defer to the Chinese government on certain issues, exhibiting possible censorship. For example, when asked about the independence of Taiwan (over which the CCP claims sovereignty), the model gave varied responses, one of which described the island as “an inalienable part of China’s territory.”
So, it’s a double-edged sword.
All of these complex factors go to show that the so-called race for AI development is not a clear-cut one. Open sourcing adds a collaborative aspect to the mix, which paints a more synergistic picture of this new field’s development. But important geopolitical factors keep the arena competitive, and it’s still anyone’s game.
DeepSeek’s use of American chips to build R1 points to a key advantage of the United States: despite the company’s innovative measures, computing power remains a critical component of progress.
The U.S. has an important lead over China in computational capability, and is maintaining the upper hand in this regard.