For anyone wanting to train an LLM on analyst responses to DeepSeek, the Temu of ChatGPTs, this post is a one-stop shop. We’ve grabbed all relevant sellside emails in our inbox and copy-pasted them with minimal intervention.
Backed by the quantitative hedge fund High-Flyer, DeepSeek is a two-year-old, Hangzhou-based spinout of a Zhejiang University startup that trades equities using machine learning. Its stated goal is to build artificial general intelligence for the fun of it, not for the money. There’s a good interview on ChinaTalk with founder Liang Wenfeng, and mainFT has this excellent overview from our colleagues Eleanor Olcott and Zijing Wu.
Mizuho’s Jordan Rochester takes up the story . . .
[O]n Jan 20, [DeepSeek] released an open source model (DeepSeek-R1) that beats the industry’s leading models on some math and reasoning benchmarks, and on measures such as cost and openness. The DeepSeek app has topped the free app download rankings in Apple’s App Store in both China and the United States, surpassing ChatGPT on the US download list.
What really stood out? DeepSeek said it took two months and less than $6m to develop the model, building on already existing technology and leveraging existing models. In comparison, OpenAI is spending more than $5bn a year. Apparently DeepSeek bought 10,000 Nvidia chips, whereas the hyperscalers have bought many multiples of that figure. If true, it fundamentally breaks the AI capex narrative.
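For a sense of scale, here’s a back-of-the-envelope comparison of the figures Rochester quotes. A minimal sketch; the dollar amounts are his, the ratio is ours:

```python
# Back-of-the-envelope comparison of the claimed spending figures.
# Both inputs come from the Mizuho note above.

deepseek_training_cost = 6e6   # "less than $6m" to develop R1
openai_annual_spend = 5e9      # "more than $5bn a year"

ratio = openai_annual_spend / deepseek_training_cost
print(f"OpenAI's annual spend is roughly {ratio:,.0f}x DeepSeek's claimed cost")
# -> roughly 833x, though that compares an annual budget with a one-off run
```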
Sounds bad, but why? Here’s Jefferies’ Graham Hunt et al:
With DeepSeek delivering performance comparable to GPT-4o for a fraction of the computing power, there are potential negative implications for the builders, as pressure on AI players to justify ever-increasing capex plans could ultimately lead to a lower trajectory for data center revenue and profit growth.
The DeepSeek-R1 model is free to play with here, and does all the usual stuff, like summarising research papers in iambic pentameter and getting logic problems wrong. The R1-Zero model, DeepSeek says, was trained entirely without supervised fine-tuning.
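If you’d rather poke at it programmatically, DeepSeek also exposes an OpenAI-compatible API. A minimal sketch, assuming the base URL and model name given in DeepSeek’s public docs at the time of writing; the key is a placeholder:

```python
# Minimal sketch of querying hosted R1 via DeepSeek's OpenAI-compatible
# API. The base URL and model name follow DeepSeek's docs at the time
# of writing; treat both as assumptions.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",       # placeholder
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",             # the R1 reasoning model
    messages=[
        {"role": "user",
         "content": "Summarise the transformer paper in iambic pentameter."},
    ],
)
print(response.choices[0].message.content)
```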
Here’s Damindu Jayaweera and team at Peel Hunt with more detail.
Firstly, it was trained in under 3 million GPU hours, which equates to just over $5m in training cost. For context, analysts estimate Meta’s last major AI model cost $60m-$70m to train. Secondly, we have seen people running the full DeepSeek model on commodity Mac hardware in a usable manner, confirming its inference efficiency (using the model, as opposed to training it). We believe it will not be long before we see Raspberry Pi units running cut-down versions of DeepSeek. This efficiency translates into hosted versions of this model costing just 5% of the equivalent OpenAI price. Lastly, it is being released under the MIT License, a permissive software license that allows near-unlimited freedoms, including modifying it for proprietary commercial use.
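The Peel Hunt arithmetic is easy to sanity-check. A minimal sketch, with an implied rental rate we’ve derived ourselves rather than anything DeepSeek has disclosed:

```python
# Sanity check on "under 3 million GPU hours ≈ just over $5m": that
# implies a rental rate of roughly $2 per GPU hour, which is plausible
# for H800-class capacity. The rate is implied by the quote, not disclosed.

gpu_hours = 3e6          # "under 3 million GPU hours"
training_cost = 5.5e6    # "just over $5m"

implied_rate = training_cost / gpu_hours
print(f"Implied rate: ${implied_rate:.2f} per GPU hour")  # ~$1.83
```

That, roughly, is how the headline figure is built: the rental cost of a final training run, not a fully loaded budget including salaries and failed experiments.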
DeepSeek’s not an unanticipated threat to the OpenAI Industrial Complex. Even The Economist had spotted it months ago, and industry mags like SemiAnalysis have been talking for ages about the likelihood of China commoditising AI.
That might be what’s happening here, or might not. Here’s Joshua Meyers, a specialist salesperson at JPMorgan:
Further reading:
— Chinese start-ups such as DeepSeek are challenging global AI giants (FT)
— How small Chinese AI start-up DeepSeek shocked Silicon Valley (FT)

