The cascade effect of DeepSeek
An examination of DeepSeek's success and how we believe it energizes Chinese innovation and changes the future of the global AI race
Investors are riding the wave of optimism brought on by the breakout success of DeepSeek's low-cost, high-performing AI model. Will it continue? TJ breaks down the phenomenon from different angles.
Chinese technology startup DeepSeek's rapid ascendance brings the latest drama to the global generative artificial intelligence (AI) race, and a whole lot of volatility to world stock markets in recent months. The launch of R1, DeepSeek's powerful open-source large language model (LLM), has been billed as a threat to OpenAI's pioneering ChatGPT and to American dominance in AI innovation. R1 rivals OpenAI's latest models at solving complex problems, and this breakthrough was accomplished at a fraction of the typical cost of developing a high-performance LLM.
Why spend billions of dollars developing and training advanced, proprietary LLMs when a much cheaper version, built and optimized on existing LLMs, works almost as well? This is just one of the many questions the DeepSeek breakout raised. Can R1's success be readily replicated by other models? Does this mean China will overtake the US as the innovation superpower of the 21st century?
Assessing the DeepSeek phenomenon on technical merit is logical; however, weighing its impact on policy and the domestic sector is equally important. Likely prompted by the startup's success, President Xi made a rare symposium appearance last month, meeting business leaders from technology and other sectors. The meeting not only signaled renewed government support for the sector, but also recognized the next generation of entrepreneurs who will drive the economy forward.
In this supportive environment, will the China tech stock rally continue? Since the launch of ChatGPT in late 2022, the US tech sector has gained over 80%; by comparison, the China tech sector had a more muted performance amid a multi-year regulatory crackdown, that is, until now.1 Chinese tech stocks rallied 18% from the end of January to the end of February.2 Could this lead to a sustainable revaluation for Chinese equities overall? Is this the turning point for the Chinese economy that we have long waited for? We have a few thoughts.
Knowledge distillation
When knowledge is transferred from a large model to a smaller one, the process is called knowledge distillation. The R1 model is distilled from a number of existing third-party models, which, user experience reports indicate, included distillation from OpenAI's o1 models, and was then fine-tuned on DeepSeek data. While the o1 model outperforms R1 in many tasks, the DeepSeek model is not far behind. Chatbot Arena uses crowdsourced data to rate AI models, and R1 ranks high on overall performance, while outscoring most other models on tasks such as math and coding.
We take pride in DeepSeek's success, and what it could mean for China tech. Not only did R1 surpass ChatGPT as the most downloaded app on the iOS App Store in the US at the end of January, it registered 30 million users worldwide, which set a record.3 It is a showcase of China's homegrown innovation, made possible by top-tier engineering talent that rivals Silicon Valley's. That said, R1 is still a 'fast follow' of o1.
Although knowledge distillation lowers the cost of replicating advanced LLMs, it limits the model's ability to handle complex or novel tasks. Distillation is not the best approach for developing artificial general intelligence ("AGI") capabilities in the long term.
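The transfer step described above can be sketched in a few lines. This is a minimal, illustrative version of the core distillation objective, not DeepSeek's actual training code: a small "student" model is penalized, via KL divergence, for deviating from the softened output distribution of a larger "teacher." The logits and temperature below are made-up toy values.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened output distribution
    and the student's: the quantity the student minimizes in training."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose logits mirror the teacher's incurs near-zero loss;
# a divergent one is penalized, pushing it to imitate the teacher.
teacher = [3.0, 1.0, 0.2]
close_student = [2.9, 1.1, 0.3]
far_student = [0.2, 1.0, 3.0]
assert distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student)
```

Minimizing this loss over many prompts is what lets the smaller model inherit the larger one's behavior cheaply, which is also why the approach caps out at what the teacher already knows.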
Reaching for intelligence
NVIDIA and OpenAI (and their backers) still hold the key to AGI, if it is attainable, based on our primary research and understanding. With DeepSeek's success, OpenAI is implementing measures to safeguard its intellectual property (IP), such as feeding low-quality tokens to suspicious application programming interface (API) users.
The American company has previously warned about competitors utilizing distillation but recognizes the competition as invigorating, prompting a review of its product strategies and open-source policies. It would not surprise us if OpenAI decides to offer API services for its frontier models exclusively to the world's largest technology companies in the future to mitigate IP replication risks. Either way, relying on knowledge distillation from closed-source models will not deliver a sustainable advantage, but there are many other technical factors that make R1 remarkable.
Shortcuts and techniques
Because of chip constraints, DeepSeek had to be creative and resourceful with R1's architecture so it could do more with less. To cut down on the data processing needed to train the model, it invented shortcuts and adopted techniques used by other Chinese AI companies.
R1 is trained with a process called "reinforcement learning," in which the model receives feedback on its actions and its responses improve as it learns. This method is less resource intensive than some more traditional approaches. The feedback is scored by the model's own rule-based reward system, a combination of reward signals and checks on different user prompt formats, to better capture human preferences in complex and nuanced scenarios. This process decides the optimization direction, aiming to deliver a model that excels in reasoning while remaining helpful and harmless.
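A rule-based reward of the kind described above can be illustrated with a toy scoring function. The tags, weights, and checks here are assumptions for illustration, not DeepSeek's actual reward system: one signal rewards following the requested response format, another rewards a verifiably correct final answer.

```python
import re

def rule_based_reward(response: str, expected_answer: str) -> float:
    """Toy rule-based reward combining two signals:
    a format signal (did the model wrap its reasoning as instructed?)
    and an accuracy signal (is the final answer verifiably correct?).
    Weights and tag names are illustrative assumptions."""
    reward = 0.0
    # Format signal: reasoning enclosed in <think>...</think> tags.
    if re.search(r"<think>.*?</think>", response, re.DOTALL):
        reward += 0.5
    # Accuracy signal: the text outside the reasoning contains the answer.
    final = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    if expected_answer in final:
        reward += 1.0
    return reward

good = "<think>2 + 2 is 4</think> The answer is 4."
bad = "I think the answer is 5."
assert rule_based_reward(good, "4") > rule_based_reward(bad, "4")
```

Because the scoring is a fixed rule rather than a learned judge, it is cheap to run at scale, one reason this style of feedback is less resource intensive than traditional human-labeled approaches.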
Additionally, a traditional LLM like OpenAI's GPT-4 has about 1.8 trillion adjustable settings, called "parameters," that it uses to process information and decide how to respond to a prompt.4 R1 operates with far fewer parameters, about 670 billion, and it keeps only a small portion of those parameters active during any single operation.4
What lets R1 get more out of much less is a technique known as "mixture of experts." It splits the model into different areas of expertise, so when a user prompt comes in, only the sections with the relevant specialties are called up to process and respond. Because only part of the network is at work, less computing power is needed.
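That routing idea can be sketched with a simple top-k gate over toy single-number "experts" (in a real MoE layer, each expert is a neural sub-network and routing happens per token). The key point the sketch shows: only the k highest-scoring experts run, so compute scales with k rather than with the total parameter count. All numbers below are illustrative.

```python
import math

def top_k_gating(gate_logits, k=2):
    """Route an input to only the k highest-scoring experts;
    the rest of the network stays idle for this input."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    # Renormalize the chosen experts' scores into mixing weights.
    exps = [math.exp(gate_logits[i]) for i in chosen]
    total = sum(exps)
    return {i: e / total for i, e in zip(chosen, exps)}

def moe_layer(x, experts, gate_logits, k=2):
    """Combine only the selected experts' outputs, weighted by the gate."""
    weights = top_k_gating(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Eight toy 'experts', but only two run per input.
experts = [lambda x, m=m: m * x for m in range(1, 9)]
gate_logits = [0.1, 2.0, 0.3, 0.2, 1.5, 0.0, 0.4, 0.1]
out = moe_layer(10.0, experts, gate_logits, k=2)
```

This is why a model can hold hundreds of billions of parameters yet activate only a small fraction of them for any single operation.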
Invest in the future
As mentioned, the DeepSeek story is not only a technical one, and we should not ignore the extraordinary impact it has had on policymakers and other tech companies in such a short space of time.
Last month's closed-door symposium between President Xi and a group of entrepreneurs from key industries received a lot of media coverage and investor attention, with its attendee list scrutinized intensely. While the world was fixated on Jack Ma's attendance, we believe the invitation of Xingxing Wang, the founder of Unitree Robotics, sent a stronger message: Unitree specializes in quadruped robots, and Wang represents the new generation of entrepreneurs. The government's invitation is a show of recognition and support for those who will drive the country's future growth.
Good news also came from tech giant Alibaba. The company announced ambitious plans to work towards AGI, investing several times over the next three years what it spent in the last ten. This bold statement, reminiscent of Magnificent Seven overtures, tells us Alibaba is again focusing on the longer term, for the first time since 2021.
Observations on the ground
Also surprising to us is the velocity of AI penetration across households and businesses. In fast cascading fashion, there has been a strong uptake of Ali Cloud and Tencent Cloud services by new clients since January, many of them small and medium enterprises across different industries. This is consistent with conversations we had with companies during recent visits to Yiwu and Hangzhou. Many planned to increase capital expenditure and looked to leverage DeepSeek to make their existing workflows more efficient.
Global AI race
Currently, US tariffs and chip export restrictions on China are less severe than initially feared. Regardless, the competition between China and the US appears structural and seems likely to persist for several years, even though the timing and nature of potential future conflicts remain uncertain.
The question for us isn't really who will be the innovation superpower, but what progress will this race bring to the world? There will inevitably be winners and losers, and the competition should provide plenty of long-term alpha-driven opportunities in both global and domestic companies for relative value investors like us.
In the end, we are excited that DeepSeek refocuses investor attention on Chinese AI innovation. But let's not forget there are compelling opportunities outside of tech. Even with the strong run recently, in our view, the Chinese equity market still has a long runway ahead. We think the much-awaited turnaround is here.