Jack Ma-backed Ant Group Co. used Chinese-made semiconductors to develop techniques for training AI models that would cut costs by 20%, according to people familiar with the matter.
Ant used domestic chips, including from affiliate Alibaba Group Holding Ltd. and Huawei Technologies Co., to train models using the so-called Mixture of Experts machine learning approach, the people said. The results were similar to those from Nvidia Corp. chips such as the H800, they said, asking not to be named because the information isn't public.
Hangzhou-based Ant still uses Nvidia for AI development, but it now relies mostly on alternatives, including chips from Advanced Micro Devices Inc. and Chinese suppliers, for its latest models, one of the people said.
The models mark Ant's entry into a race between Chinese and US companies that has accelerated since DeepSeek demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc.'s Google. It underscores how Chinese firms are trying to use local alternatives to the most advanced Nvidia semiconductors. While not the most advanced, the H800 is a relatively powerful processor and is currently barred by the US from export to China.
The company published a research paper this month claiming that its models at times outperformed Meta Platforms Inc. on certain benchmarks, which Bloomberg News hasn't independently verified. If they work as advertised, Ant's platforms could mark another step forward for Chinese artificial intelligence development by slashing the cost of training or supporting AI services.
As companies pour significant money into AI, MoE models have emerged as a popular option, gaining recognition through their use by Google and Hangzhou startup DeepSeek, among others. The technique divides tasks into smaller sets of data, much like having a team of specialists who each focus on one segment of a job, making the process more efficient. Ant declined to comment in an emailed statement.
However, training MoE models typically relies on high-performing chips such as the graphics processing units Nvidia sells. The cost has so far been prohibitive for many small firms, limiting wider adoption. Ant has been working on ways to train models more efficiently and eliminate that constraint. Its paper's title makes the goal plain: the company aims to scale a model "without premium GPUs."
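The "team of specialists" idea can be made concrete with a toy sketch: a small gating network scores the experts and routes each input to only the top few, so most expert weights sit idle on any given token. All names and sizes below are illustrative, not Ant's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Mixture-of-Experts layer (illustrative sizes, not Ant's real model).
DIM, NUM_EXPERTS, TOP_K = 8, 4, 2

gate_w = rng.normal(size=(DIM, NUM_EXPERTS))         # gating network weights
expert_w = rng.normal(size=(NUM_EXPERTS, DIM, DIM))  # one weight matrix per expert

def moe_forward(x):
    """Route input x to the TOP_K highest-scoring experts and mix their outputs."""
    scores = x @ gate_w                     # one score per expert
    top = np.argsort(scores)[-TOP_K:]       # indices of the chosen experts
    exp_s = np.exp(scores[top])
    weights = exp_s / exp_s.sum()           # softmax over the chosen experts only
    # Only TOP_K of NUM_EXPERTS experts do any work: the source of the savings.
    return sum(w * (x @ expert_w[i]) for w, i in zip(weights, top))

out = moe_forward(rng.normal(size=DIM))
print(out.shape)  # (8,)
```

Because only two of the four experts run per input, compute per token scales with TOP_K rather than NUM_EXPERTS, which is why the approach is attractive on less powerful hardware.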
That goes against the grain for Nvidia. Chief Executive Officer Jensen Huang has argued that computation demand will grow even with the advent of more efficient models such as DeepSeek's R1, positing that companies will need better chips to generate more revenue, not cheaper ones to cut costs. He has stuck to a strategy of building big GPUs with more processing cores, more transistors and increased memory capacity.
Ant said it cost about 6.35 million yuan ($880,000) to train 1 trillion tokens using high-performance hardware, but its optimized approach would cut that to 5.1 million yuan using lower-specification hardware. Tokens are the units of information a model ingests in order to learn about the world and deliver useful responses to user queries.
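A quick check shows those two figures are consistent with the roughly 20% cost reduction cited earlier:

```python
# Cost figures reported in Ant's paper, in yuan per 1 trillion training tokens.
high_perf_cost = 6.35e6   # high-performance hardware
optimized_cost = 5.10e6   # optimized approach on lower-spec hardware

savings = (high_perf_cost - optimized_cost) / high_perf_cost
print(f"{savings:.1%}")   # 19.7%, in line with the reported ~20% cut
```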
The company plans to apply the recent breakthrough in the large language models it has developed, Ling-Plus and Ling-Lite, to industrial solutions including healthcare and finance, the people said.
Ant bought Chinese online platform Haodf.com this year to beef up its artificial intelligence services in healthcare. It has created an AI doctor assistant to support Haodf's 290,000 doctors with tasks such as managing medical records, according to a separate statement on Monday.
The company also has an AI "life assistant" app called Zhixiaobao and an AI financial advisory service called Maxiaocai.
On English-language understanding, Ant said in its paper that the Ling-Lite model did better on a key benchmark than one of Meta's Llama models. Both the Ling-Lite and Ling-Plus models outperformed DeepSeek's equivalents on Chinese-language benchmarks.
"If you find one point of attack that beats the world's best kung fu master, you can still say you beat them, which is why real-world application is important," said Robin Yu, chief technology officer of Beijing-based AI solutions provider Shengshang Tech.
Ant has made the Ling models open source. Ling-Lite contains 16.8 billion parameters, the adjustable settings that work like knobs and dials to steer a model's performance. Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT's GPT-4.5 has 1.8 trillion parameters, according to MIT Technology Review. DeepSeek-R1 has 671 billion.
The company faced challenges in some areas of the training, including stability. Even small changes in the hardware or the model's structure led to problems, including jumps in the models' error rate, the paper said.
On Monday, Ant said it had built large healthcare models that are being used by seven hospitals and healthcare institutions in cities including Beijing and Shanghai. The large model uses DeepSeek's R1, Alibaba's Qwen and Ant's own LLM, and can carry out medical consultations, the statement said.
The company also said it had rolled out two medical AI agents: Angel, which serves more than 1,000 medical institutions, and Yibaoer, which supports health insurance services. Last September, it launched an AI healthcare manager service within the Alipay app.
This story was originally featured on Fortune.com
2025-03-24 08:15:00
Lulu Yilun Chen, Bloomberg