DeepSeek rushes to launch new AI model as China goes all in

News by: Daily Sun

The Chinese startup triggered a $1 trillion-plus sell-off in global equities markets last month with a cut-price AI reasoning model that outperformed many Western competitors.

DeepSeek had planned to release R2 in early May but now wants it out as early as possible, two of them said, without providing specifics.

DeepSeek did not respond to a request for comment for this story.

“The launch of DeepSeek’s R2 model could be a pivotal moment in the AI industry,” said Vijayasimha Alilughatta, chief operating officer of Indian tech services provider Zensar. DeepSeek’s success at creating cost-effective AI models “would likely spur companies worldwide to accelerate their own efforts ... breaking the stranglehold of the few dominant players in the field,” he said.

Little is known about DeepSeek, whose founder Liang Wenfeng became a billionaire through his quantitative hedge fund High-Flyer. Liang, who was described by a former employer as “low-key and introverted,” has not spoken to any media since July 2024.

They told a story of a company that functioned more like a research lab than a for-profit enterprise and was unencumbered by the hierarchical traditions of China’s high-pressure tech industry, even as it became responsible for what many investors see as the latest breakthrough in AI.

Liang was born in 1985 in a rural village in the southern province of Guangdong. He later obtained communication engineering degrees at the elite Zhejiang University.

At DeepSeek and High-Flyer, Liang has similarly shunned the practices of Chinese tech giants known for rigid top-down management, low pay for young employees and “996” - working from 9 a.m. to 9 p.m. six days a week.

“Liang gave us control and treated us as experts. He constantly asked questions and learned alongside us,” said 26-year-old researcher Benjamin Liu, who left the company in September. “DeepSeek allowed me to take ownership of critical parts of the pipeline, which was very exciting.”

While Baidu and other Chinese tech giants were racing to build their consumer-facing versions of ChatGPT in 2023 and profit from the global AI boom, Liang told Chinese media outlet Waves last year that he deliberately avoided spending heavily on app development, focusing instead on refining the AI model’s quality.

COMPUTING POWER

The quant fund was an early pioneer in AI trading, and a top executive said in 2020 that High-Flyer was going “all in” on AI by re-investing 70% of its revenue, mostly into AI research.

DeepSeek had not been established at that time, so the accumulation of computing power caught the attention of Chinese securities regulators, said a person with direct knowledge of officials’ thinking.

Authorities decided not to intervene, in a move that would prove crucial for DeepSeek’s fortunes: the U.S. banned the export of A100 chips to China in 2022, at which point Fire-Flyer II was already in operation.

Authorities had asked Liang to keep a low profile because they were worried that too much hype in the media would draw unnecessary attention, the person said.

As one of the few companies with a large A100 cluster, High-Flyer and DeepSeek were able to attract some of China’s best research talent, two former employees said. “The key advantage of vast (computing) resources is that it allows for large-scale experimentation,” said Liu, the former employee.

The startup used techniques like Mixture-of-Experts (MoE) and multihead latent attention (MLA), which incur far lower computing costs, its research papers show.

MLA architecture allows a model to process different aspects of one piece of information simultaneously, helping it detect key details more effectively.
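For readers unfamiliar with Mixture-of-Experts, the sketch below illustrates the general idea of top-k expert routing as it is commonly described: a small router scores every expert, but only the few highest-scoring experts are actually run for a given input, which is where the compute saving comes from. It is a toy illustration, not DeepSeek's implementation; the names `make_expert`, `moe_forward`, `router_weights` and the choice of `k=2` are assumptions made for this example.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical 'expert': a stand-in that just scales its input."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, router_weights, experts, k=2):
    """Route input x to the k highest-scoring experts and mix their outputs.

    router_weights holds one weight vector per expert (an assumption of
    this sketch); the score for an expert is its dot product with x.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    # Pick the k experts with the highest router probability.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        # Only the selected experts are evaluated; the rest cost nothing.
        y = experts[i](x)
        gate = probs[i] / norm
        out = [o + gate * v for o, v in zip(out, y)]
    return out, top


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, chosen = moe_forward([2.0, 1.0], router_weights, experts, k=2)
print("experts used:", chosen)
print("output:", output)
```

In this toy setup only two of the four experts run per input, so roughly half of the expert computation is skipped; production models apply the same principle with far more experts.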

DeepSeek’s pricing was 20 to 40 times cheaper than what OpenAI charged for equivalent models, analysts at Bernstein brokerage estimated in early February.

Adnan Masood of U.S. tech services provider UST told Reuters that his laboratory had run benchmarks that found R1 often used three times as many tokens, or units of data processed by the AI model, for reasoning as OpenAI’s scaled-down model.
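The two figures above can be combined in a rough back-of-the-envelope calculation: if a model's per-token price is 20 to 40 times lower but it consumes three times as many tokens on a task, the per-task saving shrinks accordingly. The sketch below assumes, purely for illustration, that the Bernstein price estimate and the UST token measurement refer to comparable models and workloads, which the sources do not establish; the function name `effective_cost_ratio` is hypothetical.

```python
def effective_cost_ratio(price_ratio, token_ratio):
    """Per-task cost advantage after accounting for extra token usage.

    price_ratio: how many times cheaper each token is.
    token_ratio: how many times more tokens are used per task.
    """
    return price_ratio / token_ratio


# Assumption: both figures apply to the same comparison, which may not hold.
for price_ratio in (20, 40):
    ratio = effective_cost_ratio(price_ratio, token_ratio=3)
    print(f"{price_ratio}x cheaper per token, 3x tokens: about {ratio:.1f}x cheaper per task")
```

Under these assumptions the per-task advantage would be roughly 7 to 13 times rather than 20 to 40 times.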

Even before R1 gripped global attention, there were signs that DeepSeek had caught Beijing’s favor. In January, state media reported that Liang attended a meeting with Chinese Premier Li Qiang in Beijing as the designated representative of the AI sector, ahead of the leaders of better-known firms.

At least 13 Chinese city governments and 10 state-owned energy companies say they have deployed DeepSeek into their systems, while tech giants Lenovo, Baidu and Tencent - owner of China’s largest social media app WeChat - have integrated DeepSeek’s models into their products.

The Chinese embrace comes as governments from South Korea to Italy remove DeepSeek from national app stores, citing privacy concerns.

Further limits on advanced AI chips are a challenge that Liang has acknowledged.

“Our problem has never been funding,” he told Waves in July. “It’s the embargo on high-end chips.”
