DeepSeek reportedly did not train its flagship model for just $294,000
In a groundbreaking move, Chinese AI company DeepSeek has published a peer-reviewed paper in the prestigious journal Nature detailing its R1 reasoning model, which is built on the DeepSeek V3 base model. The paper has caused a stir in the AI community, with some claiming the model was substantially cheaper and more efficient to train than its Western counterparts. A closer look at the figures, however, reveals a more nuanced picture.
The headline $294,000 figure, which has been widely publicised, does not account for the end-to-end cost of training the model. It covers only the final reinforcement-learning phase, and it rests on the assumption that H800 GPUs could be rented for $2/hr. In reality, training the underlying V3 base model alone cost roughly 19 times that estimate.
The DeepSeek V3 model, while larger than Llama 4 Maverick, used significantly fewer training tokens at 14.8 trillion. That is an impressive feat considering the Llama 4 models required between 22 trillion (Maverick) and 40 trillion (Scout) tokens, and between 2.38 million and 5 million GPU hours, to train.
In terms of compute, DeepSeek V3 and R1 are roughly comparable to Meta's Llama 4. The V3 model was trained on 2,048 H800 GPUs for approximately two months, plus about 5,000 GPU hours spent generating supervised fine-tuning datasets. In total, the model required 2.79 million GPU hours, which at the assumed $2/hr rental rate works out to an estimated $5.58 million.
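For readers who want to check that arithmetic, here is a minimal back-of-envelope sketch in Python, using the figures above and the assumed $2/hr H800 rental rate (the rate is an assumption from the reporting, not a quoted market price):

```python
# Back-of-envelope check of the reported V3 training figures.
GPU_COUNT = 2_048              # H800 GPUs used for the V3 training run
TOTAL_GPU_HOURS = 2.79e6       # total GPU hours reported for the model
RATE_USD_PER_GPU_HOUR = 2.00   # assumed H800 rental price

cost_usd = TOTAL_GPU_HOURS * RATE_USD_PER_GPU_HOUR
wall_clock_days = TOTAL_GPU_HOURS / GPU_COUNT / 24

print(f"Estimated training cost: ${cost_usd / 1e6:.2f}M")     # -> $5.58M
print(f"Implied wall-clock time: {wall_clock_days:.0f} days")  # -> ~57 days, roughly two months
```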
It's also worth noting that the purchase cost of the 256 eight-way GPU servers (2,048 H800s in total) used to train the models is estimated at more than $51 million. That cost, naturally, is not reflected in the headline $294,000 figure.
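A quick derivation of what that estimate implies per unit of hardware; the per-server and per-GPU prices below are inferred from the $51 million total, not quoted figures:

```python
# Implied unit prices behind the $51 million hardware estimate (derived, not reported).
SERVER_COUNT = 256            # eight-way H800 servers
TOTAL_HARDWARE_COST = 51e6    # USD, as estimated above

per_server = TOTAL_HARDWARE_COST / SERVER_COUNT
per_gpu = TOTAL_HARDWARE_COST / (SERVER_COUNT * 8)

print(f"~${per_server / 1e3:.0f}K per server, ~${per_gpu / 1e3:.0f}K per GPU")
# -> ~$199K per server, ~$25K per GPU
```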
The confusion arose from supplementary information released alongside the peer-reviewed version of DeepSeek's original January paper, which stated that the model's reinforcement-learning run used 64 eight-way H800 boxes, 512 GPUs in total. This led some to believe the entire model cost only $294,000 to train.
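Applying the same assumed $2/hr rate to the $294,000 figure shows how small that reinforcement-learning run is next to the full training effort; the duration below is a derived estimate, not a number from the paper:

```python
# What the $294,000 figure implies for the RL run, assuming $2/hr H800 rental.
RL_COST_USD = 294_000
RL_GPU_COUNT = 512        # 64 eight-way H800 boxes

rl_gpu_hours = RL_COST_USD / 2.00
rl_wall_clock_hours = rl_gpu_hours / RL_GPU_COUNT

print(f"Implied GPU hours: {rl_gpu_hours:,.0f}")  # -> 147,000
print(f"Implied wall-clock time: ~{rl_wall_clock_hours:.0f} hours "
      f"(~{rl_wall_clock_hours / 24:.0f} days)")   # -> ~287 hours, about 12 days
```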
The paper itself focuses on using reinforcement learning to imbue the existing V3 base model with 'reasoning' or 'thinking' capabilities. Reinforcement learning here is a post-training step that typically reinforces stepwise reasoning by rewarding the model for arriving at correct answers. Notably, the researchers had already completed about 95 percent of the work before reaching the reinforcement-learning phase detailed in the paper.
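To illustrate the idea, here is a minimal, purely illustrative sketch of outcome-based reward assignment of the kind described above; the function and variable names are hypothetical, and this is not DeepSeek's actual implementation:

```python
# Hypothetical sketch: reward the model only when its final answer is correct.
def outcome_reward(model_answer: str, reference_answer: str) -> float:
    """Return 1.0 if the model's final answer matches the reference, else 0.0."""
    return 1.0 if model_answer.strip() == reference_answer.strip() else 0.0

# Rewards like this are fed into a policy-gradient-style RL update,
# reinforcing the chains of reasoning that led to correct answers.
rollouts = [("42", "42"), ("41", "42")]
rewards = [outcome_reward(answer, reference) for answer, reference in rollouts]
print(rewards)  # -> [1.0, 0.0]
```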
In conclusion, while DeepSeek's training costs are low by frontier-model standards, the headline $294,000 figure does not tell the whole story. The actual cost of the DeepSeek models is likely higher still once research and development, data acquisition, data cleaning, and inevitable false starts or wrong turns are accounted for. Nevertheless, DeepSeek's achievement is a testament to the progress being made in the AI field, particularly in regions outside the traditional AI powerhouses.