
DeepSeek and the Challenges of Machine-Learning Investment

Jan. 28, 2025
Electronic Design’s Editor Bill Wong weighs in on the DeepSeek announcement.

What you’ll learn:

  • What is DeepSeek-V3?
  • Why DeepSeek’s announcement is significant.
  • Why DeepSeek’s announcement doesn’t eliminate the need for AI hardware.

 

DeepSeek’s recent DeepSeek-V3 announcement caused major fluctuations in the stock market. In this case, though, Billy Ray Valentine’s quote from the movie Trading Places is appropriate:

“I’d wait till you get to around sixty-four, then I’d buy. You’ll have cleared out all the suckers by then.”— Billy Ray Valentine, Trading Places

The stock-market drop hit artificial-intelligence/machine-learning (AI/ML) hardware and software companies almost across the board, which was surprising because the announcement was all about software. It highlights a generative-AI large language model (LLM) akin to ChatGPT, called DeepSeek-V3. The paper associated with the announcement notes that the model was trained using NVIDIA boards, but not top-of-the-line hardware.

As noted in DeepSeek’s technical paper abstract about DeepSeek-V3, “We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2.”
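In other words, only a small slice of the model does work for any one token. To make that concrete, here’s a minimal Python sketch of generic top-k expert routing, the core MoE idea. The expert count, dimensions, and simple softmax gate are illustrative assumptions, not DeepSeek’s actual MLA/DeepSeekMoE implementation.

```python
import numpy as np

def moe_forward(token, gate_w, experts, k=2):
    """Route one token through the top-k of n experts.

    Only k experts run per token, so only a fraction of the layer's
    total parameters are activated -- the same principle that lets
    DeepSeek-V3 activate 37B of its 671B parameters per token.
    """
    scores = token @ gate_w                  # gating logits, one per expert
    top_k = np.argsort(scores)[-k:]          # indices of the k best experts
    weights = np.exp(scores[top_k])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Weighted sum of the chosen experts' outputs; the remaining
    # experts are never evaluated for this token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top_k))

rng = np.random.default_rng(0)
d, n_experts = 16, 8                         # toy sizes, purely illustrative
token = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
experts = rng.standard_normal((n_experts, d, d))
print(moe_forward(token, gate_w, experts).shape)   # (16,)
```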

The significance is DeepSeek’s ability to train the system on its own hardware, which isn’t as powerful as that used by companies like OpenAI. DeepSeek is operationally comparable to ChatGPT, but comparing the functionality and performance of the latest incarnations of each generative-AI system on the market is very hard to do. It’s not like creating a benchmark for testing microcontroller performance, which is a hard-enough task on its own.

How Did DeepSeek Improve the Training Performance?

DeepSeek-V3 is a highly functional, open-source system trained using a variety of optimizations, many of which can be applied to other LLMs.

For example, as noted in the technical paper, “NVLink offers a bandwidth of 160 GB/s, roughly 3.2 times that of IB (50 GB/s). To effectively leverage the different bandwidths of IB and NVLink, we limit each token to be dispatched to at most 4 nodes, thereby reducing IB traffic.”
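To illustrate, here’s a simplified, hypothetical sketch of node-limited routing: rank the nodes for a token, keep only the best few, and select experts within them so that most dispatch traffic rides the faster intra-node NVLink. The node-scoring rule and sizes below are assumptions for illustration, not the paper’s exact algorithm.

```python
import numpy as np

def node_limited_topk(scores, experts_per_node, k=8, max_nodes=4):
    """Pick the top-k experts for a token, restricted to at most
    `max_nodes` nodes, so cross-node (IB) traffic stays bounded while
    intra-node traffic uses the much faster NVLink."""
    n_nodes = len(scores) // experts_per_node
    by_node = scores.reshape(n_nodes, experts_per_node)
    # Rank nodes by the best expert score each one contains...
    best_nodes = by_node.max(axis=1).argsort()[::-1][:max_nodes]
    # ...then choose the top-k experts only within those nodes.
    allowed = np.full_like(scores, -np.inf)
    for n in best_nodes:
        s = n * experts_per_node
        allowed[s:s + experts_per_node] = scores[s:s + experts_per_node]
    return np.argsort(allowed)[-k:]

rng = np.random.default_rng(1)
scores = rng.standard_normal(64)     # 8 nodes x 8 experts, toy sizes
chosen = node_limited_topk(scores, experts_per_node=8)
print(sorted(set(chosen // 8)))      # spans at most 4 distinct nodes
```

Capping the node count bounds the number of cross-node transfers per token, which is exactly the traffic the slower IB links would otherwise have to carry.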

>>Check out this TechXchange for similar articles and videos


TechXchange: Generating AI

Generative artificial intelligence (AI) like chatbots are changing the way many use AI.

This is just one of many ways the developers got more work out of the hardware they had available. The paper goes into the details, and it’s not just one tweak that made a difference. Essentially, the training software was tuned for the available hardware. Switch to faster, more powerful hardware and the training time drops further.

While AI/ML hardware has improved significantly over time, software has delivered even greater performance gains. Past AI/ML advances, such as sparse matrices and smaller weight sizes, sometimes resulted in orders-of-magnitude improvements, and those approaches were generally adopted across the industry. DeepSeek’s enhancements fall into this category as well.
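As a back-of-the-envelope example of the “smaller weight sizes” idea, shrinking weights from 32-bit floats to 8-bit integers cuts their memory and bandwidth footprint by 4X. The sketch below uses basic symmetric per-tensor quantization; it’s an illustrative scheme, not any particular framework’s method.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: FP32 -> INT8 plus one scale.

    Storage for the weights shrinks by ~4x; dequantized values
    approximate the originals within the scale's resolution.
    """
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(2)
w = rng.standard_normal((1024, 1024)).astype(np.float32)
q, scale = quantize_int8(w)
print(w.nbytes // q.nbytes)                            # 4x smaller
print(np.abs(w - q.astype(np.float32) * scale).max())  # small error
```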

Do We Still Need to Improve AI/ML Hardware?

The simple answer is yes. The massive data centers being built to run AI/ML applications will be used regardless of how efficient the software becomes. With higher-performance hardware, things simply run faster, or more operations can be done.

AI/ML accelerators are already being employed, with more coming online and being designed as you read this. They usually provide advantages over the likes of GPGPUs and FPGAs because they’ve been optimized for specific functionality. A mix of all of these will exist from the cloud down to embedded and mobile devices, such as the latest smartphones that can run LLMs locally.

However, one main question lingers for developers: What can be squeezed into the target hardware and associated software? Developers train their models in the cloud, but not exclusively; some use local servers and data centers, while others even resort to their desktop or laptop PCs. Changes like those introduced by DeepSeek simply make lower-end platforms more useful for larger models than would otherwise be possible.
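What fits often comes down to simple arithmetic: parameter count times bytes per weight versus available device memory. The numbers in this hypothetical sketch (a 7B-parameter model, a 16-GB device, and a 20% runtime-overhead fudge factor) are assumptions for illustration, not a sizing guide.

```python
def fits(params_billions, bytes_per_weight, device_gb, overhead=1.2):
    """Rough check: do the weights, plus an assumed ~20% runtime
    overhead, fit in the device's memory? (Decimal GB throughout.)"""
    need_gb = params_billions * bytes_per_weight * overhead
    return need_gb <= device_gb, need_gb

# A 7B-parameter model on a 16-GB device at different precisions:
for label, bpw in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    ok, need = fits(7, bpw, 16)
    print(f"{label}: needs ~{need:.1f} GB -> {'fits' if ok else 'too big'}")
```

Dropping the precision is often what turns “too big” into “fits,” which is why quantized models dominate on phones and laptops.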

What About Investing in AI/ML Companies?

I’m not about to give any recommendations on investing, but those considering it should first get a better understanding of the technology used by these companies.

Hardware and software technology improvements can be incremental or go through major shifts. Investors need to evaluate how companies are creating and using these technologies with respect to their end product. Generative AI and LLMs are just tools that require infrastructure to be useful. It might be hardware like a smartphone, or the cloud being accessed by an app on the smartphone.

Improvements such as DeepSeek-V3 help squeeze more out of available hardware. This doesn’t eliminate the need for better hardware, and, of course, new software and models are under development. Generative AI is relatively new, but it’s not the end of the road for AI/ML.

About the Author

William G. Wong | Senior Content Director - Electronic Design and Microwaves & RF

I am Editor of Electronic Design focusing on embedded, software, and systems. As Senior Content Director, I also manage Microwaves & RF and I work with a great team of editors to provide engineers, programmers, developers and technical managers with interesting and useful articles and videos on a regular basis. Check out our free newsletters to see the latest content.

You can send press releases for new products for possible coverage on the website. I’m also interested in receiving contributed articles for publication on our website. Use our template and send it to me along with a signed release form.

Check out my blog, AltEmbedded, on Electronic Design, as well as my latest articles on this site, which are listed below.


I earned a Bachelor of Electrical Engineering at the Georgia Institute of Technology and a Master’s in Computer Science from Rutgers University. I still do a bit of programming, using everything from C and C++ to Rust and Ada/SPARK. I also do some PHP programming for Drupal websites and have posted a few Drupal modules.

I still get hands-on with software and electronic hardware. Some of this can be found in our Kit Close-Up video series. You can also see me in many of our TechXchange Talk videos. I’m interested in a range of projects, from robotics to artificial intelligence.

