Training advanced artificial intelligence (AI) models requires an enormous investment in hardware. Clusters of high-powered servers, collectively equipped with tens of thousands of GPUs and vast amounts of memory, are required to churn through the gargantuan amount of training data that gives any AI model its smarts.
GPT-4, the latest large language model from OpenAI, reportedly cost more than $100 million to train on trillions of words of text. Raw compute power is critical for this process, and Nvidia's ultra-powerful GPUs fill that role, but networking is also important. Data needs to be moved quickly to feed all the powerful processors doing the AI training.
Cisco makes an AI chip
Cisco Systems (CSCO) is the market leader in enterprise switches and routers, providing hardware and software that connect servers to one another and connect data centers to the internet. The company also designs networking chips, catering to companies needing more control over their networks.
Meta Platforms, for example, deployed Cisco's Silicon One chips back in 2021. These chips provide the foundation upon which hyperscale data centers can be built.
Megacompanies like Meta need enormously fast networks to handle a deluge of data, but AI training is even more demanding. Cisco recently detailed its latest line of Silicon One chips, which includes the G200 and G202, and they're squarely aimed at enabling artificial intelligence workloads. These chips are built on a 5nm process, and the faster of the two provides a whopping 51.2 Tbps of throughput.
According to Cisco, a cluster with 32,000 GPUs requires one-third fewer networking layers with its new networking chips. This not only reduces complexity, but also reduces power usage. The chips themselves are also twice as power efficient as their predecessors and come with a variety of features to reduce latency and improve overall performance.
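The link between a faster switch chip and fewer networking layers can be illustrated with a textbook model. The Python sketch below assumes a non-blocking fat-tree (folded Clos) network built from identical switches, 100 Gbps ports, and a hypothetical 12.8 Tbps chip for comparison. None of these assumptions comes from Cisco, and the function names are made up for illustration.

```python
# Illustrative model only: a non-blocking fat-tree built from identical switches.
# Port speed and the 12.8 Tbps comparison chip are assumptions, not Cisco figures.

PORT_SPEED_GBPS = 100  # assumed port speed


def switch_radix(throughput_tbps: float, port_gbps: int = PORT_SPEED_GBPS) -> int:
    """Number of ports a switch chip can offer at the given port speed."""
    return int(throughput_tbps * 1000) // port_gbps


def max_hosts(radix: int, tiers: int) -> int:
    """Endpoints a non-blocking fat-tree with this many tiers can attach.

    Every tier below the top splits its ports half down, half up,
    which gives 2 * (radix / 2) ** tiers endpoints.
    """
    return 2 * (radix // 2) ** tiers


def tiers_needed(hosts: int, radix: int) -> int:
    """Smallest number of switch tiers that can attach `hosts` endpoints."""
    tiers = 1
    while max_hosts(radix, tiers) < hosts:
        tiers += 1
    return tiers


if __name__ == "__main__":
    gpus = 32_000  # cluster size cited in the article
    for tbps in (12.8, 51.2):  # 12.8 is a hypothetical slower chip
        radix = switch_radix(tbps)
        print(f"{tbps} Tbps -> radix {radix}, tiers needed: {tiers_needed(gpus, radix)}")
```

Under these assumptions, the 51.2 Tbps chip reaches 32,000 GPUs in two layers where the hypothetical 12.8 Tbps chip needs three, which is one way to arrive at a one-third reduction. Real deployments depend on port speeds, oversubscription, and topology choices that this model ignores.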
Competition and opportunity
Cisco isn't the only game in town when it comes to advanced networking chips aimed at AI workloads. In April, Broadcom unveiled a similar product called Jericho3-AI, which also promises to link 32,000 GPUs in a single cluster. Marvell detailed its own 3nm AI networking chip that month, as well.
Global spending on AI, including software, hardware, and services, is expected to reach $154 billion this year, according to IDC. By 2026, IDC sees this number climbing to $300 billion. While only a fraction of this spending will go toward networking technologies, AI networking chips are likely a multibillion-dollar opportunity for Cisco and its peers.
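The growth rate implied by those two IDC figures can be worked out with a quick calculation. The Python sketch below assumes "this year" means 2023, so the span to 2026 is three years. That year is an assumption, and the function name is hypothetical.

```python
# Implied compound annual growth rate (CAGR) between the two IDC figures.
# Assumption: "this year" is 2023, giving a three-year span to 2026.

START_SPEND_BN = 154.0  # IDC estimate for this year, in billions of dollars
END_SPEND_BN = 300.0    # IDC projection for 2026, in billions of dollars
YEARS = 2026 - 2023     # assumed span


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that turns `start` into `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    rate = implied_cagr(START_SPEND_BN, END_SPEND_BN, YEARS)
    print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption, spending would need to grow by about 25% a year to reach IDC's 2026 projection.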
The need to train specialized AI models should only grow as companies look to leverage the technology. While something like ChatGPT from OpenAI is extremely powerful, it's also generic. A company gains no competitive advantage by tossing ChatGPT into its products and calling it a day.
A proliferation of custom AI models, trained with proprietary data and customized for specific use cases, will drive demand for AI training services. That, in turn, will drive demand for chips like Cisco's latest Silicon One products.
While Cisco's AI networking chips don't do the actual training the way Nvidia's powerful GPUs do, they'll become increasingly essential as the AI industry evolves.