While Nvidia's data center GPUs are widely used to run artificial intelligence (AI) workloads, Alphabet's (GOOG) (GOOGL) Google has been working on its own custom AI chips for years. The company's Tensor Processing Unit, or TPU, was first announced in 2016. Its fourth-generation TPU, the TPU v4, became available for use by Google Cloud customers last year.
The benefit of an application-specific integrated circuit (ASIC) is that it can be designed at the hardware level to perform specific tasks. A GPU is more general purpose, making it useful for a wide variety of workloads in addition to AI, but potentially less efficient. Google claims that its TPU v4 beats Nvidia's last-gen A100 data center GPU on a variety of AI workloads.
Nvidia launched its ultra-powerful H100 data center GPU in 2022, and that hardware has become the standard for AI workloads. Nvidia is selling every H100 it can make, and reports indicate that the company is set to triple its production of data center GPUs in 2024. Right now, there's nothing faster for AI than the H100.
Efficiency matters
Training a large language model like the one that powers OpenAI's ChatGPT is a herculean task. For the training to be completed in a reasonable amount of time, thousands of powerful AI chips must be linked together. The initial cost of the AI chips and all the other equipment necessary to build what is essentially an AI supercomputer is enormous, and so is the cost to run it continually.
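To get a feel for the scale involved, here's a back-of-envelope sketch using a common rule of thumb (training compute of roughly 6 floating-point operations per parameter per token). Every number below is an illustrative assumption, not a figure from Google, OpenAI, or Nvidia:

```python
# Back-of-envelope estimate of how long an LLM training run takes.
# All numbers are illustrative assumptions, not real figures.

params = 70e9         # hypothetical 70-billion-parameter model
tokens = 1.4e12       # hypothetical 1.4 trillion training tokens
total_flops = 6 * params * tokens   # rule of thumb: ~6 FLOPs per param per token

chip_flops = 400e12   # assumed ~400 teraFLOPS peak per accelerator
utilization = 0.4     # assumed 40% of peak actually sustained in practice
num_chips = 4096      # assumed cluster size

seconds = total_flops / (chip_flops * utilization * num_chips)
print(f"Estimated training time: {seconds / 86400:.1f} days")  # roughly 10 days
```

Even with thousands of accelerators running for days on end, the hardware and electricity bills add up fast, which is why per-chip efficiency translates directly into money.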
While raw power is important, so is efficiency. On Tuesday, Google announced a brand-new iteration of its TPU aimed at balancing the two. The TPU v5e is designed for both AI training and inference, and it's already available in preview to Google Cloud customers using the Google Kubernetes Engine platform.
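For developers, trying out a TPU from Python looks much like any other accelerator workload. Here's a minimal sketch assuming an already-provisioned TPU host with the jax library installed (the actual provisioning happens through Google Kubernetes Engine and isn't shown here):

```python
# Minimal sketch: confirm JAX can see the TPU devices on a Cloud TPU host.
# Assumes this runs on a provisioned TPU node with jax[tpu] installed.
import jax
import jax.numpy as jnp

devices = jax.devices()
print(f"Found {len(devices)} accelerator(s), platform: {devices[0].platform}")

# A trivial matrix multiply that JAX compiles for and dispatches to the TPU.
x = jnp.ones((8, 128))
y = jnp.dot(x, x.T)
print(y.shape)  # (8, 8), each entry 128.0
```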
Google claims that the TPU v5e provides twice the training performance per dollar and up to 2.5 times the inference performance per dollar compared to the TPU v4. With the costs to train the most advanced AI models quickly escalating, providing a cost-effective option could be a home run for Google.
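To make those ratios concrete, here's what they would imply for a hypothetical cloud bill. The dollar amounts are made up for illustration; they are not Google Cloud pricing:

```python
# Illustrative only: what the claimed per-dollar gains imply for a bill.
# Dollar figures are hypothetical, not published Google Cloud rates.

training_gain = 2.0    # claimed: 2x training performance per dollar vs. TPU v4
inference_gain = 2.5   # claimed: up to 2.5x inference performance per dollar

v4_training_bill = 10_000_000  # hypothetical $10M training run on TPU v4
v4_serving_bill = 2_000_000    # hypothetical $2M/month inference cost on TPU v4

print(f"Same training run on v5e: ~${v4_training_bill / training_gain:,.0f}")
print(f"Same serving load on v5e: ~${v4_serving_bill / inference_gain:,.0f}")
```

At the claimed rates, the same work would cost half as much to train and 60% less to serve.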
Notably, the TPU v5e can be used in greater numbers for the most demanding AI training jobs. While the TPU v4 was limited to just over 3,000 chips for a single workload on Google Cloud, customers will be able to leverage tens of thousands of TPU v5e chips at once.
While Google's TPUs give its cloud customers a cost-effective way to run AI workloads, the company can't afford to ignore that Nvidia's H100 GPUs are in high demand. Along with the TPU v5e announcement, Google Cloud also unveiled its new A3 virtual machines powered by H100 GPUs. Each of these virtual machines features Intel's latest Xeon CPU paired with eight H100 GPUs. Tens of thousands of H100 chips can be used for a single workload, providing enough power for the most demanding AI tasks.
Boosting the cloud business
Google Cloud is not the first major cloud provider to launch virtual machines powered by Nvidia's H100 -- Amazon Web Services announced a similar product in July, and Microsoft Azure did the same earlier this month. However, Google is betting that cloud customers want options. Its efficient, cost-effective TPU-powered services could give it an edge as cloud providers scramble to win AI workloads.
Google Cloud has become an important business for Alphabet, generating $8 billion of revenue in the second quarter and, importantly, turning an operating profit. AI is one way Google Cloud can differentiate itself from the competition, particularly as the AI industry matures and starts paying closer attention to return on investment and the cost of training advanced AI models.