Alphabet's (NASDAQ: GOOG)(NASDAQ: GOOGL) Google has been designing its own custom artificial intelligence (AI) accelerators for years. The company's Tensor Processing Units, or TPUs, are now in their seventh generation. Unlike Nvidia's GPUs, which are general-purpose chips, Google's TPUs are application-specific integrated circuits built solely for AI workloads.
Google announced on Thursday that Ironwood, its seventh-generation TPU, would be available to Google Cloud customers in the coming weeks. The company also said its new Arm-based Axion virtual machine instances are now in preview, which it says deliver major improvements in performance per dollar. With these new cloud products, Google aims to lower the cost of AI inference and agentic AI workloads.
The dawn of the "age of inference"
While Google's Ironwood TPU can handle AI training, the compute-heavy process of ingesting massive amounts of data to build a model, it's also well suited for high-volume AI inference workloads. "It offers a 10X peak performance improvement over TPU v5p and more than 4X better performance per chip for both training and inference workloads compared to TPU v6e (Trillium), making Ironwood our most powerful and energy-efficient custom silicon to date," according to Google's blog post announcing the upcoming launch of Ironwood.
While new AI models will still need to be trained, Google sees the balance of demand shifting toward inference. AI inference is the act of using a trained AI model to generate a response, and a single inference request is less computationally intensive than a training run. However, chips meant for inference must deliver quick response times and handle a high volume of requests.
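To make that distinction concrete, here's a minimal, purely illustrative Python sketch. The run_inference helper and the toy_model stand-in are hypothetical, not any real Google API; the point is that inference is a forward pass repeated at scale, so it's judged on response time and throughput rather than raw training horsepower.

```python
import time

def run_inference(model, prompt):
    """One inference request: a single forward pass, no weight updates."""
    start = time.perf_counter()
    response = model(prompt)  # the trained model simply generates a response
    latency_ms = (time.perf_counter() - start) * 1000
    return response, latency_ms

# Stand-in for a trained model. In production this would be a large
# neural network served on accelerator hardware such as Google's TPUs.
def toy_model(prompt):
    return f"Response to: {prompt}"

# Serving at scale means answering many such requests quickly, which is
# why inference chips are measured on latency and requests per second.
for prompt in ["What is a TPU?", "Summarize the quarter."]:
    answer, ms = run_inference(toy_model, prompt)
    print(f"{answer} ({ms:.3f} ms)")
```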
Google is calling the new era in the AI industry the "age of inference," where organizations shift focus from training AI models to using those models to perform useful tasks. Agentic AI, the current buzzword in the industry, is ultimately just a string of AI inference tasks. Google expects near-exponential growth in demand for compute as AI is increasingly put to use.
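Here is a hedged sketch of that idea, using invented names (run_agent, max_steps) rather than any real framework: an agent can be modeled as a loop that feeds each model output back in as the next input, so every step is one more inference request.

```python
def run_agent(model, goal, max_steps=3):
    """Toy agent loop: each iteration is one more inference request."""
    context = goal
    for _ in range(max_steps):
        context = model(context)  # inference call; output becomes new input
    return context

# One user task fans out into several model calls, multiplying the
# inference volume a cloud provider has to serve.
print(run_agent(lambda text: f"[step taken on: {text}]", "plan a product launch"))
```

Because a single task fans out into several inference calls, agentic workloads multiply the load on serving infrastructure, which is the dynamic behind Google's compute growth forecast.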
For AI companies like Anthropic, which recently signed a deal to expand its use of Google's TPUs for both training and inference, efficiency is critical. Anthropic will have access to 1 million TPUs under the new deal, which will help it push toward its goals of growing revenue to $70 billion and becoming cash-flow positive in 2028. The efficiency of Google's new TPUs was likely a key selling point.

Powering Google Cloud growth
Google's cloud computing business has long lagged behind Microsoft Azure and Amazon Web Services (AWS), but AI could help the company catch up. Microsoft and Amazon are also aggressively building out AI computing capacity, and each designs its own custom AI chips. Google Cloud, while smaller, is growing quickly and gaining ground on AWS.
In the third quarter, Google Cloud produced revenue of $15.2 billion, up 34% year over year, along with operating income of $3.6 billion, good for an operating margin of roughly 24%. Meanwhile, AWS grew revenue 20% to $33 billion in the third quarter, while Microsoft's Azure and other cloud services revenue grew 40%.
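As a quick sanity check on that margin figure (a back-of-the-envelope calculation using only the two reported numbers above):

```python
# Operating margin implied by Google Cloud's reported Q3 figures.
operating_income = 3.6e9  # dollars
revenue = 15.2e9          # dollars
print(f"Operating margin: {operating_income / revenue:.1%}")  # -> 23.7%
```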
As more organizations move from experimenting with AI to deploying real workloads that demand significant inference capacity, Google is positioned to benefit thanks to its massive fleet of TPUs. The company has been working on these chips for a decade, potentially giving it an edge as demand for AI computing capacity explodes.