Alphabet's (NASDAQ: GOOG)(NASDAQ: GOOGL) Google has been designing its own custom artificial intelligence (AI) accelerators for years. The company's Tensor Processing Units, or TPUs, are now in their seventh generation. Unlike Nvidia's GPUs, which are general-purpose chips, Google's TPUs are application-specific integrated circuits built solely for AI workloads.
Google announced on Thursday that Ironwood, its seventh-generation TPU, would be available to Google Cloud customers in the coming weeks. The company also said its new Arm-based Axion virtual machine instances are now in preview, which it says deliver major improvements in performance per dollar. With these new cloud products, Google aims to lower the cost of AI inference and agentic AI workloads.
The dawn of the "age of inference"
While Google's Ironwood TPU can handle AI training, the compute-heavy process of ingesting massive amounts of data to build a model, it's also well suited for high-volume AI inference workloads. "It offers a 10X peak performance improvement over TPU v5p and more than 4X better performance per chip for both training and inference workloads compared to TPU v6e (Trillium), making Ironwood our most powerful and energy-efficient custom silicon to date," according to Google's blog post announcing the upcoming launch of Ironwood.
While new AI models will still need to be trained, Google sees the balance of demand shifting toward inference. AI inference is the act of using a trained AI model to generate a response, and a single inference request is less computationally intensive than a training run. However, chips meant for inference must deliver quick response times and handle a high volume of requests.
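To make that distinction concrete, here's a minimal, purely illustrative Python sketch. The run_inference helper and the toy_model stand-in are hypothetical, not any real Google API; the point is that inference is a forward pass repeated at scale, so it's judged on response time and throughput rather than raw training horsepower.

```python
import time

def run_inference(model, prompt):
    """One inference request: a single forward pass, no weight updates."""
    start = time.perf_counter()
    response = model(prompt)  # the trained model simply generates a response
    latency_ms = (time.perf_counter() - start) * 1000
    return response, latency_ms

# Stand-in for a trained model. In production this would be a large
# neural network served on accelerator hardware such as Google's TPUs.
def toy_model(prompt):
    return f"Response to: {prompt}"

# Serving at scale means answering many such requests quickly, which is
# why inference chips are measured on latency and requests per second.
for prompt in ["What is a TPU?", "Summarize the quarter."]:
    answer, ms = run_inference(toy_model, prompt)
    print(f"{answer} ({ms:.3f} ms)")
```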
Google is calling the new era in the AI industry the "age of inference," where organizations shift focus from training AI models to using those models to perform useful tasks. Agentic AI, the current buzzword in the industry, is ultimately just a string of AI inference tasks. Google expects near-exponential growth in demand for compute as AI is increasingly put to use.
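Here is a hedged sketch of that idea, using invented names (run_agent, max_steps) rather than any real framework: an agent can be modeled as a loop that feeds each model output back in as the next input, so every step is one more inference request.

```python
def run_agent(model, goal, max_steps=3):
    """Toy agent loop: each iteration is one more inference request."""
    context = goal
    for _ in range(max_steps):
        context = model(context)  # inference call; output becomes new input
    return context

# One user task fans out into several model calls, multiplying the
# inference volume a cloud provider has to serve.
print(run_agent(lambda text: f"[step taken on: {text}]", "plan a product launch"))
```

Because a single task fans out into several inference calls, agentic workloads multiply the load on serving infrastructure, which is the dynamic behind Google's compute growth forecast.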
For AI companies like Anthropic, which recently signed a deal to expand its use of Google's TPUs for both training and inference, efficiency is critical. Anthropic will have access to 1 million TPUs under the new deal, which will help it push toward its goals of growing revenue to $70 billion and becoming cash-flow positive in 2028. The efficiency of Google's new TPUs was likely a key selling point.

Powering Google Cloud growth
Google's cloud computing business has long lagged behind Microsoft Azure and Amazon Web Services (AWS), but AI could help the company catch up. Microsoft and Amazon are also aggressively building out AI computing capacity, and each designs its own custom AI chips. Google Cloud, while smaller, is growing quickly and gaining ground on AWS.
In the third quarter, Google Cloud produced revenue of $15.2 billion, up 34% year over year, along with operating income of $3.6 billion, good for an operating margin of roughly 24%. Meanwhile, AWS grew revenue 20% to $33 billion in the third quarter, while Microsoft's Azure and other cloud services revenue grew 40%.
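As a quick sanity check on that margin figure (a back-of-the-envelope calculation using only the two reported numbers above):

```python
# Operating margin implied by Google Cloud's reported Q3 figures.
operating_income = 3.6e9  # dollars
revenue = 15.2e9          # dollars
print(f"Operating margin: {operating_income / revenue:.1%}")  # -> 23.7%
```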
As more organizations move from experimenting with AI to deploying real workloads that demand significant inference capacity, Google is positioned to benefit thanks to its massive fleet of TPUs. The company has been working on these chips for a decade, potentially giving it an edge as demand for AI computing capacity explodes.