Intel (INTC) has largely missed out on the booming market for AI accelerators. The company's Gaudi line of AI chips, which came from its acquisition of Habana Labs in 2019, failed to gain any real traction, partly due to immature software and an unfamiliar architecture. Falcon Shores, which was meant to be a more traditional data center AI GPU, was scrapped as a commercial product earlier this year.
While Intel is likely too late to compete in the market for AI accelerators used for training models, AI inference is another story. Training workloads require massive computational horsepower and data throughput; inference workloads aren't nearly as demanding. As AI models and AI agents are deployed to tackle a growing number of use cases, the need for efficient and affordable AI inference chips is on the rise.

Image source: Intel.
Getting back into the AI game
Intel announced a brand-new AI GPU on Tuesday at the 2025 OCP Global Summit. Unlike its previous efforts, Intel's upcoming GPU will be solely focused on AI inference.
"AI is shifting from static training to real-time, everywhere inference -- driven by agentic AI," said Intel CTO Sachin Katti. "Intel's Xe architecture data center GPU will provide the efficient headroom customers need -- and more value -- as token volumes surge."
Code-named Crescent Island, Intel's new AI GPU will be based on the Xe3P architecture, an enhancement of the Xe3 architecture that will be used in the company's Panther Lake PC CPUs. The GPUs will be optimized for performance per watt and feature 160GB of LPDDR5X memory. Intel says that Crescent Island is ideal for "tokens-as-a-service" providers that charge by the token for AI models, as well as for general AI inference workloads.
Crescent Island is still in the works, and the company doesn't expect to sample the product with customers until the second half of 2026.
A second chance for Intel
While Nvidia dominates the market for GPUs used for AI training, the AI inference market is more competitive because the latest-and-greatest chips aren't necessarily required. For many agentic AI tasks, smaller models that are cheap and fast to run are good enough. Performance still matters for the GPUs running these models, but so does efficiency.
Cloudflare is an example of this dynamic in action. The company offers AI inference services through its Workers platform, and because it focuses solely on inference with smaller AI models, it can get away with using older GPUs that are less expensive to deploy.
By 2030, the AI inference market is expected to more than double in size to over $250 billion, according to estimates from MarketsandMarkets. While Intel can't compete with Nvidia in the AI training market, it still has an opportunity to emerge as a winner in the AI inference market by focusing on efficiency. For AI inference providers that charge by the token, GPUs that maximize performance per watt should be appealing as a way to bring down costs.
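To see why performance per watt translates into lower costs for token-billed providers, consider a rough back-of-the-envelope sketch. All numbers below are illustrative assumptions, not published specs for Crescent Island or any other GPU:

```python
# Back-of-the-envelope: how GPU efficiency feeds into cost per token.
# All figures here are illustrative assumptions, not real product specs.

def power_cost_per_million_tokens(tokens_per_second: float,
                                  watts: float,
                                  dollars_per_kwh: float = 0.08) -> float:
    """Electricity cost to generate one million tokens on one GPU."""
    seconds = 1_000_000 / tokens_per_second
    kwh = watts * seconds / 3_600_000  # watt-seconds -> kilowatt-hours
    return kwh * dollars_per_kwh

# Two hypothetical inference GPUs with identical throughput:
baseline = power_cost_per_million_tokens(tokens_per_second=5_000, watts=700)
efficient = power_cost_per_million_tokens(tokens_per_second=5_000, watts=350)

print(f"baseline:  ${baseline:.4f} per million tokens")
print(f"efficient: ${efficient:.4f} per million tokens")
# At equal throughput, halving power draw halves the electricity
# cost per token -- savings that compound across thousands of GPUs.
```

Power is only one line item alongside hardware and cooling, but for a provider running inference around the clock, a GPU that delivers the same tokens per second at lower wattage directly widens margins on every token sold.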
While Intel's new AI GPU looks promising, a lot could change over the next 18 months. Sampling to customers is still about a year away, so real revenue probably isn't coming until 2027. Given how fast the AI industry is evolving, and the risk that a bubble may be brewing in the AI infrastructure market, Intel could end up late for the party once again.