E-commerce and cloud computing giant Amazon.com (NASDAQ:AMZN) just announced that the popular Amazon Alexa digital assistant is running on Amazon's own hardware instead of chips designed by Nvidia (NASDAQ:NVDA). In a blog post aimed at Amazon Web Services (AWS) developers on November 12, technical evangelist Sébastien Stormacq said that "the vast majority" of Alexa's machine learning workloads now run on Amazon's AWS Inferentia chips.
To be clear, nothing has changed in the Amazon Echo devices and other Alexa-powered gear you might buy for the holidays. The silicon shift happened on the back end of Alexa's services, where data is sent over to AWS cloud systems for final processing. Inferentia was explicitly designed to run neural network software, which is what Alexa relies on to interpret spoken commands.
According to Amazon's early tests, the new Inferentia clusters deliver the same results as Nvidia's T4 chips, but at 25% lower latency and 30% lower cost. The lower latency will allow Alexa developers to run more advanced analyses of the incoming data without leaving the user waiting for a slow calculation.
Amazon launched the Inferentia processor line two years ago, aiming to maximize processing speeds on the company's artificial intelligence workloads while also delivering cost savings by cutting out the middleman in the chip-designing process. The original designs came from Annapurna Labs, a specialized chip designer that Amazon acquired in 2015.
Alexa is not the first Amazon product to rely on the Inferentia-powered Inf1 AWS instances. Amazon's face recognition tool, Rekognition, is also moving over to Inf1 instances. AWS customers are free to use Inf1 and Inferentia for their own projects, too. For example, Snapchat parent Snap (NYSE:SNAP), health insurance giant Anthem (NYSE:ANTM), and global publishing house Condé Nast are already using Amazon's Inferentia-based neural network instances to boost their artificial intelligence projects.