
Lost in Translation: Ford Teams With Nuance Communications to Master Human Language

By Xconomy.com – Updated Apr 6, 2017 at 8:54PM


Ford can't force people to adjust their speech to use its commands. Instead, Ford wants to figure out how to interpret natural human speech.

"Call John Smith."

"I wanna call John Smith."

At first glance, these sentences look pretty similar. But try telling that to the voice recognition technology behind Ford Motor's (NYSE: F) SYNC system. You might as well be speaking Greek.

Voice recognition software has come a long way in recent years. Google's (Nasdaq: GOOG) Android platform, for example, allows users to search for information by speaking into their smartphones. But mastering the subtleties of human language remains beyond the reach of even our most sophisticated technology. (Remember how the IBM supercomputer Watson was kicking some serious butt on Jeopardy! until the final round, when it answered "Toronto" to a question about U.S. cities?)

Voice commands are a key component of Ford's SYNC system; the company, based in Dearborn, MI, promotes SYNC as a safety feature because it allows drivers to do stuff without taking their hands off the wheel.

With that in mind, Ford recently said it will partner with the appropriately named Nuance Communications, based in Burlington, MA, to develop software that can recognize not only specific words and phrases but also the intent of the person speaking them.

Normally, SYNC relies on what's called "structured commands." The company basically records phrases that drivers must speak in order for the car to execute their wishes.

There are two problems with this technique. First, there are an awful lot of commands. Drivers today can order their cars to do everything from make phone calls and find directions to play music and adjust the cabin temperature.

Second, Ford is only guessing what people will say, and more often than not the guess does not match what they actually say. For instance, Ford initially programmed SYNC to recognize the command "Play Tracks." Unless you work in the music business, you probably don't even know what a track is. A better command would be "Play Songs."

"You can't stop someone from saying something," says Brigitte Richardson, Ford's lead engineer on its global voice control technology/speech systems.

In other words, Ford can't force people to adjust their speech to use its commands. People are going to speak how they are going to speak.

Working with Nuance, Ford wants to develop software based on more advanced algorithms called "statistical language modeling" (SLM).

The concept, first developed in the 1980s, estimates how likely people are to group together particular words, phrases, and sentences, based on their natural speech patterns. Nuance specifically wants to organize the words into "semantic classifications" of meanings. Based on that work, Nuance is developing an "inference engine" that can learn, understand, and interpret voice commands, says Ed Chrumka, Nuance's senior product manager of connected car services.
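To make the idea of estimating probabilities over word groupings concrete, here is a minimal sketch of a bigram language model, one of the simplest forms of statistical language modeling. It is an illustration only, not Ford's or Nuance's software: the toy corpus, the function names, and the add-one smoothing are all assumptions chosen for brevity.

```python
from collections import Counter

# Hypothetical toy corpus of things drivers might say (not real Ford or Nuance data).
CORPUS = [
    "call john smith",
    "i wanna call john smith",
    "i wanna call mom",
    "play songs",
    "i wanna play songs",
]


def train_bigrams(sentences):
    """Count context words and adjacent word pairs, with start/end markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(words[:-1])            # every word that has a successor
        bigrams.update(zip(words, words[1:]))  # every adjacent pair
    return contexts, bigrams


def sentence_probability(sentence, contexts, bigrams, vocab_size):
    """Multiply add-one-smoothed probabilities of each word given the previous one."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        prob *= (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
    return prob


CONTEXTS, BIGRAMS = train_bigrams(CORPUS)
# Vocabulary of possible next tokens: every corpus word plus the end marker.
VOCAB_SIZE = len({w for s in CORPUS for w in s.split()} | {"</s>"})

natural = sentence_probability("i wanna call john smith", CONTEXTS, BIGRAMS, VOCAB_SIZE)
scrambled = sentence_probability("smith john call wanna i", CONTEXTS, BIGRAMS, VOCAB_SIZE)
print(natural > scrambled)
```

The model assigns a higher probability to the natural word order than to the same words scrambled, which is the kind of signal a recognizer can use to prefer plausible phrasings. A production system would be trained on far more data and would use a much richer model.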

Thus, the car will better match the driver's actual words to their actual intent. In theory, a car will learn a driver's linguistic habits so eventually there will be no difference between "Call John Smith," (the original command the car recognizes) and "I wanna call John Smith," a phrase the driver is more likely to use.
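Matching different phrasings to the same intent can be sketched with a small statistical classifier. The example below uses a naive Bayes model over words as a stand-in for the "semantic classifications" described above. The article does not say which technique Nuance uses, so the algorithm, the intent labels, and the training phrases here are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled utterances; the intent names are invented for illustration.
TRAINING = [
    ("call john smith", "phone_call"),
    ("i wanna call john smith", "phone_call"),
    ("dial my office", "phone_call"),
    ("phone mom please", "phone_call"),
    ("play tracks", "play_music"),
    ("play songs", "play_music"),
    ("i wanna hear some music", "play_music"),
    ("put on some songs", "play_music"),
]


def train(examples):
    """Count how often each word appears under each intent."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocab = set()
    for text, intent in examples:
        words = text.lower().split()
        word_counts[intent].update(words)
        intent_counts[intent] += 1
        vocab.update(words)
    return word_counts, intent_counts, vocab


WORD_COUNTS, INTENT_COUNTS, VOCAB = train(TRAINING)


def guess_intent(text):
    """Return the intent with the highest naive Bayes log-score for the utterance."""
    total = sum(INTENT_COUNTS.values())
    best_intent, best_score = None, -math.inf
    for intent, count in INTENT_COUNTS.items():
        score = math.log(count / total)  # prior: how common the intent is
        denom = sum(WORD_COUNTS[intent].values()) + len(VOCAB)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((WORD_COUNTS[intent][word] + 1) / denom)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


print(guess_intent("Call John Smith"))
print(guess_intent("I wanna call John Smith"))
```

Both phrasings land on the same intent label, even though only one of them matches a fixed command word for word. The classifier also tolerates filler words it has never seen, which is the practical advantage over a list of structured commands.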

SLM is "a totally different way of doing things," Richardson says. "It wants to incorporate more natural ways of talking."

But SLM is easier said than done (pun intended). Human language is complex and broad and entails an almost infinite number of combinations of phrases, words, and sentences. For a computer, recognizing words is the easy part. Figuring out the speaker's actual meaning is a whole different ball game.

"Perhaps the most frustrating aspect of statistical language modeling is the contrast between our intuition as speakers of natural language and the over-simplistic nature of our most successful models," Ronald Rosenfeld, a computer science professor at Carnegie Mellon University, wrote in a research paper. "As native speakers, we feel strongly that language has a deep structure. Yet we are not sure how to articulate that structure, let alone encode it, in a probabilistic framework."

Ford's job is even tougher. While Google's voice recognition technology can draw upon the vast resources of cloud-based Internet servers, Ford wants to program its speech software onto a single, self-contained chip installed in the car.

Relying on outside servers will force drivers to spend more money on data plans, Richardson says. Plus it will probably take more time for the car to execute the driver's command, she says.

However, "you are going to have to have adequate [computing] horsepower in the car to deliver the experience we are all looking for," Chrumka of Nuance says.


Thomas Lee is Detroit Editor and National Med Tech Editor for Xconomy. He can be reached at [email protected]

The Motley Fool owns shares of Google and Ford Motor. Motley Fool newsletter services have recommended buying shares of Google and Ford Motor. The Motley Fool has a disclosure policy.
