Every Wi-Fi network in every home and business broadcasts both public data -- such as its network name and unique machine identifier -- and "payload data," or actual content such as e-mails and Web pages. For the last several years, Google
To stem concerns about the potential misuse of the data, the search giant has temporarily grounded its Street View fleet and is working with regulators in Europe -- where an audit request this month triggered the discovery -- to ensure that the private data is properly deleted. But while Google has traced the problem to a communications breakdown between its software engineers and Street View project leaders, a local observer familiar with location-finding technology says the crisis may have originated earlier, with specific technical decisions about how Google collects Wi-Fi data.
"It's really a matter of the questions you ask each [Wi-Fi] access point," says Ted Morgan, CEO and co-founder of Boston-based Skyhook Wireless. "There are a couple of different approaches to getting the signal data; one of them is active scanning, and the other is passive sniffing. Both techniques have their pros and cons, but when you are doing the passive sniffing you have to make sure you are not accessing private network messages. It's not a hard thing to do; you just do not record those messages."
Skyhook has been collecting data on the locations of Wi-Fi networks around the world since 2003, to feed the database behind the location-finding software that it licenses to mobile device makers such as Apple, Motorola, and Dell. Skyhook has used only active scanning to collect the data, Morgan says, whereas Google's Street View teams employ passive sniffing.
And that's what seems to have set up Google for the current crisis. In a post on the company blog on Friday, Alan Eustace, a senior vice president of engineering and research at Google, said an engineer working on an experimental Wi-Fi project in 2006 "wrote a piece of code that sampled all categories of publicly broadcast Wi-Fi data. A year later, when our mobile team started a project to collect basic Wi-Fi network data like SSID information and MAC addresses using Google's Street View cars, they included that code in their software -- although the project leaders did not want, and had no intention of using, payload data."
Google surveys Wi-Fi networks for the same basic reason Skyhook does -- to provide an additional way, beyond GPS and cell tower triangulation, for phones (in Google's case, those powered by its Android operating system) to determine their locations. The devil, as always, is in the details. In active scanning, Wi-Fi surveyors driving down a public street send out probe requests that ask every Wi-Fi access point within range to respond. This happens very quickly. The downside is that if an in-range access point happens to be busy -- say, helping its owner download e-mail -- it won't respond to the probe request, so the surveyors will miss that network.
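To make the idea of a probe request concrete, here is a minimal Python sketch of what such a frame looks like on the air, based on the standard 802.11 management-frame layout. It is an illustration only, not Skyhook's or Google's code: the function name `build_probe_request` and the sample MAC address are hypothetical, and the sketch only builds the bytes. Actually transmitting them would need a radio and driver that support frame injection, which is not shown.

```python
import struct

BROADCAST = bytes([0xFF] * 6)   # ff:ff:ff:ff:ff:ff, i.e. "everyone in range"
PROBE_REQUEST_FC = 0x40         # version 0, type 0 (management), subtype 4 (probe request)


def build_probe_request(source_mac: bytes, seq_num: int = 0) -> bytes:
    """Build a wildcard 802.11 probe request (hypothetical helper, no radio I/O)."""
    if len(source_mac) != 6:
        raise ValueError("source_mac must be exactly 6 bytes")

    # 24-byte management header, little-endian, no padding.
    header = struct.pack(
        "<BBH6s6s6sH",
        PROBE_REQUEST_FC,          # frame control, first byte
        0x00,                      # frame control flags
        0,                         # duration
        BROADCAST,                 # address 1: receiver (all access points)
        source_mac,                # address 2: the surveyor's own radio
        BROADCAST,                 # address 3: wildcard BSSID
        (seq_num & 0x0FFF) << 4,   # sequence control (fragment number 0)
    )

    # Element ID 0 with length 0 is the wildcard SSID: "any network, please answer".
    ssid_element = bytes([0, 0])
    # Element ID 1: supported rates in 500 kbit/s units (1, 2, 5.5, 11 Mbit/s).
    rates_element = bytes([1, 4, 0x02, 0x04, 0x0B, 0x16])

    return header + ssid_element + rates_element
```

The frame carries no question beyond "who is out there?", so the replies are the access points' own announcements rather than anything their users are sending.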
The way around that problem is to use passive sniffing, which picks up all of the traffic traveling over active Wi-Fi networks, including key identifiers such as SSIDs (network names) and MAC addresses (similar to serial numbers, these are unique to each Wi-Fi router). The downside of passive sniffing is that it's slower than active scanning, since routers may be broadcasting on any of a dozen channels, and each must be sniffed individually. "And you have to make sure you do not capture any of the network messages," says Morgan.
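Morgan's point that a passive sniffer can simply decline to record network messages can be sketched in a few lines of Python. The example below assumes raw 802.11 frames with no capture-layer header in front of them, and the function name `extract_public_fields` is hypothetical; it is not the code Google or Skyhook runs. It keeps only beacon and probe-response frames, which carry the network name and router address, and returns nothing for data frames, so their contents are never stored.

```python
from typing import Optional, Tuple

MGMT_TYPE = 0             # 802.11 frame type 0 = management
BEACON_SUBTYPE = 8        # periodic announcement from an access point
PROBE_RESP_SUBTYPE = 5    # reply to a probe request
HEADER_LEN = 24           # management frame header
FIXED_PARAMS_LEN = 12     # timestamp (8) + beacon interval (2) + capabilities (2)
SSID_ELEMENT_ID = 0


def extract_public_fields(frame: bytes) -> Optional[Tuple[str, str]]:
    """Return (router MAC, SSID) for announcement frames, None for everything else."""
    if len(frame) < HEADER_LEN + FIXED_PARAMS_LEN:
        return None

    frame_control = frame[0]
    frame_type = (frame_control >> 2) & 0b11
    subtype = (frame_control >> 4) & 0b1111

    # Data and control frames are dropped here, before anything is recorded.
    if frame_type != MGMT_TYPE or subtype not in (BEACON_SUBTYPE, PROBE_RESP_SUBTYPE):
        return None

    # Address 3 of a beacon or probe response is the BSSID (the router's MAC).
    bssid = ":".join(f"{b:02x}" for b in frame[16:22])

    # Walk the tagged elements (ID, length, value) looking for the SSID.
    ssid = ""
    pos = HEADER_LEN + FIXED_PARAMS_LEN
    while pos + 2 <= len(frame):
        element_id, length = frame[pos], frame[pos + 1]
        value = frame[pos + 2 : pos + 2 + length]
        if element_id == SSID_ELEMENT_ID:
            ssid = value.decode("utf-8", errors="replace")
            break
        pos += 2 + length

    return bssid, ssid
```

The design choice that matters is the early return: the filter runs on the frame type before any parsing or storage happens, which is the "just do not record those messages" step Morgan describes.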
Skyhook has never employed passive sniffing, in part because of the privacy challenges, Morgan says. "We have just found [active scanning] is more consistently reliable," he says. "We feel very comfortable with the data we're collecting, and it also keeps us from ever having to be perceived like we're in the kind of situation that Google's in. It's actually impossible, with the approach we take right now, to observe or capture any private network data."
Nor would it be possible for Google to record such data completely by accident, Morgan says. "At the engineering level it's very easy to know whether you are capturing this data or not," he says. So the error at Google, he says, probably happened "higher up the food chain…An engineer doesn't care, and grabs whatever he can. But when there's no one looking at it who's got the broader perspective to understand the implications, that's where the breakdown happens."
Morgan says the choice to use active scanning at Skyhook was part of a sensibility about privacy concerns that was baked into the startup's business model from the beginning. "It had to be, because early on, this was kind of an off-the-wall idea," he says. "This was before there was such a thing as Street View cars, and people didn't know what to make of it. So the first [thing] they would ask is, 'What are you scanning for?' We try to be very open about what type of data we collect and what we use it for so that we don't get tripped up in situations like this."
Eustace said in Friday's post that Google will work with an outside party to review its Wi-Fi scanning software and confirm that the recorded payload data has been deleted appropriately. The company is also reviewing its procedures "to ensure that our controls are sufficiently robust to address these kinds of problems in the future," Eustace wrote. He pointed out that payload data can be made inaccessible even to malicious passive sniffers by using the encryption features built into all modern Wi-Fi routers.
And he issued a dramatic apology. "The engineering team at Google works hard to earn your trust -- and we are acutely aware that we failed badly here," Eustace wrote. "We are profoundly sorry for this error and are determined to learn all the lessons we can from our mistake."