The Government Shutdown and the Local Data Economy
Whatever your opinions about the ongoing government shutdown, it’s interesting to consider the effect it is having – and the effect it isn’t having – on local search. For the average user of local search apps and services, it has been business as usual so far. The very lack of impact speaks volumes. Local data on the internet is just one of several data clusters that exist within a private information economy of search engines, marketplaces, and social networks that collectively own the means of access to the vast majority of online information. Indeed, the internet itself is managed by a network of private companies that the government would have a hard time shutting down, even if it wanted to.
For the most part, local search appears to demonstrate with flying colors the benefits of getting things done in the private sector. Not only is it a self-sustaining and profitable industry; it exhibits a drive to innovate that brings ever-improving services to our desktops and handheld devices at a dizzying pace. Imagine if local directories and apps were run by the same bureaucracy that manages the Postal Service, the IRS, and the Census Bureau. We’d probably still be using phone books.
Yet at a fundamental level, governmental authorities still act as objective reference points when it comes to information of interest to the public. If you’ve been at home without power or water, you know the experience of discovering with surprise all the little ways you rely on those services to always be available. So too with public access to government data. Websites and APIs providing critical data access to the public are either unavailable or are not getting updated during the shutdown. This includes NASA, the National Archives, the USDA, the US Geological Survey, the Census Bureau, and many others. As the Geological Survey website puts it, “Only web sites necessary to protect lives and property will be maintained.” Yet no information was available from the USGS about a magnitude 3.0 earthquake that hit the San Francisco Bay Area on October 6, until someone, perhaps violating the furlough, went online and updated the site.
No one would argue that the location of the nearest Starbucks rises to the same level of importance as USGS earthquake updates, FDA food inspections, or environmental alerts from the EPA, all suspended during the shutdown. But we do have a little-acknowledged reliance on government data as the standard for validity or as a means to pass information between otherwise incompatible services.
Consider the humble street address. Without it, one can’t easily communicate the location of that Starbucks or the directions to reach it. Though it would seem that street address validity could be derived from datasets of numerous mapping companies, in fact they all rely on the US Postal Service as an arbiter of what it means for an address to be valid. The Postal Service issues and maintains ZIP codes, as we all know. The Postal Service also maintains an address validation standard known as the Coding Accuracy Support System (CASS). CASS software, designed by vendors who are certified by the Postal Service, corrects misspellings and inaccuracies and determines whether a given address is valid from the point of view of mail deliverability. Though designed to improve address accuracy of snail mail, CASS validation is a basic tool in the arsenal of local search directories, which also must provide valid addresses for their services to function properly.
For the time being, CASS validation doesn’t appear to be threatened by the shutdown. The same can’t be said of business category systems administered by the Census Bureau, whose website is down. By category, I mean the industry-specific phrase that identifies a business by type: plumber, landscaper, restaurant, insurance agent. Such categories may seem innocuous and self-evident, but in fact, you would be hard-pressed to find two online directories that employ the same terminology. This is because most directories consider their method of categorization to be proprietary.
There are some good reasons for this. Algorithmic search is involved in extracting data from directories, meaning there is an advantage to be gained if you’re better than your competitor at connecting a consumer search for “Apple” to the right results (grocery stores or computer stores). As business category is the primary match term for keyword searches, it makes sense to build a system that understands as many variations of common terms as possible. The problem is that any time business listing data is shared between systems, one needs a common language of business categories. Is it a “mover,” a “moving company,” or a provider of “moving services”? Without a common language, there’s no way to assign the proper category.
That’s where the Census Bureau comes in. The Standard Industrial Classification (SIC) system, developed by the federal government in the 1930s, and the North American Industry Classification System (NAICS), first released in 1997, are maintained by the Census Bureau and together form the lingua franca for business category information. Most online directories license business listings from large data providers, and many provide data in turn to large networks of media sites, regional directories, and the like. Stitching local data networks together are the standard terminologies provided by these classification systems. Without standard classification systems as a means of understanding and translating business categories, publishers could not provide relevant listings for keyword searches.
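To make the translation step concrete, here is a minimal sketch of how a shared classification code can bridge two proprietary category vocabularies. Everything in it is hypothetical: the directory names, the labels, and the helper functions are invented for illustration, and the NAICS codes shown should be checked against the official tables before being relied on. It is not a description of how any particular directory works.

```python
# Hypothetical crosswalk: (directory, proprietary label) -> shared NAICS code.
# Directory names and labels are invented; codes are illustrative only and
# should be verified against the official NAICS tables.
CROSSWALK = {
    ("directory_a", "mover"): "484210",
    ("directory_b", "moving company"): "484210",
    ("directory_c", "moving services"): "484210",
    ("directory_a", "plumber"): "238220",
    ("directory_b", "plumbing contractor"): "238220",
}


def to_shared_code(source, label):
    """Return the shared code for a directory's own label, or None if unknown."""
    return CROSSWALK.get((source, label.strip().lower()))


def translate(source, label, target):
    """Translate a label from one directory's vocabulary into another's.

    The shared code is the pivot: source label -> code -> target label.
    Returns None when either side has no entry in the crosswalk.
    """
    code = to_shared_code(source, label)
    if code is None:
        return None
    for (directory, local_label), shared in CROSSWALK.items():
        if directory == target and shared == code:
            return local_label
    return None


if __name__ == "__main__":
    print(translate("directory_a", "mover", "directory_b"))
    print(translate("directory_a", "plumber", "directory_c"))
```

The point of the pivot is that each directory only needs to map its own terms to the shared codes once, rather than maintaining a separate mapping to every partner's vocabulary. When a term has no entry, as with the plumber lookup against the third directory here, the listing cannot be categorized on the receiving side.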
Despite the current lack of access to Census Bureau services, the shutdown is unlikely to affect our ability to do business in the local search industry in any meaningful way. But it does give us occasion to reflect on the dependencies we have on government services we ordinarily take for granted.