Picking the Locksmiths: Bizyhood Unlocks the ‘Ghosts in the Data’ Problem
What would you do if you wanted to game Google into thinking you’ve got a vast network of local shops servicing area customers based on their search queries? No need to ask as some sharpies have already figured it out, according to a recent New York Times report. It seems the answer is to become a ghost — or thousands of them. Google acknowledges this is the case as well.
But how do they do it? The practice is not as simple as it seems, as it requires fake storefronts and other trickery that ultimately lead consumers to find that the inexpensive local locksmith advertising from around the corner is actually spawned from a lead-gen service a country away, and that friendly nearby locksmith is suddenly very expensive and not so nice. As I said, these folks are shrewd.
Now one startup is taking the practice on by combing through its data (from two of the four major data aggregators) and painstakingly removing the offenders. Surely it isn’t alone in the fight, but Bizyhood is an example of a small startup trying to deliver the best result for customers no matter the hurdles of fakery.
I talked to self-described serial entrepreneur Scott Barnett, co-founder of Bizyhood, about all this and delved into the backstory to unlock more detail about the ongoing battle. Barnett claims the whole “aggregator model is broken,” but that’s not stopping him and others from tackling the problem.
Tell me a bit about Bizyhood and how you got started.
My co-founder Eugene Fabrikant and I started working on Bizyhood in 2012, but really started focusing on it in 2014. We started out thinking we wanted to be a “better Yelp,” but we ultimately realized that it was silly to try to be better at something that was inherently broken. Our focus has always been on hyperlocal communities, and providing a way for businesses and consumers to engage and discover local services online. To this day, I still can’t find good, reliable information about the places, services and events I’m interested in near me, and most people I know say they can’t either.
A lot of folks have looked at this problem. Some of them start with the consumer (such as Yelp). Some start with the businesses (companies like ReachLocal and Yext). We have been focusing on a third source, what we think is the most important part of the hyperlocal ecosystem: the hyperlocal publisher. The local publisher’s site already has local authority, and our approach allows the local businesses and consumers to discover and share information on the publisher’s site. Bizyhood is tying together the three things necessary to dominate hyperlocal: the business, the consumer and the local expert (the publisher). In addition, local publishers do a great job of curation and make sure the content is contextually relevant to their readers, which existing review sites don’t do.
There are many directory services out there for SMBs. Why are you guys different?
We’re different because our directory is part of the platform that we license to publishers, so they can have an authoritative and active business directory on their site right away. We do maintain a nationwide list at www.bizyhood.com, but our focus is to enable business directories on sites like Red Bank Green, Racine County Eye, and Ditmas Park Corner, to name a few.
More importantly, our platform is not just the business directory. The business directory is the foundation for all the other features and capabilities we offer. You can’t be authoritative locally without a business directory, but the directory alone certainly does not make you authoritative! So, it’s the table stakes. Everybody knows what it is so it’s really visible, but to us it’s simply the start.
When did you first see the “fake” listings among your data and how did you discover them?
When we launched the first version of our publisher platform in the Fall of 2015, we started to notice it. We built a WordPress plugin that local publishers could use to easily add the Bizyhood functionality. Our studies have shown that over 80% of hyperlocal publishers use WordPress and they don’t tend to have the ability to custom-build on their site. So, even though we have an API (RESTful) that’s pretty easy to use/integrate, we knew that a WordPress plugin would be heavily used compared to asking publishers to integrate on their own. Part of our functionality is that we show the list of business categories for that community, sorted by volume. It was pretty surprising to see that ‘Locksmith’ was coming up in the top five in many of our communities, and sometimes it was the number one category. That certainly didn’t seem right.
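To make the "categories sorted by volume" view concrete, here is a minimal sketch in Python. It is not Bizyhood's code: the sample data, the `(community, category)` record shape, and the function name `categories_by_volume` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample data: one (community, category) pair per listing.
listings = (
    [("Stillwater", "Locksmith")] * 4
    + [("Stillwater", "Restaurant")] * 3
    + [("Stillwater", "Salon")] * 1
    + [("Red Bank", "Restaurant")] * 2
)


def categories_by_volume(listings, community):
    """Return (category, count) pairs for one community, largest count first."""
    counts = Counter(cat for comm, cat in listings if comm == community)
    return counts.most_common()


# Print the ranked category list for a single community.
for category, count in categories_by_volume(listings, "Stillwater"):
    print(f"{category}: {count}")
```

A category such as 'Locksmith' landing at the top of this list for a small town is the kind of result that would prompt a closer look.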
A few of our publishers specifically came out and called us on it. Stillwater Current serves a pretty focused area not far from Minneapolis. The publisher there used to work at Patch and he knows a ton of businesses in his town. He complained almost immediately that there were way too many locksmiths in the directory. He only knew of one legitimate locksmith in town, and yet he had over 40 from our database.
We wondered if it was just us, so we started searching for these locksmiths on other directories (including Google). To our surprise, they were everywhere.
Then we saw this article in The New York Times about locksmith spam and that clinched it for us.
Tell me about your first thoughts on what to do to ensure the integrity of the data?
We’re still in the very early stages, but we’ve always talked about a “bottom up” approach to building out our communities. In other words, we let the authority stem from the publishers and businesses themselves. We do have a verification process that requires business owners to put in a little bit of effort to prove they are the owners. It’s certainly not foolproof, but it does dramatically reduce the number of fake listings. As an example, I went back and looked over the past year … there have been over 100 locksmiths that have requested to add their listing to our site. To date, only one has properly confirmed ownership.
At the same time, over the past year, over 700 businesses have successfully confirmed ownership — and these are overwhelmingly in the communities where we have a relationship with a local publisher. This is obviously a small drop in the bucket of the overall number of businesses in the US, but we’re confident that building up this authority and integrity on a community by community basis is the right approach. We do our own verification, and combined with the publisher’s knowledge and curation of their own communities, we have two distinct parties highly motivated to provide great current information.
So how do you ultimately cleanse the “ghosts” from your data?
For locksmiths, we decided that the percentage of fakes was so high that we would delete all the locksmiths from our own database, and only allow confirmed listings. This means that many communities now have no locksmiths represented in our database, but we felt that was a worthy tradeoff for two reasons. First, not that many people are looking for locksmiths on Bizyhood now anyway (although that will change!).
Second, since our primary goal is to build authority “bottom up” in each community, it’s critical that we start in those communities where we already have a relationship with the publisher. We have already done significant clean-up in many communities (for example, removing listings of businesses that went out of business, which is traditionally a very hard thing to keep current), but this was one area that was systemic across the board and this drastic approach seemed appropriate.
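The rule described here (remove every listing in a suspect category unless its owner has confirmed it) can be sketched in a few lines of Python. This is an illustration only, not Bizyhood's implementation: the `Listing` record, its `confirmed` flag, and the function name `purge_unconfirmed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    category: str
    confirmed: bool  # assumed flag: True once the owner has passed verification


def purge_unconfirmed(listings, category):
    """Drop unconfirmed listings in `category`; leave all other listings alone."""
    return [
        item for item in listings
        if item.category != category or item.confirmed
    ]


# Hypothetical sample directory for one community.
directory = [
    Listing("A-1 Locksmith", "Locksmith", confirmed=False),
    Listing("Main Street Lock & Key", "Locksmith", confirmed=True),
    Listing("Corner Cafe", "Restaurant", confirmed=False),
]

cleaned = purge_unconfirmed(directory, "Locksmith")
print([item.name for item in cleaned])
```

Note that the unconfirmed restaurant survives: the purge applies only to the category judged to be systemically unreliable.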
How about other categories? Are you seeing a lot of lead-gen shenanigans elsewhere?
There are a few other categories that seem to have a larger percentage of listings than makes sense economically for communities of a given size. We haven’t seen anything as drastic as locksmith, but there may be other categories we look at. Towing is an example of a category where the number of listings seems hard to justify. Somewhat related is the number of listings that come in with the name of the town/community in their business name. It’s obvious that these aren’t the real business names, but clearly people feel that having the town name in their business name is a “ranking signal.” But that isn’t systemic across the board, so we simply don’t approve those types of listings, but we wouldn’t delete an entire set of listings like that because it’s not a high enough percentage.
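A rough sketch of the town-name check, in Python, might look like the following. The function names and sample data are assumptions for illustration, and since plenty of legitimate businesses carry their town's name, a match is best treated as a reason for manual review rather than proof of spam.

```python
import re


def name_contains_town(business_name, town):
    """True if `town` appears as a whole phrase in the name, ignoring case."""
    pattern = r"\b" + re.escape(town) + r"\b"
    return re.search(pattern, business_name, flags=re.IGNORECASE) is not None


def flagged_share(business_names, town):
    """Fraction of names that contain the town name (0.0 for an empty list)."""
    if not business_names:
        return 0.0
    flagged = sum(name_contains_town(name, town) for name in business_names)
    return flagged / len(business_names)


# Hypothetical submissions for one community.
submissions = [
    "Red Bank Towing",
    "Joe's Pizza",
    "Main Street Salon",
    "Best red bank Locksmith",
]

for name in submissions:
    if name_contains_town(name, "Red Bank"):
        print(f"Hold for review: {name}")
```

The share returned by `flagged_share` gives a crude way to judge whether the pattern is widespread enough in a category to justify a wholesale purge or only case-by-case rejection.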
If this is a Google problem, what’s the solution?
This is a data integrity issue. Because there is no central repository for business information, chaos reigns. It’s odd to say this, but I really do miss the days of the Yellow Pages. It was a single and authoritative list of businesses; if you weren’t there, you essentially did not exist. We feel very bad for small and local business owners; this is not a problem they created nor is it a problem they want. But it’s so easy to set up a business directory online, and “near me”-type searches on Google are growing so quickly, that there are a lot of eyeballs on this problem.
We think a great way to solve this is by tackling it on a community by community basis. There was an article recently that explained that Google has teams of people mining search results to verify business listings or determine when businesses open up and shut down. But there are already people that know a lot about this stuff, and in fact report on it as a service to their local community. We feel they are the right starting point to make local data integrity a reality.