Augmented reality (AR) has big implications for local commerce, given that its main function is to fuse the digital and physical. That aligns nicely with the digital/physical handoff inherent in the typical consumer journey, a.k.a. online-to-offline (O2O) commerce. We wrote a whole white paper about it.
But AR may not play out in the way you think, at least in the near term. Though it’s generally thought of as graphical overlays on your field of view, another “overlay” could prove more viable first: sound. This “audio AR” modality could come sooner than—and eventually coexist with—its graphical cousin.
So what do we mean by Audio AR? As previewed last week by my colleague Joe Zappa, Audio AR uses sound waves instead of photons to inform you about surroundings—and in potentially commerce-oriented ways. It’s a subtle whisper to help you do things like navigate or find local businesses.
I often joke that the original AR was radio. It “augments” your perception of the world while driving, jogging, etc. Audio AR extends that with more intelligent layers of information. Just like graphical AR is evolving in contextually relevant ways (see AR cloud), audio AR could develop similar nuanced layers of intelligence.
But audio has a few inherent advantages over graphics-based AR. Because it’s so easy to absorb, sound can sidestep AR’s stylistic barriers. That goes for AR glasses (e.g., “glassholes”) as well as the nearer-term play: mobile AR. Holding up your phone, à la visual search, isn’t always a good look.
Audio’s discreet dynamics, by contrast, make it a natural fit. A subtle whisper in your ear delivers information without the style crimes of AR’s other modalities. And beyond the navigation and local commerce use cases above, there are lots of potential killer apps that could build on these natural advantages.
Think of the possibilities with contact-based applications that tell you valuable background info on someone you’re about to have lunch with, or whom you’re shaking hands with at a conference. This turns us all into secret service agents, armed with a subtle whisper of valuable intel wherever we go.
As for how it plays out, the hardware installed base needs to be established, and user behavior has to be conditioned for an “all-day” wearable. This is where AR glasses have suffered. But for audio, we’re already halfway there, given that Apple has conditioned users with its most successful product in years: AirPods.
I’m calling this device category “hearables,” and we’re off to a good start given the 62 million AirPod-like devices sold in 2018. And as always, Apple’s lead will be followed by hardware commoditization by third parties, as well as falling hardware component costs that drive ubiquitous penetration.
Then it’s all about the killer apps and use cases that develop. Apple could be disadvantaged, given that its AI engine for audio content is the laughably inept Siri. Here, Google has the edge with the far superior Google Assistant, and it’s already making moves towards Audio AR with its Pixel Buds.
Its use cases so far include live foreign-language translation. Think of the in-ear translation devices used by U.N. delegates—but for the rest of us (and software/AI-based). Other use cases will develop around commuting, dating apps like Tinder, or the corporate use case outlined above, which a company like LinkedIn could build.
Software development will be a key step: third-party developers could really run with Audio AR by building apps on SDKs released by Apple or Google. This follows the playbook already established by iOS and Android. In fact, audio AR could branch from their respective AR dev kits: ARKit and ARCore.
Back to local commerce, the alignment is pretty clear. Walking directions, à la Google’s VPS, are a logical starting point. That then extends to all kinds of directional advertising and location-based promotions that align with users’ opt-in preferences. Whisper an alert whenever I’m near an In-N-Out Burger.
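Under the hood, that kind of opt-in proximity alert is essentially a geofence check. Here’s a minimal sketch of how such a trigger might work, assuming simple lat/lon coordinates and a user-curated list of opted-in places (the function names and data shapes are hypothetical, not from any real hearables SDK):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearby_whispers(user_pos, opted_in_places, radius_m=200):
    """Return whisper-alert strings for opted-in places within radius_m."""
    lat, lon = user_pos
    return [
        f"You're near {place['name']}"
        for place in opted_in_places
        if distance_m(lat, lon, place["lat"], place["lon"]) <= radius_m
    ]
```

A real implementation would lean on platform location services and spatial indexing rather than a linear scan, but the core logic—distance threshold against an opt-in list, then a text-to-speech prompt—would look much like this.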
There are lots of other directions it could go (excuse the pun). But the foundational principle is that we’ll all become empowered through audio whispers, thus augmenting our realities. Consumers could have in-ear intelligence pushed to them in a range of functional areas that develop over time. Many will be local.
Mike Boland is Street Fight’s lead analyst, author of the Road Map column, and producer of the Heard on the Street podcast. He has been an analyst in the local space since 2005, covering mobile, social, and emerging tech. More biographical information can be seen here.