Probabilistic Device Matching Isn’t Perfect — But It Works

The use of probabilistic matching, or statistical IDs, to link multiple devices to an individual is often dismissed on the grounds of dubious accuracy. While device matching accuracy is a natural question to ask, the expectation of a perfect solution often paralyzes advertisers and prevents them from using what could be an effective cross-device advertising strategy.

The truth is that, yes, probabilistic matching will never be 100% accurate. Companies offering the technology tend to claim anywhere between 50% and 90% success rates in linking users across devices. But even taking the conservative estimate at the lower end of this range, probabilistic matching enables marketers to scale their mobile advertising campaigns with reasonable expectations of performance. For the doubting Thomases, here are some points to consider:

1. On Mobile, Audience Still Matters
Mobile advertising has one native data point that captures most of the attention: location. The truth is that fewer than 50% of mobile RTB impression opportunities come with location data attached, and of that portion, only half is accurate enough to support any kind of targeting intelligence. So while location is an excellent signal for advertising context, it is a challenge for any solution to get both scale and accuracy.
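As a rough check on those two figures, the arithmetic can be sketched in a few lines. The function name and the daily impression volume are hypothetical, chosen only for illustration; the two 50% shares are the upper-bound figures quoted above.

```python
def usable_location_share(share_with_location: float, share_accurate: float) -> float:
    """Fraction of all impressions that carry location data accurate enough to target on."""
    return share_with_location * share_accurate


# Figures from the text: fewer than 50% of impressions carry location data,
# and only half of those are accurate. Both are treated here as upper bounds.
share = usable_location_share(0.5, 0.5)

# Hypothetical daily volume, for illustration only.
daily_impressions = 1_000_000
print(f"Usable share: at most {share:.0%}")
print(f"Usable impressions: at most {int(daily_impressions * share):,}")
```

In other words, under these assumptions no more than a quarter of the available inventory can be targeted on location alone.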

Some vendors make big promises. But how many people really show up in a particular store on a given day and use ad-supported mobile apps that transmit accurate GPS coordinates? Even for the largest retail locations, we are talking thousands, and that alone doesn’t support meaningful advertising budgets. One easy alternative for mobile targeting is to chase clicks. But our research into mobile click behavior shows that the apps with the highest CTR tend to be games and flashlights, where the click is typically accidental.

We shouldn’t forget what we have learned as an industry from over a decade of cookie-based ad targeting. Audience targeting is an effective way to scale campaigns while expecting economically viable performance. Since mobile devices don’t come with cookies, we can turn to probabilistic matching to link observed behavior on a cookie-enabled device to impression opportunities on a mobile device. With this form of targeting, a device matching accuracy as low as 50% (taking the conservative estimate) means we’d be diluting performance by roughly half. A good look-alike model can easily perform 5-10x better than random, and retargeting generally does 20x better than random. Even after halving those figures, the worst-case scenario for device matching accuracy still leaves an audience that performs roughly 2.5-5x (look-alike) or 10x (retargeting) better than random. That is a solution that works.
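The dilution argument can be made concrete with a minimal sketch. The function `blended_lift` and its parameters are assumptions introduced here for illustration, not a method described in the article. The "halving" reading treats mismatched devices as contributing nothing (`mismatch_lift=0.0`); a slightly less pessimistic assumption is that mismatched devices simply perform like a random audience (`mismatch_lift=1.0`).

```python
def blended_lift(true_lift: float, match_accuracy: float, mismatch_lift: float = 1.0) -> float:
    """Expected lift over random when only a fraction of device matches are correct.

    true_lift:      lift of the correctly matched audience (random = 1.0)
    match_accuracy: fraction of device matches that are correct, in [0, 1]
    mismatch_lift:  assumed lift of incorrectly matched devices
    """
    if not 0.0 <= match_accuracy <= 1.0:
        raise ValueError("match_accuracy must be between 0 and 1")
    return match_accuracy * true_lift + (1.0 - match_accuracy) * mismatch_lift


# Lift figures quoted in the text; 50% is the conservative accuracy estimate.
scenarios = [("look-alike (low)", 5.0), ("look-alike (high)", 10.0), ("retargeting", 20.0)]

for label, lift in scenarios:
    halved = blended_lift(lift, 0.5, mismatch_lift=0.0)   # mismatches contribute nothing
    blended = blended_lift(lift, 0.5, mismatch_lift=1.0)  # mismatches behave like random
    print(f"{label}: {halved:.1f}x (halved), {blended:.1f}x (mismatches at random)")
```

Under either assumption the matched audience stays well above the 1x random baseline, which is the point of the argument.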

2. Be Skeptical. Figures Never Lie But Liars Always Figure
The alternative to the probabilistic match is matching devices through user authentication. When available, this is clearly a better solution. However, let’s be realistic. Who besides Facebook and Google can deliver an authenticated audience that reaches the majority of the adult internet population? Even big players like Twitter or LinkedIn would be hard-pressed to reach 20% of the U.S. population across multiple devices on a given day.

Some vendors will promise high device matching accuracy based on having an authenticated user base, but if those vendors are not Facebook or Google, then we should question those claims. While any given firm might own or buy access to authenticated user data, it is very likely that this comprises a tiny portion of the targeted user base. The real scale is found through probabilistic matching. So again, for deploying meaningful cross-device budgets, probabilistic matching is the best strategy.

3. Think Application Before Accuracy
Strict accuracy (i.e., 80% or above) matters more in some cases than others. As reasoned above, accuracy as low as 50% can still deliver very effective targeting. Beyond that, it is reasonable to think of cross-device retargeting with probabilistic matching as another form of prospecting. If you prospect on desktop, why not do it across mobile too?

On the other hand, for applications such as user-level attribution or frequency capping, higher accuracy is necessary to obtain meaningful results. For these uses — which certain vendors are touting — the efficacy of probabilistic matching is probably not there yet.

Billions of mobile impression opportunities are available daily beyond the reach of the internet titans. As long as the long tail of apps and mobile web remains ad-supported, this will always be the case.

When it comes to optimized targeting at scale, audience is still king. Cross-device audience targeting and optimization is broadly available only via some sort of probabilistic device matching. This is an innovative practice that was born out of the very structure of digital media and consumer behavior. There is no free lunch here, but there is a viable solution that can move the mobile advertising industry beyond the constraints of accidental clicks and the limited scope of hyperlocal.

Brian d’Alessandro is VP of data science at Dstillery. He has led the development of Dstillery’s patent-pending machine learning technology as the company has gone from pre-revenue to supporting hundreds of concurrent client campaigns. His current research interests include building autonomous machine learning systems over big data architectures, causal inference and influence attribution.