April 21, 2017

The (Data) Secret behind Digital Advertising

How do webpages regulate the advertisements that you see online? At last Wednesday’s CDS research lunch seminar, Dr. Yana Volkovich, a senior data scientist at AppNexus, discussed how digital advertising relies on data-driven networks.

In the past, digital advertising operated on a direct buyer-seller model. A company that wanted to place an ad on a webpage would have to call the webpage’s owner, negotiate a price, e-mail its image files, and then wait for the owner to upload the ad. But, as Volkovich explained, today the game has changed. Although the direct buyer-seller model still exists, much of digital advertising now operates on a data-driven ad network and ad exchange model.

When advertisers and websites join an ad network, all available ads are collected in one pool, and all the websites that would like to display ads are collected in another. Then, within the few milliseconds that a user is waiting for a webpage to load, an algorithm chooses which ad is distributed to a given website. The choice is based on one of two predicted metrics: the number of user impressions (how many people will see the ad) or the number of user clicks that the ad will generate. The ad with the highest predicted value wins the slot.
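The selection step described above can be pictured with a small sketch. This is only an illustration, not the ad network's actual algorithm: the `AdPrediction` class, the `choose_ad` function, and the numbers in `pool` are hypothetical, and the predictions are assumed to have been computed already by some upstream model.

```python
from dataclasses import dataclass


@dataclass
class AdPrediction:
    """Hypothetical record holding a model's predictions for one ad on one website."""
    ad_id: str
    predicted_impressions: float
    predicted_clicks: float


def choose_ad(candidates, metric="clicks"):
    """Return the candidate ad with the highest predicted value for the chosen metric."""
    if not candidates:
        raise ValueError("the ad pool is empty")
    if metric == "impressions":
        return max(candidates, key=lambda ad: ad.predicted_impressions)
    if metric == "clicks":
        return max(candidates, key=lambda ad: ad.predicted_clicks)
    raise ValueError(f"unknown metric: {metric!r}")


# Made-up predictions for three ads competing for the same slot.
pool = [
    AdPrediction("shoes", predicted_impressions=1200.0, predicted_clicks=30.0),
    AdPrediction("laptops", predicted_impressions=900.0, predicted_clicks=45.0),
    AdPrediction("travel", predicted_impressions=1500.0, predicted_clicks=20.0),
]

winner_by_clicks = choose_ad(pool, metric="clicks")
winner_by_impressions = choose_ad(pool, metric="impressions")
```

With these invented numbers, the winner depends on the metric: the laptop ad has the most predicted clicks, while the travel ad has the most predicted impressions.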

To make these predictions, multiple calculations must be performed. On one hand, the algorithm must determine what kind of ad it is handling, and identify who the ad’s ideal audience is. On the other hand, the ad network must also analyze massive data sets about the user base of each website within its network. Factors like the average age, gender, location, and occupation of each website’s user base are crucial indicators of whether an ad will fly or flop.

AppNexus, where Volkovich works, goes one step further. After predicting user impressions and clicks, it can also calculate click purchase probability, which estimates how many people will buy the item that is advertised. While this is valuable information, the process requires a more detailed look at a website’s user base, like tracking each user’s cookies and recording the history of their clicks.
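One common way to think about a purchase estimate is as a funnel: some share of the people who see an ad click it, and some share of those who click go on to buy. The sketch below assumes that simple funnel. It is not a description of the company's actual method, and the `expected_purchases` function and all the rates are hypothetical.

```python
def expected_purchases(impressions, click_rate, purchase_rate_given_click):
    """Estimate purchases as impressions x P(click) x P(purchase | click).

    Assumes a simple two-step funnel; both rates must lie between 0 and 1.
    """
    for name, rate in (("click_rate", click_rate),
                       ("purchase_rate_given_click", purchase_rate_given_click)):
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {rate}")
    if impressions < 0:
        raise ValueError("impressions cannot be negative")
    return impressions * click_rate * purchase_rate_given_click


# Made-up example: 10,000 impressions, 2% of viewers click, 5% of clickers buy.
estimate = expected_purchases(10_000, click_rate=0.02, purchase_rate_given_click=0.05)
```

Under these invented rates the estimate comes to roughly ten purchases, which shows why the second probability matters: it depends on what each user does after the click, and that is the kind of detail that cookie and click-history tracking is used to supply.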

Whether the strategies of ad networks raise concerns about user privacy remains part of an ongoing national debate. But, as you wait for your next webpage to load, it’s hard not to marvel at the stunning computational work that swings ads on and off websites in a matter of milliseconds. Is this what progress feels like?


by Cherrie Kwok