Syndicated from Smart Data Collective
Sugato Basu from Google presented on sponsored search (AdWords) and how to predict bounce rate, and thus user satisfaction, for a new ad. AdWords ads are displayed when a search is made, and tracking results means tracking who clicks on the ads and whether they then convert, explore the new site or just bounce.
Users want ads to be relevant to their queries or to the webpage content they are viewing. Search engines, meanwhile, want to show ads that users like and will click on. There is also a risk of over-advertising: users with no commercial intent, for instance, don’t want to see ads.
Bounce rate is another critical measure. If it is high then users are not satisfied with what they found – “they said yuk and went away”. The lower the bounce rate, the better the ad and landing page. Evaluating it is tricky. Advertisers can estimate bounce rate by checking whether visitors do anything on the page, though a user who calls a phone number instead would show up as a false positive. Search engine companies can track subsequent behavior to see whether it came quickly enough to imply a bounce. This can also be misleading, as a user could start new queries in another tab while keeping open a landing page they liked.
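The search-engine side of that measurement can be sketched in a few lines. This is only an illustration: the `Click` record, the `bounce_rates` function and the 10-second cutoff are assumptions made for the example, not anything stated in the talk, which did not say how quick is “quick enough”.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Hypothetical cutoff: the talk gave no number for "quick enough".
BOUNCE_THRESHOLD_SECONDS = 10.0


@dataclass
class Click:
    ad_id: str
    # Seconds between the ad click and the user's next action on the
    # search engine; None if no further action was observed.
    seconds_until_next_action: Optional[float]


def bounce_rates(clicks: Iterable[Click],
                 threshold: float = BOUNCE_THRESHOLD_SECONDS) -> Dict[str, float]:
    """Return, per ad, the fraction of clicks that look like bounces."""
    totals: Dict[str, int] = defaultdict(int)
    bounces: Dict[str, int] = defaultdict(int)
    for click in clicks:
        totals[click.ad_id] += 1
        gap = click.seconds_until_next_action
        # A quick follow-up action is counted as a bounce; no follow-up
        # at all is not, since the user may simply have stayed on the site.
        if gap is not None and gap < threshold:
            bounces[click.ad_id] += 1
    return {ad_id: bounces[ad_id] / total for ad_id, total in totals.items()}


if __name__ == "__main__":
    sample = [
        Click("ad-1", 3.0),
        Click("ad-1", 240.0),
        Click("ad-1", None),
        Click("ad-2", 2.5),
    ]
    print(bounce_rates(sample))
```

Note that the new-tab case described above would still be miscounted here: a quick follow-up query looks like a bounce even if the landing page stays open.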
There is a strong correlation between click-through rate and bounce rate, which is interesting because the landing page is separate content from the ad. Sites that human evaluators rate as “excellent” have about half the bounce rate. Curiously enough, bounce rates vary a lot by language, though no particular conclusion can be drawn from this. Some kinds of keywords have very dependable bounce rates; navigational queries (to find the site for the New York Times, say) are one example.
Accurate prediction of bounce rate would allow ads to be assessed more quickly, with fewer clicks. This is especially important for ads with low impressions – “long tail” ads. To work on this, the folks at Google tried both a logistic regression and a Support Vector Machine regression on two data sets. These data sets have 3.5M training/1.5M test and 4.8M training/2M test respectively. Every ad in both sets had 10 or more clicks. From each ad they extracted the ad keywords, ad creative and ad landing page. This gave them millions of parsed terms, millions of related terms, clusters of terms and categories/verticals, as well as similarity measures between the elements of the ads.
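To make the logistic regression idea concrete, here is a toy sketch in plain Python. It is not Google's model: the single feature (imagine a keyword-to-landing-page similarity score), the numbers, and the `fit` and `predict` helpers are all made up for illustration. Each training example is an ad with its feature values, its click count and how many of those clicks bounced, and the model is fitted by ordinary gradient descent on the binomial log-likelihood.

```python
import math
from typing import List, Sequence, Tuple

# (features, clicks, bounces) for one ad. All values are invented.
Example = Tuple[Sequence[float], int, int]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights: Sequence[float], bias: float,
            features: Sequence[float]) -> float:
    """Predicted bounce rate (a probability) for an ad's features."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit(examples: Sequence[Example], epochs: int = 2000,
        learning_rate: float = 0.5) -> Tuple[List[float], float]:
    """Fit weights by batch gradient descent, weighting each ad by clicks."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    total_clicks = sum(clicks for _, clicks, _ in examples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, clicks, bounces in examples:
            # Gradient of the negative log-likelihood for this ad:
            # expected bounces minus observed bounces.
            error = clicks * predict(weights, bias, features) - bounces
            grad_b += error
            for j, x in enumerate(features):
                grad_w[j] += error * x
        bias -= learning_rate * grad_b / total_clicks
        weights = [w - learning_rate * g / total_clicks
                   for w, g in zip(weights, grad_w)]
    return weights, bias


if __name__ == "__main__":
    # Hypothetical ads: higher similarity score, fewer bounces.
    toy_ads: List[Example] = [
        ([0.9], 100, 20),
        ([0.8], 50, 12),
        ([0.2], 80, 60),
        ([0.1], 40, 33),
    ]
    weights, bias = fit(toy_ads)
    # Score a new ad that has no click history yet.
    print(predict(weights, bias, [0.5]))
```

The point of the exercise is the last line: once the model is fitted, a brand-new ad can be given a bounce-rate estimate from its features alone, before it has collected any clicks.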
They managed to predict bounce rates fairly well, at least for ads with lower bounce rates (of which there are more). The two techniques had very similar predictive power, a sign that both are picking up the same underlying trends in the data. The team is now focusing on how to help advertisers reduce bounce rate and on how the search engine can increase user satisfaction.