A little while ago I got an interesting question from a reader about champion/challenger testing – an element of adaptive control. Check out this brief on Adaptive Control or read the chapter in Smart (Enough) Systems for details on the approach. Anyway, here’s the question:
When testing a champion strategy with challengers, I assume sample size and number of challenger strategies are very important — to test validity and conclusively confirm the existing or decide for a new approach (i.e., to rule out that you just happened to come across customers that preferred one over the other but are actually not representative of your general clientele/customer base and to not only test outliers/extremes but a variety of different options — although I understand that you also do that by repeating the test.)
Do you have any guidance/examples on appropriate sample sizes and number of challenger strategies to use for adaptive control tests?
This is, of course, a question with a “depends” answer. Generally a percentage of all transactions is assigned to each challenger, typically less than 5% (perhaps only 1 or 2% if the alternative is radically different or high risk), with the remainder using the champion approach. As the question notes, these challenger populations need to be large enough that you can be reasonably sure a variation in your results is not just random.
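To make the allocation concrete, here is a minimal Python sketch of one way to route a small, fixed share of transactions to each challenger. It is an illustration only: the function name `assign_strategy`, the strategy names and the hash-based approach are my assumptions, not something prescribed by adaptive control itself.

```python
import hashlib
from typing import Mapping


def assign_strategy(transaction_id: str, challenger_shares: Mapping[str, float]) -> str:
    """Route a transaction to a challenger or to the champion.

    challenger_shares maps a (hypothetical) challenger name to the fraction
    of all transactions it should receive, e.g. 0.05 for 5%.
    """
    if sum(challenger_shares.values()) >= 1.0:
        raise ValueError("challenger shares must leave room for the champion")

    # Hash the ID so the same transaction always gets the same strategy,
    # then map the first 8 bytes of the digest to a number in [0, 1).
    digest = hashlib.sha256(transaction_id.encode("utf-8")).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64

    cumulative = 0.0
    for name, share in challenger_shares.items():
        cumulative += share
        if u < cumulative:
            return name
    return "champion"  # everything not claimed by a challenger


# Hypothetical setup: one mild challenger at 5%, one riskier one at 2%.
shares = {"challenger_a": 0.05, "challenger_b": 0.02}
print(assign_strategy("txn-000123", shares))
```

Hashing the ID rather than drawing a fresh random number means a repeat of the same transaction lands in the same group, which keeps the challenger populations stable while the test runs.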
I would treat a challenger segment of fewer than about 2,500 as potentially problematic. The exact figure depends on a lot of things: how different the challenger’s actions are from the champion’s, the number of distinct segments in the population and the number of actions available to the challenger, for instance. Many challengers are tested on a single segment of the population, so you already “know” that everyone in the challenger group is similar as far as the desired behavior is concerned, which increases the coherence of your answer and reduces the minimum size you need.
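If you want to sanity-check a figure like 2,500 against your own situation, one common textbook approach is the normal-approximation sample size formula for comparing two proportions. To be clear, this is a technique I am swapping in for illustration, not the basis of the rule of thumb above, and the response rates, significance level and power in the sketch are hypothetical values you would replace with your own.

```python
import math
from statistics import NormalDist


def challenger_sample_size(p_champion: float, p_challenger: float,
                           alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group to detect a difference between
    two response rates with a two-sided test (equal group sizes assumed)."""
    if p_champion == p_challenger:
        raise ValueError("the two rates must differ to define an effect size")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power

    # Sum of the binomial variances of the two groups.
    variance = p_champion * (1 - p_champion) + p_challenger * (1 - p_challenger)
    effect = p_champion - p_challenger

    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical example: champion responds at 10%, and we want to be able
# to detect a challenger that responds at 12%.
print(challenger_sample_size(0.10, 0.12))
```

The smaller the difference you need to detect, the larger the group has to be, which is one reason a radically different challenger can sometimes get by with a smaller segment. The formula assumes equal-sized groups; since the champion group is normally far larger than the challenger group, the result should be on the conservative side for the challenger.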
Running the numbers quickly, this means you need a transaction volume of around 50,000 as a minimum for the approach, since 5% of that is 2,500. As an aside, one document I found online (from Experian) said that 35,000 was a good minimum size when performing champion/challenger testing on million-person portfolios, but this seemed significantly larger than an absolute minimum.
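The same arithmetic can be rerun for other challenger percentages. This is just a small sketch using the figures from this post; the function name `minimum_transaction_volume` is my own.

```python
import math


def minimum_transaction_volume(min_challenger_size: int, challenger_percent: float) -> int:
    """Total transactions needed so that challenger_percent of them
    adds up to at least min_challenger_size."""
    if not 0 < challenger_percent <= 100:
        raise ValueError("challenger_percent must be in (0, 100]")
    return math.ceil(min_challenger_size * 100 / challenger_percent)


# The 2,500 minimum from above, at 5%, 2% and 1% challenger allocations.
for percent in (5, 2, 1):
    print(f"{percent}% challenger -> {minimum_transaction_volume(2500, percent):,} transactions")
# 5% challenger -> 50,000 transactions
# 2% challenger -> 125,000 transactions
# 1% challenger -> 250,000 transactions
```

Note how quickly the required volume grows as the challenger share shrinks: a high-risk challenger held to 1% needs five times the overall volume of one running at 5%.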