Syndicated from BeyeNetwork
I have been thinking about lift curves this week (no, really, this is the kind of thing I think about) and I thought it was worth devoting a post to describing them and their value.
Before I actually talk about a lift curve I need to give you a little background. The purpose of a lift curve is to show you how good a predictive model is. In order to do that you need a baseline – something to compare the predictive model to. This first graph then is not, in fact, a lift curve but the baseline to which we will compare a lift curve.
The graph I am using measures the effectiveness of a model that predicts a true/false variable – let’s say whether a customer will not renew their subscription (it’s a churn or retention model). The baseline shows me how well a random approach would do. In other words, if I ordered my customers randomly and called them one at a time, the vertical axis shows what percentage of the people who will not renew I will have found for a given percentage of customers called (the horizontal axis). The arrow shows that, for instance, once I have called 40% of my customers I will have found 40% of those who plan to cancel. The baseline can be said to represent the “monkey score” – how well a monkey might do.
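If you like to see things in code, here is a minimal Python sketch of why random ordering produces that diagonal baseline. The customer data, the field names and the 20% cancellation rate are all made up purely for illustration:

```python
import random

# Made-up data: 10,000 customers, roughly 20% of whom will not renew.
random.seed(42)
customers = [{"id": i, "will_cancel": random.random() < 0.2} for i in range(10_000)]

# The "monkey score": call customers in a completely random order.
random.shuffle(customers)
total_cancellers = sum(c["will_cancel"] for c in customers)

for pct_called in (10, 20, 40, 60, 80, 100):
    cutoff = len(customers) * pct_called // 100
    found = sum(c["will_cancel"] for c in customers[:cutoff])
    print(f"Called {pct_called:3d}% of customers -> "
          f"found {100 * found / total_cancellers:.0f}% of non-renewers")
```

Calling x% of randomly ordered customers finds roughly x% of the non-renewers, which is exactly the diagonal in the graph.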

A lift curve takes the baseline and superimposes a curve that represents the performance of a model. In this case I have built a model to predict whether a particular customer will renew or not. Because this model does better than random, the curve sits above the baseline.
Each point on the curve can be read similarly. If I use this model to rank-order my customers from most risky to least risky, each point shows what percentage of the non-renewers I will have found for a given percentage of customers considered. In this case the model, for instance, detects about 73% of non-renewers by the time I have considered 40% of my customers. This is obviously much better than the random approach, which would only have found 40% of the non-renewers at the same point in the process.
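For readers who want to see the mechanics, here is a rough Python sketch of how the points on such a curve (often called a cumulative gains chart) can be computed. The inputs – the model’s predicted risk scores and the actual outcomes – are assumed, not tied to any particular tool:

```python
def gains_curve(scores, will_cancel):
    """Return (% of customers considered, % of non-renewers found) points.

    scores: model-predicted risk of not renewing, one per customer (assumed input)
    will_cancel: 1/0 actual outcomes, one per customer (assumed input)
    """
    # Rank customers from most risky to least risky by model score.
    ranked = sorted(zip(scores, will_cancel), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(flag for _, flag in ranked)

    points = []
    found = 0
    for i, (_, flag) in enumerate(ranked, start=1):
        found += flag
        points.append((100 * i / len(ranked),           # % of customers considered
                       100 * found / total_positives))  # % of non-renewers found
    return points
```

The better the model separates non-renewers from renewers, the faster the second number climbs, and the further the curve pulls away from the baseline.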
So what does this mean in business terms? Well, it means I can either boost my results without increasing my costs or reduce my costs without impacting my results.
If, for instance, I had the money or resources to call 40% of my customers using the random approach and I spent the same amount of money to call customers using my model, I would boost my results – instead of only reaching 40% of non-renewers I would now reach 73%. Same cost, better results. This is shown by the Boost arrow.
But maybe I think it is OK to reach 40% of my non-renewers. If this is the case I can use my model to reduce the percentage of customers I must call from 40% to about 15%. Same results, lower costs. This is shown by the Save arrow.
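To make those two readings concrete, here are two hypothetical helper functions that read the numbers off the points returned by the gains_curve() sketch above. They are just illustrations; for the curve described in this post they would come out around 73% and 15% respectively:

```python
def boost(points, budget_pct):
    """Same cost, better results: % of non-renewers reached when only
    budget_pct% of customers can be called."""
    return max(pct_found for pct_considered, pct_found in points
               if pct_considered <= budget_pct)

def save(points, target_pct):
    """Same results, lower cost: smallest % of customers that must be called
    to reach target_pct% of non-renewers."""
    return min(pct_considered for pct_considered, pct_found in points
               if pct_found >= target_pct)

# e.g. with the curve described in the post:
#   boost(points, 40) -> roughly 73
#   save(points, 40)  -> roughly 15
```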
Predictive models can be used to boost results or reduce costs and which is better is going to depend on business circumstances – not least because predictive models don’t DO anything, they just predict. Unless put to work in decision-making they are of no practical value.
Interestingly, my buddy Eric Siegel is giving a webinar with me on Optimizing Business Decisions – How Best to Apply Predictive Analytics next week (I am giving one on 5 core principles of Decision Management this week) – worth checking out if you want to learn more. Eric and I are also both speaking at Predictive Analytics World, a great event in DC in October. Hope to see you at one or all of these events.
JT – Great post and nice illustrations of why lift curves are a natural way of evaluating the performance of predictive models. I think it’s worth mentioning, for those coming into the analytics space from the engineering disciplines, that lift curves are closely related to receiver operating characteristic (or ROC) curves.