Stamatis Stefankos gave a presentation on his work with Sunrise Communications to forecast and reduce printing and mailing costs in a CRM context. Sunrise is the #2 telco in Switzerland, with both mobile and landline customers; its 1.7 million mobile customers are the majority. They have a typical data infrastructure, with a data warehouse that pulls in data from an operational data store, billing, network and various external sources. Sunrise is using predictive analytics for churn models, payment risk and so on. Their CRM activities are focused, as usual in telco, on retention, cross-sell and up-sell. Channels for CRM include mail, bill inserts (transpromotional), SMS, email and outbound calls. Bill inserts remain important as most subscribers in Switzerland still receive paper bills.
The idea behind the bill inserts is that some of the invoices will be sent with a marketing insert of some kind. But threshold-based billing means that not all customers get an invoice every month, and preparing the bill inserts takes time. So the company must decide who will get an insert before knowing how many bills will be sent. In one example, 861,000 customers were eligible for a bill insert and that many inserts were prepared, but only 613,000 invoices were sent, so 248,000 inserts were thrown away at a cost of $10,000.
One solution would be to use historical numbers from campaigns run in the past. If around 80% of customers got invoices, for instance, they could just print a little more than 80% to have enough. But in campaign selection you typically start off by segmenting your customer base using socio-demographics, call behavior, segments, language (a big deal in Switzerland with 5 languages) and something like churn risk. The share of customers who get an invoice can therefore vary hugely between campaigns: one might target high-usage subscribers and get close to 100%, while another gets only 70%. There is no historical data for each kind of campaign.
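To make the problem with a flat historical rate concrete, here is a small Python sketch of the arithmetic. The first call uses the figures from the example above; the two 1,000-customer campaigns and the helper name `print_outcome` are hypothetical, chosen only to illustrate the 100% and 70% cases.

```python
def print_outcome(printed, invoiced):
    """Return (wasted, short): inserts left over and invoices left without an insert."""
    wasted = max(printed - invoiced, 0)
    short = max(invoiced - printed, 0)
    return wasted, short

# Figures from the text: inserts prepared for every eligible customer.
print(print_outcome(861_000, 613_000))  # (248000, 0)

# Hypothetical campaigns of 1,000 eligible customers, printed at a flat 80%.
flat_print_run = round(1000 * 0.80)
print(print_outcome(flat_print_run, 1000))  # high-usage segment: (0, 200)
print(print_outcome(flat_print_run, 700))   # low-usage segment: (100, 0)
```

The same print run is 200 inserts short for one campaign and 100 over for the other, which is why a single historical percentage does not transfer between campaigns.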
Instead they used predictive analytics to generate a prediction for each customer as to whether an invoice will be sent each month. The prediction had to over-predict the number slightly so they would not run out, but the development cost had to be in keeping with the modest returns from saving on printing. The model also had to be able to run regularly without complex dependencies. The prediction uses historical revenue and invoice data to build a decision tree in SPSS PASW Modeler that is run to score every customer in the database. The model could not use completely up-to-date information either, as it could not be made dependent on a load of the most recent billing cycle. The score can be joined with the campaign selection to see how many of the customers in a particular campaign will get an invoice in the target month, and this number is then used to drive printing. In the example above they reduced the waste from 248,000 to 40,000.
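The final step, joining scores to a campaign selection to size the print run, might look roughly like the Python sketch below. This is an illustration under stated assumptions, not the SPSS stream Sunrise built: the function name `inserts_to_print`, the treatment of each score as a probability of receiving an invoice, the 5% safety margin, and the choice to count unscored customers as certain to be invoiced are all hypothetical.

```python
import math

def inserts_to_print(scores, campaign_ids, safety_margin=0.05):
    """Estimate the print run for one campaign.

    scores: dict mapping customer id -> assumed probability of an invoice
            in the target month (hypothetical output of the scoring model).
    campaign_ids: customer ids selected for the campaign.
    safety_margin: assumed fractional over-print so inserts do not run out.
    """
    # Customers without a score are counted as 1.0, erring toward over-printing.
    expected = sum(scores.get(cid, 1.0) for cid in campaign_ids)
    padded = math.ceil(expected * (1 + safety_margin))
    # Never print more inserts than there are selected customers.
    return min(len(campaign_ids), padded)

# Hypothetical scores and campaign selection.
scores = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 1.0}
print(inserts_to_print(scores, ["a", "b", "d"]))  # 3
print(inserts_to_print(scores, ["c"]))            # 1
```

Because the scores are computed once for the whole customer base, the same table can be reused for any campaign selection, which fits the requirement that the model run regularly without complex dependencies.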
This is a great example of the need to understand the time horizon of a prediction. The model is only useful if it can be used early enough to drive the printing for a future mailing. That meant it had to be built using the data that was available at that moment, even though more recent data would have been easier to use.