
The high ROI of data mining for innovative organizations #paw


I presented today at Predictive Analytics World (I will post slides later), and John Elder, one of my favorite data mining presenters, gave a great session on the ROI of data mining. John started by giving me a great plug and then pointed out that one of the reasons data mining has survived as an activity is that it is typically used to improve a bottom-line number. John covered the hype cycle of data mining and lessons learned, as well as the 3 major ways data mining helps organizations:

  • Streamline/automate processes
  • Eliminate the bad
  • Discover the good

John began by discussing the whole issue of artificial intelligence and made the great point that people and machines are complementary and that the issue is how best to use them together – how to share the load between people and machines. Data mining has been proven and is reaching what Gartner calls the plateau of productivity, which compares favorably with the failure of artificial intelligence to really deliver business value. He then went through the three major ways data mining can help, using 9 projects as examples. He made a great point – all 9 were technical successes but only about half were business successes.

Streamline or semi-automate decisions

  • HSBC cross-sell/up-sell
    A classic – what is the product that will interest a customer next (to better target marketing campaigns, especially to small groups where marketing can be costly). In particular, the aim was to use the teller window as a sales opportunity – turn a cost (teller) into a possible gain. They used analytic techniques to develop a heat matrix of product associations so people could see the possible products of interest. This was popular but died when groups were merged and the new executive sponsor was not interested. No business success.
  • Anheuser Busch image recognition
    Anheuser Busch spends a lot of time and money managing the layout of its products in stores. They build a schematic of the layout and it takes about 4 hours to build the picture manually. Using data mining techniques they were able to automate 90% of the recognition for a 10x improvement in time to build. Also identified stock outs and “competitor creep” automatically. Follow-on to proof of concept was due to be signed on 9/11 and was delayed until the executive sponsors left/retired. No business success.
  • Lumidigm biometrics
    The idea behind this startup was to shine light on your skin and identify disease. It turned out to be impossible to separate out personal characteristics, so they instead focused on identifying you from the reflection of your skin. The challenge was to use data mining to identify you at an accuracy rate that makes sense for the solution. Ended up being used at Disney World to track original purchasers of tickets to prevent re-sale. Business Success
  • Peregrine Systems business service modeling
    “Sim City for IT” – help build an environment where you could simulate the impact of changes in IT systems on service level agreements for help tickets etc. Analytics were used to carry the uncertainty through to the end. This one worked, and the company was purchased by HP, with this solution being part of the reason. Business Success
  • Social Security Administration disability decisions
    A third of people are accepted, but about 50% of people who appeal disability rejections get accepted on appeal – and it can take 2 years to work through the appeal. Data mining was used to fast-track the easy cases. The challenge was text analysis – 51 spellings of “learning disability” plus a whole bunch of other ways to say the same thing – a web of concepts. 20% of approvals could be automated immediately. And, as always, the model was sometimes more accurate than the humans, and sometimes the model found missing or mis-applied rules. They were about to work on the deployment, but political embarrassment at the success of the project caused the whole group to be abolished. No business success.
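
The "heat matrix of product associations" from the HSBC example above can be illustrated with a small sketch. This is not the technique HSBC actually used (the talk did not go into that detail); it simply assumes that an association is the share of customers holding one product who also hold another. The product names and customer holdings are made up.

```python
from collections import Counter
from itertools import permutations


def association_matrix(holdings):
    """Return {(a, b): share of customers holding a who also hold b}."""
    owners = Counter()  # number of customers holding each product
    pairs = Counter()   # number of customers holding both products of an ordered pair
    for products in holdings:
        unique = set(products)
        owners.update(unique)
        pairs.update(permutations(unique, 2))
    return {(a, b): n / owners[a] for (a, b), n in pairs.items()}


# Hypothetical product holdings, one set per customer.
customers = [
    {"checking", "savings"},
    {"checking", "credit card"},
    {"checking", "savings", "credit card"},
    {"savings"},
]

matrix = association_matrix(customers)
for (held, candidate), share in sorted(matrix.items()):
    print(f"holds {held:12} -> also holds {candidate:12}: {share:.2f}")
```

Rendered as a colored grid, the higher shares would be the "hot" cells a teller could use to pick the next product to mention.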

Eliminate the bad

  • IRS fraud detection
    Built a model that uses past fraud situations to score a return and helps focus the analysts on the returns most likely to be fraudulent. Able to find 25 frauds in every 100 returns flagged instead of the old 1 in 100. Business Success
  • Consumer electronics service fraud
    Tips indicated that there was fraud, and known cases of fraud were used to build a model that would predict other cases. This company made $20M in 9 months by detecting fraud. Business Success
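
The IRS numbers above amount to a lift calculation: the hit rate among model-selected returns divided by the old hit rate. A minimal sketch of that arithmetic, using only the two figures quoted in the talk (the function names are mine):

```python
def hit_rate(frauds_found, returns_reviewed):
    """Share of reviewed returns that turned out to be fraudulent."""
    return frauds_found / returns_reviewed


def lift(model_rate, baseline_rate):
    """How many times better the model's hit rate is than the baseline."""
    return model_rate / baseline_rate


baseline = hit_rate(1, 100)   # old process: 1 fraud per 100 returns reviewed
scored = hit_rate(25, 100)    # model-selected: 25 frauds per 100 returns reviewed
improvement = lift(scored, baseline)

print(f"baseline hit rate: {baseline:.0%}")
print(f"model hit rate:    {scored:.0%}")
print(f"lift:              {improvement:.0f}x")
```

In other words, each hour an analyst spends on model-selected returns finds roughly 25 times as many frauds as before.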

Highlight the good

  • WestWind Foundation hedge fund strategy
    Managing commodities by predicting whether to go long or sell (could not short). Very painful, although the model did well overall, as the short-term fluctuations made it hard to see the value of the model. Had to answer the question of whether the model's advantage was just random. Found, with a resampling simulation, that the model did better in 985 of 1,000 simulations. This led the customer to believe in the model and invest in it. Worked for years until the edge the model gave disappeared, which the model detected, and then it was stopped. Business Success
  • Pharmacia and Upjohn drug discovery
    Had to decide from 1,000 data points (double blind studies), each of which cost $10,000 to get (on top of the drug work), whether to keep going or not – to invest another $1Bn! They were not sure the results showed enough benefit. Did some advanced data mining/visualization to show the difference between a placebo and the drug in 3 dimensions (there were 3 ways the drug impacted the patients). The difference became clear, leading to the drug being continued. Became a commercial drug. Business Success
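
The resampling check in the hedge fund example above (is the model's edge just luck?) can be sketched as a shuffle test. This is an assumption about the general approach, not the simulation John's team ran: it shuffles the model's long/out signals against the same market returns many times and counts how often the real signals beat the shuffled ones. The market data and the 60%-accurate signal are synthetic.

```python
import random


def strategy_return(signals, market_returns):
    """Sum of returns on the days the signal says to be long (1); out otherwise (0)."""
    return sum(r for s, r in zip(signals, market_returns) if s == 1)


def fraction_beaten(signals, market_returns, n_sims=1000, seed=0):
    """Share of shuffled-signal simulations that the real signals outperform."""
    rng = random.Random(seed)
    actual = strategy_return(signals, market_returns)
    shuffled = list(signals)
    beaten = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)  # same number of long days, random timing
        if actual > strategy_return(shuffled, market_returns):
            beaten += 1
    return beaten / n_sims


# Synthetic example: 250 trading days, signal right about 60% of the time.
rng = random.Random(42)
market = [rng.gauss(0.0, 0.01) for _ in range(250)]
signals = [1 if (r > 0) == (rng.random() < 0.6) else 0 for r in market]

share = fraction_beaten(signals, market)
print(f"real signals beat {share:.1%} of shuffled simulations")
```

A result like the 985 out of 1,000 reported in the talk would mean that random timing with the same number of long days matched or beat the model only about 1.5% of the time.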

Lessons learned – ingredients for success:

  • Gain expected – either an incremental improvement matters or there is low hanging fruit
  • Interdisciplinary team
  • Data vigilance
  • Time to learn over many cycles
  • Business champion, and a persistent one
