Some thoughts on Industrializing Analytics


Some time back I wrote a post called “It’s time to industrialize analytics”. In an ongoing Twitter conversation (I am @jamet123) I referenced it and provoked some interesting responses that seem worth addressing in more than 140 characters. The conversation was between @JAdP, @deanabb, @merv, @ajbowles and myself, and they are all worth following.

We spent some time discussing the analogy of furniture manufacturing as it applies to predictive analytic models. Too many models, I believe, are still produced using age-old (20+ years) hand-crafted approaches: scripts, hand tuning, manual variable creation and much more. To make predictive analytics pervasive we need to industrialize this process and focus more on utility and less on craft. Finely crafted models are all well and good, but the economic value each one delivers has to be weighed against the effort of crafting it. This is not to say that I believe all the art in models can or should be replaced with automation, only that this art and craft must be applied in a repeatable, manageable framework. Scripts become managed workflows, automation trawls through candidate variables, models are managed and monitored once deployed, and so on.

One challenge raised with this approach is that “cookbook” models can be misapplied. This is true, but I don’t think it is a function of an industrial analytic process. Quite the reverse, in fact: a more industrial mindset allows more models to be built more quickly. This allows, for instance, a model for each customer segment instead of one for the whole portfolio.
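To make the per-segment idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular tool: the data is made up, the segment names and columns are hypothetical, and a one-variable least-squares line stands in for whatever modeling step a real workflow would use. The point is only that once the fitting step is a repeatable function, applying it to every segment costs no more effort than applying it once.

```python
from collections import defaultdict
from statistics import mean


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x on (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        # No variation in x: fall back to predicting the mean.
        return y_bar, 0.0
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sxx
    return y_bar - slope * x_bar, slope


def fit_per_segment(rows):
    """Apply the same fitting step to each segment; rows are (segment, x, y)."""
    by_segment = defaultdict(list)
    for segment, x, y in rows:
        by_segment[segment].append((x, y))
    return {segment: fit_line(points) for segment, points in by_segment.items()}


# Made-up illustrative data: (segment, tenure_months, monthly_spend)
rows = [
    ("retail", 1, 12.0), ("retail", 2, 14.0), ("retail", 3, 16.0),
    ("business", 1, 50.0), ("business", 2, 45.0), ("business", 3, 40.0),
]

# One model per segment versus a single model for the whole portfolio.
segment_models = fit_per_segment(rows)
portfolio_model = fit_line([(x, y) for _, x, y in rows])

for segment, (intercept, slope) in segment_models.items():
    print(f"{segment}: intercept={intercept:.2f}, slope={slope:.2f}")
print(f"portfolio: intercept={portfolio_model[0]:.2f}, slope={portfolio_model[1]:.2f}")
```

In this toy data the two segments trend in opposite directions, which the single portfolio-wide line averages away. That is the kind of difference a per-segment approach is meant to preserve.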

There was some discussion that if one industrializes or engineers processes and practices, then at some point there will be accountability issues. Again, I think the reverse is true. For many current models the only record of how the model was produced is the script on the modeler’s PC. There’s no documentation, no repeatability, no shared or shareable assets. I believe that a modeled, managed, industrial process for analytics is fundamentally more accountable, not less.

It should also be noted that many of the tools that support a more industrial mindset can still bring the art to bear effectively, allowing you to decide, model by model, whether the last little improvement in accuracy is worth it.

Finally, there was some discussion of the tools available out there. I review a lot of these here on the blog, including tools aimed at non-modelers like SAS Enterprise Miner Rapid Predictive Modeler, IBM SPSS Modeler Advantage, KXEN and Predixion, as well as tools aimed at providing broad automation and effective “industrial” support to modelers such as KNIME, StatSoft and FICO Model Builder. Plus, of course, there’s a whole range of deployment options to “industrialize” the process of getting these models deployed.

All this and more will be covered in the forthcoming report on Decision Management Systems platform technologies.



  • Jos Verwoerd February 2, 2012, 6:24 am

    James, fully agree with your assessment. There are a lot of excellent business opportunities lost because of the limited availability of models and of the tools and skills to create those models. Think of the value that could be created from data that is lying around useless in almost every company – big and small. Simple, plain, ‘average’ modeling would be a great starting point. It could add value immediately through automated decisioning and predictions. At BigML we speak of ‘democratizing machine learning’ as a facilitator for industrialized analytics. Looking forward to what all these developments will bring us.