Syndicated from BeyeNetwork
Curt Monash has been Thinking About Analytic Speed over on The Intelligent Enterprise Blog and makes some good points about the different kinds of analytic speed. One source of confusion in these discussions that Curt does not touch on is the difference between the time it takes to build an analytic model and the time it takes to execute it.
When you are using analytics in a real-time, operational system (and you should be) there is a big difference between building a new model in real time and executing an existing model (scoring the transaction) in real time. Many systems require that you calculate the value of a predictive model for the current transaction in real time so you can use it – how likely is this transaction to be fraudulent, what’s the retention risk of this inbound customer – but many of these models can be built offline.
You can harness offline processing power to build models, crunching lots of data and trying many different algorithms before deploying the result of all this work as a simple-to-execute element of a decision – it is often just a few rules, an additive scorecard or a formula. Just because you need the result in real time does not mean you need to figure out the math in real time. That is something else to bear in mind when worrying about analytic speed.
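To make the build/execute split concrete, here is a minimal sketch of what executing an additive scorecard can look like. Everything in it is an assumption for illustration: the feature names, bins, point values and cutoff are hypothetical stand-ins for whatever an offline modeling run would actually produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    upper: float  # exclusive upper bound of the bin
    points: int   # points added when the value falls in this bin


# Hypothetical scorecard, standing in for the output of offline model building.
BASE_POINTS = 600
SCORECARD = {
    "amount": [Bin(100.0, 0), Bin(1000.0, -20), Bin(float("inf"), -60)],
    "account_age_days": [Bin(30.0, -50), Bin(365.0, -10), Bin(float("inf"), 0)],
}


def score(transaction: dict) -> int:
    """Score one transaction: a table lookup and an addition per feature."""
    total = BASE_POINTS
    for feature, bins in SCORECARD.items():
        value = transaction[feature]
        for b in bins:
            if value < b.upper:
                total += b.points
                break
    return total


def is_suspicious(transaction: dict, cutoff: int = 550) -> bool:
    """Turn the score into a decision using a hypothetical cutoff."""
    return score(transaction) < cutoff
```

All the expensive work (choosing the features, the bins and the points) happens before this code ever runs; the real-time path is just a handful of comparisons and additions per transaction.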
If this is a topic that interests you, why not come to Predictive Analytics World next month and hear me and a bunch of other interesting people tell you all about it?
James, excellent summary.
I would like to add one more dimension of analytic speed – the time to deploy an analytic model for use in any operational system or Enterprise Decision Management context. Let’s call it time-to-market for the predictive model.
We often see a significant gap between building a model and executing it, for example in real time. Translating a predictive model from the scientists’ data mining environment into an executable decision element frequently takes weeks, if not months, and relies on custom coding.
The Predictive Model Markup Language (PMML) standard closes this gap and allows users to share models between vendors and environments, literally cutting the time-to-market (deployment) to a few minutes.
For example, Zementis leverages the PMML standard to deploy models from virtually all major data mining vendor tools on the Amazon Elastic Compute Cloud (EC2).
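As a rough illustration of why a shared model format shortens deployment, the sketch below reads a small PMML regression model with Python's standard library and scores one record. The model itself (field names, coefficients, the 4.4 namespace) is a hypothetical example, and the reader handles only a single linear RegressionTable with numeric predictors; it is not how any particular vendor's scoring engine works, and a real PMML consumer supports far more of the standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML document, as a modeling tool might export it.
PMML_DOC = """<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="hypothetical retention-risk model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="tenure_months" optype="continuous" dataType="double"/>
    <DataField name="support_calls" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="tenure_months"/>
      <MiningField name="support_calls"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="tenure_months" coefficient="-0.01"/>
      <NumericPredictor name="support_calls" coefficient="0.05"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_4"}


def load_linear_model(pmml_text: str):
    """Extract the intercept and coefficients from a simple regression model."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionTable found; this sketch supports nothing else")
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record: dict) -> float:
    """Apply the formula: intercept plus the sum of coefficient * value."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)
risk = score(model, {"tenure_months": 20, "support_calls": 2})
```

The point of the sketch is that the scoring side needs no knowledge of the tool that built the model: swapping in a new model means swapping the document, not rewriting code.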
Towards lower Total Cost of Ownership (TCO) for Predictive Analytics.
As James points out, predictive analytics is a great way to reduce cost and drive efficiency, and a predictive analytics project no longer has to be a high-risk or costly undertaking! Once you have developed your decision models, it is time to reduce the cost of deployment, integration and operational execution of predictive analytics.
Open standards, like the Predictive Model Markup Language (PMML), and Cloud Computing deployment solutions offer a cost-effective, on-demand entry into real-time, operational predictive analytics.
For an overview of what the leading data mining vendors have to say about the topic, please see the KDD 2009 Panel Report: Open Standards and Cloud Computing.