More analytics in the cloud
Having posted before about Zementis, a company that lets you deploy analytic models into the Amazon cloud, I now see that Mathematica is getting in on this whole cloud thing. Personally, I think that analytics and decisioning are ideal for operating in the cloud. Analytics take a lot of computing power while models are being developed, which makes the flexibility of cloud computing valuable. Decision management in the cloud means that any process, anywhere, can connect to the cloud and get the answers it needs to operate effectively.
It is great to see more mathematical applications migrate to the cloud. This is one of the best opportunities for cloud computing to reduce the cost and complexity of computational work in HPC, large-scale simulations, and predictive analytics.
As James mentioned, we at Zementis launched the ADAPA predictive analytics decision engine on Amazon EC2, which allows users to deploy, integrate, and execute statistical scoring models built with algorithms such as neural networks, support vector machines (SVMs), decision trees, and various regression models.
As the model exchange format, we leverage the Predictive Model Markup Language (PMML) standard, which is supported by commercial vendors like SPSS, SAS, IBM, MicroStrategy, etc., as well as open source tools like R.
With open standards and cloud computing platforms available, we hope to see more solutions emerge!
How much analytical data can realistically be processed in this way? For example, will this work if I have 10 terabytes of data to store in an analytical database? 50 terabytes? 200 terabytes?
Well, of course, that’s a really good question. I think this works well for compute-heavy (rather than data-heavy) modeling and for executing models once they are built. Like you, I suspect that pushing lots of data into the cloud will remain a problem for a while.
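To put the data volumes from the question in perspective, here is a rough back-of-the-envelope sketch of how long it would take just to upload them. The 100 Mbit/s link speed is purely an assumption for illustration (not a figure from anyone in this thread), the function name `transfer_days` is made up, and the calculation ignores protocol overhead, compression, and shipping disks by mail.

```python
def transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `terabytes` of data over a link of the given speed.

    Uses decimal units (1 TB = 10**12 bytes) and assumes the link is fully
    saturated the whole time, which is optimistic.
    """
    bits = terabytes * 1e12 * 8                    # terabytes -> bits
    seconds = bits / (megabits_per_second * 1e6)   # bits / (bits per second)
    return seconds / 86_400                        # seconds -> days


# Assumed link speed for illustration only.
ASSUMED_LINK_MBITS = 100

for tb in (10, 50, 200):
    days = transfer_days(tb, ASSUMED_LINK_MBITS)
    print(f"{tb} TB at {ASSUMED_LINK_MBITS} Mbit/s: about {days:.0f} days")
```

Under that assumption, 10 terabytes takes a bit over nine days of continuous transfer and 200 terabytes about half a year, which is why the compute-heavy cases look like the better fit for now.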
I agree. According to IDC, the term “Analytics” today can be divided into two sub-domains, viz. QR (Query and Reporting) and Advanced Analytics. Terabyte-scale analytics is really only possible for QR analytics. For advanced analytics (statistics, machine learning, etc.), today’s technology (SAS, SPSS, R, Matlab) hardly scales to a few thousand (hundreds at max). So really, advanced analytics is still being run on the order of less than 1 GB.