Predixion Software launched its new product, Predixion Insight, this week and I got a pre-launch briefing. Founded in 2009 and based in Southern California and Redmond (the development team is ex-Microsoft and the chairman is ex-DATAllegro), the company focuses on self-service predictive analytics, delivered through the cloud and accessible via Excel. Information workers are the target audience, so collaboration and sharing are designed in. The product is accessed from Excel or PowerPivot over the cloud and has recently been in beta. They have a strong relationship, they say, with Microsoft (the CTO, Jamie MacLennan, led the Microsoft Analysis Services/Data Mining team before joining Predixion, for instance) and position themselves as complementary to Microsoft’s data mining solutions (reviewed in my overall Microsoft analytic post). They have a number of partners, including some well-established Microsoft partners like Neudesic, and plan to provide a model store that, over time, allows these companies to sell models as industry-standard components. They are also working with Composite Software and Lyza (Lyza is reviewed here).
Predixion Insight supports on-premise and private/public cloud deployments and integrates with Excel 2007 and 2010. The product allows model development and management and some visualization of models for collaboration. It integrates natively with PowerPivot (something they say is unique, as there is no published API and they rely on their prior development experience to do it), supports some text data mining and uses SharePoint for publishing interactive predictive reports. The product is highly scalable: some of their beta customers tried very large scenarios, such as a market basket analysis over 10M records.
Data is uploaded to Predixion Insight from PowerPivot (which means you can pull data from any data source supported by PowerPivot) and from Composite Software (allowing all sorts of other data sources to be brought in). Once the models are built they are returned to PowerPivot and can then be shared through SharePoint etc. Predixion Insight also has an API so, in fact, you could upload data from Composite or other data sources into the cloud and then create models before accessing the results using another API.
They work hard to reduce complexity and the product shows up as a pair of ribbons in Excel. One, aimed at basic users, allows you to analyze key influencers, detect categories etc in a very simple way. The other walks through the standard predictive model creation process.
Obviously the first advantage is that of being delivered as SaaS, with no need to procure and configure machines etc. The interface is also very wizard-based, walking users through the process (holding out data for testing, selecting columns in Excel etc). It takes advantage of the cloud by running model building asynchronously, of course, so long-running tasks can be left running while users do something else. Like anyone automating model creation, they are working on ways to avoid over-fitting and to easily apply multiple techniques, though some of this is not available in version 1.
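To make the "holding out data for testing" step concrete, here is a minimal Python sketch of a holdout split. It is not Predixion's implementation: the function name `holdout_split`, the 30% test fraction and the fixed seed are all assumptions made for illustration.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists.

    Hypothetical helper: the name, default fraction and seed are
    illustrative assumptions, not anything taken from the product.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up data: 100 record ids standing in for rows of a table.
records = list(range(100))
train, test = holdout_split(records, test_fraction=0.3)
print(len(train), len(test))
```

The model is built on `train` only and then checked against `test`. A model that does much better on the training rows than on the held-out rows is a sign of over-fitting.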
Visual Macros can be created as part of the process. A Visual Macro is a readable report which, when selected, can be re-run easily in Excel like a macro. Visual Macros can take parameters for flexibility, and these parameters can be linked into Excel as usual, allowing you to drive them from calculations or data in Excel if you want.
Once you have built the model you can bring in new data and score it as you would expect, using Excel and PowerPivot features as you go. Today the API is also batch-oriented but they have plans for a more interactive scoring engine. Once scored, of course, the data is in Excel and so reporting and slicing/dicing using the score in conjunction with the rest of the data is easy. Reports like “key influencers” can be generated to see how the model did what it did. Additional tools can do things like derive useful bins for continuous variables and apply them in PowerPivot for further analysis etc. All of the end results are accessible through SharePoint and allow the use of PowerPivot against this data.
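As an illustration of what deriving bins for a continuous variable can look like, here is a small Python sketch using equal-frequency (quantile) binning. The briefing did not say which binning method the product uses, so the technique, the function names `derive_bin_edges` and `assign_bin`, and the sample data are all assumptions.

```python
import bisect
import statistics


def derive_bin_edges(values, n_bins=4):
    """Return the n_bins - 1 cut points that split values into
    roughly equal-sized groups (hypothetical helper)."""
    return statistics.quantiles(values, n=n_bins)


def assign_bin(value, edges):
    """Return the 0-based index of the bin that value falls into."""
    return bisect.bisect_right(edges, value)


# Made-up continuous variable: ages 1 through 100.
ages = list(range(1, 101))
edges = derive_bin_edges(ages, n_bins=4)

# Count how many values land in each bin.
counts = [0] * 4
for age in ages:
    counts[assign_bin(age, edges)] += 1
print(edges, counts)
```

The resulting bin index can then be used like any other categorical column when slicing and dicing scored data.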
A lot of what the product does is an extension or further refinement of the PowerPivot/Analysis Services capability in Microsoft’s core product. Indeed the ability of PowerPivot to use Excel against granular, transaction-level data (instead of manipulating summary data, which is what most Excel users do today) is critical. As well as working on an interactive API, they are developing support for bringing in PMML models.