I recently met the folks at Dulles Research, a scientific computing outfit focused on analytic solutions for enterprise software. The founders of Dulles previously worked at Marketswitch (which focused on large-scale optimization and was sold to Experian in 2004) and have a long history of working with companies that are serious about using analytics to solve complex business problems.
One thing that is true about analytics and decision management is that SAS, especially “Base SAS”, is very widespread. I don’t know the numbers but I suspect penetration could be as high as 90%. While some of these companies are using Enterprise Miner and Model Manager, many analytic groups stick with Base SAS. This is a very powerful product but one that has a limited range of deployment options. Enterprise Miner offers PMML export, in-database or in-warehouse deployment, and code generation. Base SAS scripts, by contrast, can only be executed using Base SAS, although Base SAS does run on most platforms.
This means that companies using Base SAS to develop their models must either buy additional licenses to run those models in production or recode them by hand. My (qualitative) observation would be that most use additional Base SAS licenses in production for batch scoring and recode the models into Java or COBOL for real-time or interactive scoring. Recoding in this way is, of course, problematic: every delay in implementing a model means the model is older, and so less accurate, by the time it is deployed. Nevertheless this kind of recoding is epidemic, with large (probably very large) numbers of Base SAS models being recoded.
Dulles Research has a way to resolve this issue. Their product takes Base SAS scripts (models) and converts them to Java. This allows modelers to use their favorite environment and take advantage of its flexibility and power while still producing code that an IT department can run in a standard environment, and to do so quickly, without the costly delay of manual recoding. Once the Java is generated it can be used to operationalize the models in a rules engine, to execute the Base SAS models in-database or in-warehouse (using the ability many products have to run Java in the database or warehouse), and to deploy models where no Base SAS licenses exist. This means the models can take advantage of all the work the IT department has done and of the parallelism built into modern data warehouses like Teradata and Netezza.
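To make the idea of converting a scoring script concrete, here is a small hand-written sketch. It is not output from the Dulles Research product: the SAS step in the comment, the variable names, the coefficients and the choice to represent a SAS missing value as NaN are all assumptions made for illustration.

```java
// Hand-written illustration only; NOT generated by the Dulles Research product.
// The hypothetical Base SAS step being mirrored:
//
//   data scored;
//     set customers;
//     if income = . then income = 0;
//     logit = -2.0 + 0.00001 * income + 0.8 * delinq;
//     p_default = 1 / (1 + exp(-logit));
//   run;
//
final class DataStepSketch {

    // Assumption: a SAS numeric missing value (.) arrives in Java as NaN.
    static double pDefault(double income, double delinq) {
        // Mirrors "if income = . then income = 0;"
        if (Double.isNaN(income)) {
            income = 0.0;
        }
        // Invented coefficients, same arithmetic as the SAS step above.
        double logit = -2.0 + 0.00001 * income + 0.8 * delinq;
        // Logistic transform turns the linear score into a probability.
        return 1.0 / (1.0 + Math.exp(-logit));
    }
}
```

A caller would invoke `DataStepSketch.pDefault(income, delinq)` once per record, which is the kind of plain Java method a rules engine or an in-database Java runtime could call.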
I know one of their customers (a major financial services company) well. This company is very focused on decision management and uses Base SAS and Dulles Research’s product in combination with a business rules management system. SAS/STAT is used to build models (Base SAS programs) that are run against tens of millions of customers to score them, with the results stored in a data warehouse. This company now generates Java code for these models for use in real-time scoring, saving multiple FTEs over a several-month recoding effort each time. The Java version of the model is called inline by the Java rules engine that processes the rules for making decisions, allowing the rules and models to be combined in a pure Java environment.
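The inline call described above might look something like the following sketch. Everything in it is hypothetical: the customer’s actual rules engine, model and thresholds are not described here, so `ChurnModel`, `OfferRules`, the coefficients and the 0.5 cut-off are invented purely to show a rule and a model score combining in plain Java.

```java
// Hypothetical stand-in for a generated model class; coefficients are invented.
final class ChurnModel {
    static double score(double balance, int monthsOnBook) {
        double logit = -1.5 + 0.0002 * balance - 0.03 * monthsOnBook;
        return 1.0 / (1.0 + Math.exp(-logit));
    }
}

// Hypothetical decision logic, written as plain Java rather than in any
// particular rules engine's syntax.
final class OfferRules {
    static String decide(double balance, int monthsOnBook, boolean optedOut) {
        // A pure business rule: no model needed when the customer opted out.
        if (optedOut) {
            return "NO_CONTACT";
        }
        // The model is called inline, in the same JVM as the rules.
        double churnProbability = ChurnModel.score(balance, monthsOnBook);
        // A rule that uses the model score (threshold is an assumption).
        if (churnProbability >= 0.5) {
            return "RETENTION_OFFER";
        }
        return "STANDARD_OFFER";
    }
}
```

The point of the design is that the model is just another method call from the rules’ point of view, so there is no service hop or separate scoring environment between the rule and the score it depends on.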
This is a really interesting product. For companies that have Base SAS addicts in their analyst community (and most do) it allows those folks to continue using Base SAS while generating an IT-friendly deployment of the resulting model. As no one ever knows both environments (SAS people don’t know modern IT platforms and IT people don’t know SAS) this really simplifies matters. The product handles testing (using the same SAS datasets the modelers used to verify the SAS scripts) and generates input, output and model classes ready for compiling and deployment to standard Java environments.
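As a rough picture of what input, output and model classes plus a dataset-based check could look like, here is a minimal sketch. The class names, fields and scoring formula are all invented, and the check assumes the reference scores from the SAS dataset have already been loaded into an array; none of this is the product’s actual generated code or test harness.

```java
// Hypothetical input record; field names are invented for this sketch.
final class ScoreInput {
    final double tenureMonths;
    final double lateFlag;

    ScoreInput(double tenureMonths, double lateFlag) {
        this.tenureMonths = tenureMonths;
        this.lateFlag = lateFlag;
    }
}

// Hypothetical output record holding the computed score.
final class ScoreOutput {
    final double score;

    ScoreOutput(double score) {
        this.score = score;
    }
}

// Hypothetical model class; the formula is invented.
final class ScoreModel {
    ScoreOutput score(ScoreInput in) {
        return new ScoreOutput(600.0 + 2.0 * in.tenureMonths - 50.0 * in.lateFlag);
    }
}

final class ParityCheck {
    // Counts rows whose Java score differs from the reference score
    // (assumed to come from the modelers' SAS dataset) by more than tol.
    static int countMismatches(ScoreModel model, ScoreInput[] rows,
                               double[] expected, double tol) {
        if (rows.length != expected.length) {
            throw new IllegalArgumentException("rows and expected must have the same length");
        }
        int mismatches = 0;
        for (int i = 0; i < rows.length; i++) {
            double actual = model.score(rows[i]).score;
            if (Math.abs(actual - expected[i]) > tol) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

A result of zero mismatches over the modelers’ own verification data is the kind of evidence that lets IT deploy the Java version without having to understand the SAS original.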
Products like Enterprise Miner let you deploy models for in-database or in-warehouse execution and support widely used industry standards like PMML, giving you lots of deployment options. Dulles Research gives this same flexibility to models built in Base SAS. You can try the product out at http://demo.dullesopen.com/carolina/servlet/ where you can paste your Base SAS code and see the Java it would generate.