SAS announced SAS Rapid Predictive Modeler today. Some time ago I got a pre-release look at this most interesting product.
Today SAS sees quantitative modelers developing and validating models, working with database architects who manage the data preparation tasks. Like me, they also find that business analysts work on applying the model to specific problems (e.g. customer retention, customer acquisition) and on using model results. SAS sees a need for business analysts to be able to create models quickly, or in large numbers, without having to rely on a potentially limited analytic modeling resource. Yet these analysts don’t understand the techniques involved, don’t have experience assessing attributes for predictive power and don’t have time to do detailed modeling or fine tuning. So SAS Rapid Predictive Modeler (RPM) is designed to be complementary, allowing the business analyst to become the driver and create new models in a rapid, automated way. To achieve this, RPM is delivered as a task that is being added to SAS Enterprise Guide and the SAS Add-in for Office. When the RPM task is run, it calls SAS Enterprise Miner functions to automatically fit a variety of algorithms and select the best model. Business analysts can then register this model, and a quantitative analytic modeler can use SAS Enterprise Miner to extend the analysis, compare it to other models, refine it and so on.
The primary objectives of RPM then are to:
- Empower business analysts to create predictive models quickly
- Provide self-sufficiency with an easy-to-use modeling engine and report generator
- Integrate analytics with BI for decision-making
A user of the SAS Add-in for Office or SAS Enterprise Guide sees the new task in their usual menus. When selected, the task analyzes the data, selects variables, automates the creation of a number of models and then returns the results from the best fitting model. Some pre-built transformations (like treating a field with only two values as a binary field, or correctly handling one with time series data) are applied automatically. The automated task builds a process to split the data into training and validation sets, apply modeling techniques, merge and compare results and so on. In other words, RPM does many of the things a data miner would do with a new dataset: partitioning the data, trying several transformations, imputing missing values, pre-selecting variables etc. All of these steps are automated and happen behind the scenes for the business analyst. The resulting data mining process, however, exists in SAS Enterprise Miner so that it can be edited, customized and extended.
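The general pattern described above (partition the data, fit several candidate models, compare them on the validation set and keep the best) can be sketched in a few lines of Python. This is only an illustration of the idea, not how SAS RPM or SAS Enterprise Miner is implemented: the candidate models (a majority-class baseline and one-variable threshold rules), the accuracy measure and every function name here are assumptions made for the example.

```python
import random

def split(rows, validation_fraction=0.3, seed=0):
    """Shuffle rows and partition them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit_majority(train):
    """Baseline candidate: always predict the most common 0/1 label."""
    labels = [y for _, y in train]
    majority = 1 if 2 * sum(labels) >= len(labels) else 0
    return lambda x: majority

def fit_stump(train, feature):
    """Candidate: split one feature at its training mean and predict
    the more common label on each side of the split."""
    mean = sum(x[feature] for x, _ in train) / len(train)
    above = [y for x, y in train if x[feature] > mean]
    below = [y for x, y in train if x[feature] <= mean]
    hi = 1 if above and 2 * sum(above) >= len(above) else 0
    lo = 1 if below and 2 * sum(below) >= len(below) else 0
    return lambda x: hi if x[feature] > mean else lo

def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def select_best(rows, features):
    """Fit every candidate on the training set and return the one
    with the highest accuracy on the validation set."""
    train, valid = split(rows)
    candidates = {"majority": fit_majority(train)}
    for feature in features:
        candidates["stump:" + feature] = fit_stump(train, feature)
    scores = {name: accuracy(m, valid) for name, m in candidates.items()}
    best = max(scores, key=scores.get)
    return best, candidates[best], scores
```

Each row is a `(features_dict, label)` pair with a 0/1 label, so a call looks like `select_best(rows, ["tenure", "parity"])`. The point of the sketch is that the analyst only supplies the data and the target; the comparison between candidates happens without any manual tuning.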
The result of executing the task shows the user the most predictive variables, how well the selected model predicts the target variable, lift curves, fit statistics and so on. A scorecard representation is provided that is particularly nice as it shows how the different elements contribute to the model. The models can be loaded into SAS Model Manager and SAS Enterprise Miner. Through SAS Metadata the results can be integrated with other components of the SAS Business Analytics stack for decision making. The SAS Add-in for MS Office and SAS Enterprise Guide also include a Model Scoring task to score new data using a registered RPM model. Over time the range of data mining tasks will be extended to meet different needs.
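For readers less familiar with lift, the number behind a lift curve is simple: rank the cases by model score, take the top slice, and compare its response rate with the overall response rate. The sketch below uses that standard definition; the function name and the `fraction` parameter are illustrative, and it is not taken from the SAS output.

```python
def cumulative_lift(scores, outcomes, fraction=0.1):
    """Lift in the top `fraction` of cases when ranked by model score.

    Lift = response rate among the top-scored cases / overall response rate.
    `outcomes` holds 0/1 values, where 1 means the case responded.
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal in length")
    overall_rate = sum(outcomes) / len(outcomes)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no responders")
    # Rank cases from highest to lowest score.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    return top_rate / overall_rate
```

A lift of 1.0 means the model does no better than picking cases at random, and plotting the value for increasing fractions gives the cumulative lift curve.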
One key thing is that SAS RPM creates a modeler-friendly process flow and specification under the covers. This matters because business analysts are already building models today without exposing them to the modeling team. Models created using SAS RPM tasks, however, are visible to the analytic team and extensible, allowing analytic teams to collaborate with and empower business analysts. Analysts can create a large number of models quickly without these becoming black boxes. Quantitative modelers can edit and extend the models over time, improving both collaboration and productivity.
RPM models are easy to deploy. The models are saved as Base SAS code that may be executed on any Base SAS installation. RPM users can register the models to the SAS Metadata Server for direct use in other products such as SAS Enterprise Guide, SAS Data Integration Studio, and SAS Model Manager. These products can automate the execution of score code and the deployment to other systems. SAS RPM score code is also fully compatible with the SAS Scoring Accelerators for Netezza, IBM DB2, and Teradata.