I caught up with SAS recently to discuss their high-performance computing products. These products are designed to address some known customer challenges: underutilized computing resources, rapidly growing data volume and complexity, unnecessary data movement, and slow time to results because the underlying hardware infrastructure is maxed out. Add in a general increase in analytic demands, and the need for higher-performance analytic infrastructure is clear.
This has led SAS to focus on four areas:
- SAS Grid Computing (starting about 5 years ago)
- SAS In-database (starting 2.5 years ago with Teradata; see this post)
- SAS In-memory analytics (coming soon)
This update focused on grid computing. At SAS customers, the IT department wants a centrally manageable SAS infrastructure that scales out while delivering high availability, workload management and scheduling. Meanwhile, their business counterparts want results faster despite a growing number of users and ad hoc jobs to support. SAS Grid Manager is SAS’ product offering here. It uses Platform Computing’s technology to create a distributed grid environment that provides parallel job execution across multiple servers with shared physical storage. SAS Data Integration Studio and SAS® Enterprise Miner™ are automatically tailored for parallel processing in a grid computing environment. Other SAS solutions, including SAS® Enterprise Guide® and SAS® Risk Dimensions®, can be set up to automatically submit SAS jobs to a grid of shared computing resources. Everything else, such as Base SAS code, can be “grid-enabled” with a few lines of code.
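To make the idea of parallel job execution concrete, here is a minimal conceptual sketch in Python. It is not SAS code and does not use SAS Grid Manager or Platform Computing's APIs; the names `run_job` and `run_on_grid` are hypothetical, and a local thread pool stands in for the grid's servers. It only illustrates the general pattern of handing independent jobs to a pool of workers and collecting the results.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_id):
    """Stand-in for one independent job (hypothetical workload)."""
    # A toy computation so that each job has some work to do.
    return job_id, sum(i * i for i in range(job_id * 1000))


def run_on_grid(job_ids, workers=4):
    """Submit independent jobs to a pool of workers and gather the results.

    The thread pool is only an analogy for grid nodes; a real grid also
    handles scheduling, workload management and high availability.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands jobs to whichever worker is free and returns
        # results in the order the jobs were submitted.
        return dict(pool.map(run_job, job_ids))


results = run_on_grid(range(1, 9))
print(sorted(results))  # IDs of the jobs that completed
```

The pattern only pays off when the jobs are independent of one another, which is why ad hoc jobs from many users are a natural fit for this kind of setup.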
Examples include a portfolio risk analysis in which scoring 400,000 loans was reduced from 3 hours to 10 minutes, and another in which processing time for customer behavior models (segmentation, propensity and retention models) was reduced from 11 hours to 10 seconds. The SAS Grid Computing framework allows customers to link a SAS Metadata Server to multiple SAS servers with shared storage/SAN, as well as to relational databases, and to use this as an access point for a range of different modeling tools.
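To put those two examples on a common scale, the quoted times can be converted to speedup factors. The short Python sketch below does only that arithmetic; the helper name `speedup` is mine, and the inputs are just the before and after times quoted above.

```python
HOUR = 3600   # seconds
MINUTE = 60   # seconds


def speedup(before_seconds, after_seconds):
    """Return how many times faster the 'after' run is than the 'before' run."""
    if before_seconds <= 0 or after_seconds <= 0:
        raise ValueError("durations must be positive")
    return before_seconds / after_seconds


# Portfolio risk scoring: 3 hours down to 10 minutes.
loan_scoring = speedup(3 * HOUR, 10 * MINUTE)

# Customer behavior models: 11 hours down to 10 seconds.
behavior_models = speedup(11 * HOUR, 10)

print(f"Loan scoring: {loan_scoring:.0f}x faster")               # 18x
print(f"Customer behavior models: {behavior_models:.0f}x faster")  # 3960x
```

That works out to roughly 18x for the loan scoring and 3,960x for the customer behavior models.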
In-database continues to expand, with more databases supported for SAS Scoring Accelerator, SAS Analytics Accelerator and SAS/ACCESS. Partners include Aster Data, EMC (Greenplum), IBM, Netezza, Oracle and Teradata. With Teradata, its longest-standing partner in this area, SAS continues to push more and more modeling tasks into the database. Catalina Marketing, for instance, is a big in-database analytics user, built around a Netezza data warehouse and SAS software for predictive analytics and model management. This not only allowed them to score models in 60 seconds but also made it easier to use more algorithms without worrying about downstream deployment, which increased the number of models built by 10% with the same staff.
There’s more to come soon in the various areas SAS has grouped as high performance but most of that was under NDA and will have to wait for another post.