First Look – Revolution Analytics

May 6, 2010

I recently got my first formal briefing from REvolution Computing. REvolution has been around for about 2 years. Originally they focused on bringing parallel computing power to R and providing some consulting around the language. They raised some new funding recently and now have a new management team, including CEO Norman Nie (co-founder of SPSS) and CTO David Champagne (previously chief architect of SPSS). Today (May 4), REvolution Computing has become Revolution Analytics, with a new vision, product roadmap and academic program.

They see a “perfect storm” for predictive analytics and R: exponential growth in the volume of data, statistics becoming pervasive in academia (even at an undergraduate level), academic users overwhelmingly using R, and commercial tools now owned largely by very big companies, which creates a market gap. Their objective is to make Revolution Analytics into a world-class analytics company, leveraging R.

Now R is very complete, perhaps the most complete statistical programming language, with no known statistical technique that cannot be expressed in it. As a result it has become the hard-core researchers’ tool. So expansion will come from:

  • Ease of use to allow non-PhDs to use it
  • Power and scalability, to eliminate some of its performance issues and increase its use in commercial/real-world projects
  • Enterprise readiness in terms of support etc.

Revolution Analytics’ roadmap is designed to build on the open source R routines in a series of steps.

  1. Revolution R 3.0 – a 64 bit IDE/execution environment for R (currently available)
  2. “Big data” capability, rolling out to alpha testers in a month or so for a late summer GA
  3. Web services to run R-based applications
    An ability to deploy an R model as an executable element that can be used remotely. Deployment is “cloud ready” but aimed at multi-CPU server systems.
  4. Thin client GUI for modelers
    A browser-based modeling environment for those building models but who are perhaps less sophisticated modelers than core R users.
  5. SAS to R translator

Revolution has enhanced the base R product with a Community edition. Revolution also offers an Enterprise version, which they are enhancing with new features. They use an “open core” approach with CRAN-R (base package) at the center and hundreds of open source community packages. Most of what they add is going to be proprietary, though some components will be a mix of proprietary and open source. They also recognize that they need to have a complete stack and that this means partnering at the data layer (database/data warehouse integration) and at the BI/presentation layer.

They are also launching a new program for academics. There is lots of overlap between the R academic community and the open source community. With academic budgets even tighter than usual, academics are focused on cost, but they aren’t particularly worried about their tools being open source, so Revolution Analytics is going to offer them the full commercial product for free. They want schools and colleges to use Revolution R, not just R. This will help them leverage thought leaders among the academic community who do consulting on the side. They also hope their code partners program will mean that the routines coming out of this academic program use the Revolution tools, making those routines easier to adopt outside the academic community.

All of this is great but there are obvious challenges. For one, R has traditionally been the preferred tool for those who like to write their own algorithms. As analytics expands, most companies need to hire people who know what the routines do rather than PhD mathematicians who write new algorithms. Ensuring that R fits here, with these more use-oriented and less research-oriented folks, is key for Revolution Analytics. One of their goals is to make R accessible to people who are not R users, and one way they will try to expand R’s user base is through their upcoming GUI.

There will also be a tension between Revolution and the open source community as Revolution generates new IP and makes it proprietary, especially when others in the research community build on proprietary routines. As Revolution develops more IP, the community will expect to see them release more of it back, and this is certainly their intention: first by releasing the APIs, then the actual code. Unlike previous efforts to build on R, they are not forking the core R engine. Revolution expects always to build on the core language and add things when necessary. For instance, the enterprise may need a package that has not yet been created by the academic R community, due to differing interests. They also plan to make it easier for folks to use R from a community perspective by releasing packages that make it easy to find add-ons/extensions. Revolution is also launching a new community site today, Inside-R.org, which will offer a central place for the R community to interact online, as well as a repository of tips, blogs, and R packages. Revolution’s open-source R community distribution will also be available for download at the site.

While there are other open source data mining tools (Knime and RapidMiner, for instance), these are not based strongly on R. At the same time, they feel that SAS's and SPSS's support for R is the result of customer pressure to use R, and that it strengthens Revolution's position rather than weakening it.

The initial focus is, frankly, mostly around decision support, not decision management, which is disappointing, but I expect to see that evolve, especially as they release the service-based components. I am also interested to see how they might build on R’s already strong PMML support.
