It’s been a while since I last got an update on RapidMiner (RapidMiner 6 was the last version I reviewed), and they have some new positioning and product capabilities I wanted to catch up on. RapidMiner began as an open source product company, founded in 2007. They moved to an open core model in 2010, now have over 600 customers and 250,000 users, and are headquartered in Cambridge, MA.
RapidMiner increasingly sees that analytics, and analytics on Big Data, is not just a source of competitive advantage but has become a requirement. They position RapidMiner, their product, as an easy-to-use, modern analytics platform that improves productivity across a wide range of computing environments as well as with a very wide range of data. The product is open source and the company works hard to leverage the input from their open source community.
RapidMiner sees an increased focus on data scientists and business analysts rather than the more traditional “quants”. There are some significant skills gaps in this market, and RapidMiner believes that data/business analysts with domain expertise as well as some computer skills are central to resolving these gaps. Targeting these users, RapidMiner tries to accelerate time to value through templates and one-click deployment, makes it easy to connect to all of an organization’s data sources, and simplifies the process through code-free development.
One particularly interesting opt-in feature is that RapidMiner anonymously collects metadata about analytic processes from their users and then uses machine learning algorithms to analyze this data and make recommendations and suggestions to users. These recommendations cover steps, operators, parameters and more.
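To make the idea concrete, here is a minimal sketch of how a "next operator" suggestion could be derived from collected process metadata. This is not RapidMiner's actual method, which was not described in detail. It uses simple transition counts as a stand-in for the machine learning mentioned above, and the process logs, operator sequences and function names are all made up for illustration.

```python
from collections import Counter, defaultdict


def build_transition_counts(processes):
    """Count how often each operator is directly followed by another."""
    counts = defaultdict(Counter)
    for operators in processes:
        # Pair each operator with the one that comes right after it.
        for current, following in zip(operators, operators[1:]):
            counts[current][following] += 1
    return counts


def recommend_next(counts, current_operator, top_n=3):
    """Return up to top_n operators that most often follow current_operator."""
    followers = counts.get(current_operator, Counter())
    return [op for op, _ in followers.most_common(top_n)]


# Hypothetical anonymized process logs: each list is one user's operator chain.
processes = [
    ["Read CSV", "Set Role", "Decision Tree", "Apply Model"],
    ["Read CSV", "Set Role", "Naive Bayes", "Apply Model"],
    ["Read CSV", "Replace Missing Values", "Set Role", "Decision Tree"],
]

counts = build_transition_counts(processes)
print(recommend_next(counts, "Set Role"))  # ['Decision Tree', 'Naive Bayes']
```

A real system would presumably also use parameters, data characteristics and longer context rather than only the immediately preceding operator, but the basic shape is the same: aggregate what many users did, then rank the likely next steps.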
The platform includes:
- RapidMiner Studio, a fat client desktop application for developing graphical flows that define analytic processes
- RapidMiner Radoop, which deploys computation (operators in a RapidMiner project) to Hadoop for distributed execution, with the aim of maximizing performance given your current Hadoop environment
- RapidMiner Streams, which works similarly using Apache Storm
- RapidMiner Cloud, an elastic compute environment in the cloud (blogged about here)
- RapidMiner Server for enterprise analytics, which includes a repository, a web services API for integration and a web app designer for simple UIs that include analytics
The overall platform has over 1,500 operators, including hundreds contributed by the open source community. They support a wide range of data sources, many compute engines and various deployment approaches that allow them to be integrated into a wide range of BI and programming environments.
A new release of RapidMiner has just been announced (see Ingo’s blog post). RapidMiner Studio 6.4 includes improved collaboration and documentation of analytic workflows, extended model management, more seamless integration of R and Python code, support for Splunk and a template-based approach for adding your own extensions.
RapidMiner is one of the vendors in our Decision Management Systems Platform Technology Report, and you can get more information on the product at RapidMiner.com.