Open Source

The third post in my series on standards in Predictive Analytics is on R, a hot topic in analytic circles these days. R is fundamentally an interpreted language for statistical computing and for the graphical display of results associated with these statistics. Highly extensible, it is available as free and open source software. The core environment [...]

I am working on a paper, for publication in early 2014, on the role of standards such as R, Hadoop and PMML in the mainstreaming of predictive analytics. As I do so, I will be publishing a few blog posts. I thought I would start with a quick introduction to the topic now and then [...]

First Look: RapidMiner 6 Update

I last got an update on RapidMiner back in March 2012. The company was founded in 2007 and has 40 employees worldwide. The free product has reached 10,000 downloads a month and the product has some 35,000 active deployments with over 500 customers worldwide (40 added in the last 6 months). The company recently secured [...]

First Look: Revolution Analytics 7

In the 9 months or so since I last wrote about Revolution Analytics they have released a new version of Revolution R Enterprise 7. This is focused on delivering “Write once deploy anywhere.” R, of course, continues to expand in popularity with the recent Rexer Data Mining survey (reviewed here) putting the percentage of data [...]

Rexer Data Mining Survey 2013

Karl Rexer just published his summary of this year’s Rexer Data Mining survey. As always there’s lots of good information but here are my favorite takeaways: As in our own work on Predictive Analytics in the Cloud, the survey found that a focus on customers and on customer experience/engagement was top of mind. CRM/Marketing is [...]

I got an update on Red Hat JBoss BRMS recently. I last wrote about them with release 5.2 back in 2011 and JBoss BRMS 5.3 is the current release and includes their support for business rules, business process and event processing (based on the Drools and jBPM open source community projects) with a repository, runtime [...]

First Look: Datameer

As part of an ongoing expansion of our ecosystem mapping to include more Hadoop-based products I recently got an update from Datameer. Datameer was founded back in 2009 by Stefan Groschupf, who was one of the original contributors to Nutch, the open source project that spun off Hadoop. Prior to starting Datameer, he and the [...]

First Look: Cascading Pattern

As part of some ongoing research on support for PMML I recently spoke with Concurrent. Concurrent is an enterprise software company focused on simplifying Big Data development on Hadoop. The company’s core product is called Cascading. This is a free, open-source, development framework for Apache Hadoop designed to let developers build sophisticated data processing applications [...]

First Look: OpenL Tablets

I got an update from Exigen Services recently on their OpenL Tablets business rules product. Exigen Services is a global IT company focused on core systems transformations and management consulting. They have 10 delivery centers around the world and about 1,500 professionals. They work in most industries with a focus on insurance, financial services, pharma, [...]

Stuart Wells, the CTO of FICO, came up to give the keynote and talk about FICO’s big cloud announcements: the FICO Cloud-based Decision Management Platform (announcement here), the FICO Analytic Cloud (announcement here and sign up here), and customer engagement applications on this cloud platform (announcement here, discussed also in this blog post). Stuart began with a couple of stories. [...]

Phil Francisco came up to talk about the new PureData System for Hadoop. He began by pointing out that just because something is open source does not mean there are not real costs involved. To make Hadoop adoptable and usable for enterprises, easier consumption is needed. Hence the PureData Hadoop appliance designed to simplify building, [...]

Since I last wrote about Revolution Analytics (back when they announced their Netezza relationship) they have added some new management and are growing fast. In particular they see lots of big data experimentation evolving into actual projects. To recap, Revolution Analytics is a commercial analytics company based on the open source R statistics language. The [...]

Rexer Analytics have just released the results of their 2011 survey – the 5th annual one, answered by over 1,300 data miners from 60 countries in the first half of 2011. The survey continued to show that CRM/Marketing, Financial and Insurance are the major commercial focus areas for data mining. It also reiterated the top [...]

First Look – Rapid-I

Rapid-I provides open source software for predictive analytics, data mining and text mining. Incorporated in 2006, they are based in Dortmund, Germany and have been working on RapidMiner since 2001. They have over 35,000 production deployments and more than 400 customers in 40 countries. Banking and financial services is their largest market followed by Pharma [...]

KNIME is an open source data analytics product based in Zurich, Switzerland that I last wrote about a couple of years ago. They have been working away on the product since then (having started development in 2004 and released their enterprise components in 2010) and have been refining their business plan at the same time. [...]

First Look – Talend

Talend has been around for about 6 years and the original focus was on “democratizing” data integration – making it cheaper, easier, quicker and less maintenance-heavy. They originally wanted to build an open source alternative for data integration. In particular they wanted to make sure that there was a product that worked for smaller companies [...]

Steve Mills opened up the discussion talking about Big Data, making the point that the art of the possible when it comes to data has been growing steadily for many years – though the current explosion in data is pretty impressive. For instance 1.3B RFID tags in 2005 and 30B in 2010, 4.6B mobile phones [...]

I got an overview of JBoss’s new intelligent, integrated enterprise approach as well as some of their new product announcements. They are adding new data services, BPEL support, productizing their event/rules combination and adding some new connectivity elements. The world has obviously changed in the last few years. The fully automated processes of the past, JBoss [...]

CRISP-DM – Cross-Industry Standard Process for Data Mining – is the best known data mining methodology out there. It’s been around a long time but ownership/management of the consortium that developed it has gotten complex recently (the CRISP-DM.ORG site is down at present for instance but you can get some details in the CRISP-DM Wikipedia [...]

Update – Zementis

I got a chance to catch up with Zementis. I last got an update from them on the product back in May 2010 when they released 3.0 with their support of Drools as well as PMML. Zementis is reporting good momentum with large customers such as the US Army and Verizon as well as a [...]