Karl Rexer of Rexer Analytics sent me a note the other day about some early results of the 3rd Annual Data Miner Survey in the Spring of 2009. Like the previous surveys (I blogged about the 2008 survey), it examined data miners’ algorithms and tools, opinions and views, types of data analyzed, challenges encountered, and solutions provided. 710 data miners from 58 countries participated in the survey. The details won’t be released for a little while but Karl presented some highlights at the recent SPSS user group and he and I thought my readers might enjoy seeing them:
- The most commonly used algorithms are, once again, regression, decision trees, and cluster analysis. These three come in way ahead of any others.
- Half of data miners feel that their results are helping to drive strategic decisions and operational processes.
- CRM/Marketing is still the number one commercial focus area among responders (and one where SPSS dominates), followed by financial services (where SAS does).
- Data quality remains data miners’ top challenge but this year’s survey saw a drop in the number of data miners listing data quality and data access as challenges.
- Explaining data mining and its value to others remains a challenge.
- SPSS and SAS dominate the survey in terms of tools used, but Statistica and R are coming along.
- The survey is dominated by people who do data mining (obviously): half of responders spend most or much of their time data mining, and almost all say they spend at least some. Even among this group, though, 19% feel that their company has minimal or no data mining capability.
- Only 8% are offshoring analytics work, much lower than I would have expected.
- Disappointingly most miners, nearly 60%, still measure project success by model performance. Efficiency, insights, revenue growth and ROI were all less common. As I said the other day, business people don’t care about lift curves, they care about results. Too many data miners forget this.
- Data miners are overwhelmingly happy with their primary tools. SPSS users score especially highly (100% of IBM SPSS Modeler users are satisfied or very satisfied). There’s clearly not going to be much movement among existing users either, with only tiny percentages saying they are likely to change tools.
I look forward to getting the full survey results and I will link to them when Karl publishes them. Email Karl at firstname.lastname@example.org to get the results when he does.