Karl Rexer has just released the results from his annual survey of data miners – RexerAnalytics.com/Data-Miner-Survey-Results-2010.html. This year 735 data miners responded to an extended survey. Interesting facts from this year’s results:
- CRM and marketing remain the top focus area, and goals such as retaining customers and understanding them better also rank highest.
- Decision trees, regression analysis and cluster analysis repeat as the core triumvirate at the heart of most data miners’ palette of techniques, with much higher rates of use than the next technique.
- Ensemble models are clearly gaining ground, with over 20% reporting some use of ensemble techniques, particularly among consultants.
- Text analytics also showed well, with over a third identifying some use of text analytic techniques and over half of them combining this analysis with structured analytics (excellent).
- R’s popularity has been growing steadily and this year saw it top the list of tools used.
- Explicability of models was rated important by the vast majority, another piece of excellent news.
- From my perspective, far too many miners still report using multiple tools – an average of 4.6 EACH! This worries me, as I feel it makes it hard to “industrialize” analytics in an organization – to create and manage reusable modeling assets that can easily be deployed into production.
- As usual, explaining data mining to others and access to/cleanliness of data were top issues, with data miners sharing some of their suggestions for overcoming these – check out the summary here.
Don’t forget, if you have ideas for this year’s survey, send them to Karl or to me (I will forward them on to him).