Syndicated from BeyeNetwork
I recently had the chance to talk with Karl Rexer, President of Rexer Analytics, about the firm's newly launched Data Miner Survey. This is the third year they have run it, and I blogged about last year’s results back in October.
JAMES: This is your third year conducting this survey. What new information are you hoping to learn this year?
KARL: First, we’re excited to track responses year to year, and to start to see the trends in the industry. Last year we saw an increase in the use of time series analysis and the number of data miners using large data sets (of a million records or more). Each year our participants suggest new questions, and the people I speak with at conferences also provide great feedback and new ideas. This year we’ve added some new questions to inquire about how people feel the current economic situation could affect the industry, how data miners go about determining the success of their projects, and how companies view the sophistication of their own analytic capabilities.
JAMES: How many people participated last year?
KARL: 348. It was an increase from the first year. And we anticipate even more this year.
JAMES: You mentioned questions about the current economic climate. What specifically are you looking at?
KARL: Our primary interest is to see whether data mining is following some of the general trends that we see in business: whether companies are moving their data mining capabilities to less expensive offshore options and whether they are looking to outsource some of their capabilities rather than bring on full-time staff. We are also curious as to what the mood in the community is about prospects for the coming year, whether there are anticipated increases or decreases in the number of data mining projects and where those changes may occur.
JAMES: What things have surprised you the most in conducting this survey?
KARL: Well, first of all, just how much work it takes to conduct the survey every year! But seriously, just how versatile data mining has become as a field – how many industries it has penetrated and how much influence it can have on major business operations, product and service development, and strategic thinking. Those of us in the field knew that it had this kind of potential, but I think in these surveys we really see that potential coming to fruition. Data miners themselves have likewise increased their versatility, employing a wide suite of analytic techniques and algorithms, as well as becoming more and more savvy about how their analytics can be applied in very practical ways in response to business needs. But there are still significant challenges.
JAMES: Like what?
KARL: Mostly in the quality and availability of data. Three-quarters of data miners cite “dirty data” as a typical challenge for their organization, and over half mention the availability of and access to data.
JAMES: Turning to the software side of things, what are the primary things data miners consider when selecting their analytic software?
KARL: Unsurprisingly, they want reliability, efficiency, and accuracy. Specifically, they want their data mining tools to be dependable and fast. Additionally, they want their tools to be able to handle and manipulate very large data sets, have output that is easy to interpret, and have the ability to automate very repetitive tasks.
JAMES: Do data miners tend to be loyal to one software package or use different tools for different needs?
KARL: It varies depending upon a lot of things, such as the context in which the data miner works, their financial resources and their facility with open source tools. Although there are clearly a couple of major, well-known players in the data mining software space, several tools have gained increased visibility over the past couple of years. Last year, the average number of tools used by each data miner was just over 5, so there are certainly many people with substantial flexibility and a willingness to utilize a broad range of resources. We also added a question this year to look at whether people plan to stick with their primary analytic software tool or would consider switching.
JAMES: Well, thank you for speaking with me, and good luck with your current survey. I look forward to seeing what you learn from the data mining community this year.