I have been following the Data 2.0 Summit folks recently. The Third Annual Data 2.0 Summit 2013 in San Francisco is a one-day conference, and speakers include Anthony Goldbloom, CEO of Kaggle, who is always worth listening to. You can get 20% off your Data 2.0 Summit pass by clicking this link. Anyway, the theme this year is democratizing data.
Now when most people think about “democratizing data” they focus on making data accessible to more users or on making tools for analyzing this data available to more people. While this might work for some companies, especially the knowledge-worker-heavy companies where the people who write about this topic tend to work, I don’t believe it is either the only or the most effective way to democratize data in most large organizations. Let’s take a step back and ask “why,” as in “why do we want to democratize data?” Well, if we want to democratize data we clearly believe a few things to be true:
- That data is helpful because it helps us make better decisions
- That decisions made based on data are more likely to be accurate than those driven only by “gut” or “policy”
- That circumstances are creating more opportunities for improving decisions with data
- That the decisions that can be improved with data are numerous and widespread in an organization
Let’s tackle these one at a time.
If we want to use data more effectively we need to apply it to more decisions. That much is clear and well understood. What’s less well understood is the importance of really understanding the decisions that you want to improve with data. Far too many projects focus on the data first, hoping to figure out later which decisions will be improved. It is much more effective to begin with the decision: identify the decisions we must improve in order to improve our results, make sure we understand them, and then figure out how data might help improve the quality of those decisions.
That decisions made based on data are more likely to be accurate than those driven only by “gut” or “policy” is well established; just check the examples in Super Crunchers. The problem, of course, is that we sometimes have to take account of expert opinion (experts who are ignored will often ignore analytics that contradict them) and apply policies that may not be supported by the data but are “the law.” Once again, a thorough understanding of the decisions we are trying to improve will help us see where we can apply data, what the constraints are likely to be, and more. It’s not enough to say “let’s use data to improve customer retention.” We have to be specific about which decisions we mean (when to make proactive offers to at-risk customers, say) and about what the role of data and analytics will be (helping us identify the top 100 each week so we can call them, for instance).
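To make the “top 100 each week” idea concrete, here is a minimal sketch of what that analytic role could look like. It assumes each customer already has a churn-risk score from some model; the `Customer` class, the `churn_risk` field and the `weekly_call_list` function are all hypothetical names invented for illustration, not part of any particular product.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: str
    # Hypothetical model score in [0, 1]; higher means more at risk of leaving.
    churn_risk: float


def weekly_call_list(customers, top_n=100):
    """Return the top_n customers with the highest churn risk, highest first."""
    return heapq.nlargest(top_n, customers, key=lambda c: c.churn_risk)


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    sample = [
        Customer("A", 0.12),
        Customer("B", 0.87),
        Customer("C", 0.55),
        Customer("D", 0.91),
    ]
    for customer in weekly_call_list(sample, top_n=2):
        print(customer.customer_id, customer.churn_risk)
```

The point is not the ranking itself, which is trivial, but that the decision (who gets a proactive call this week) is defined first and the data is then shaped to serve it.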
The explosion of data sources is real, and it’s driving the whole discussion of Big Data in most organizations. Those who can use more data sources and who have a greater ability to extract meaning and value from these data sources (cheaply) will have an advantage. Simply accessing, investigating and managing all these data sources is prohibitively expensive, though, even if you use open source software or highly scalable commodity hardware platforms. If we know what decisions matter to us, however, we can focus on getting good at handling the data sources that will make a real difference to our business, focusing our Big Data efforts effectively.
Finally we need to realize that the decisions that can be improved with data are numerous and widespread in an organization. While it is true that better visualization and data communication tools help individuals with their own decisions, these more numerous decisions require a different approach. We are talking about decisions made in the call center, in retail stores, by drivers and front-line staff. These folks don’t need complex visualizations nor do they have time to do investigation – they need smarter systems that use the data to make effective, useful, timely recommendations and suggestions. They need you to deliver this data not to them, but to systems that can help them manage these decisions in real-time – Decision Management Systems.
Democratizing data cannot just mean helping more knowledge workers have more fun with their query and visualization tools. It has to mean democratizing data-driven decision making throughout the organization and that will take a new generation of decision-making systems.