Musings on the challenges of big data in a year of serious hype
There's a reason you haven’t heard more than a handful of big data success stories in 2012. Handling big data correctly is hard, requires huge infrastructure and resource investments, and may not be worth it…yet. According to one survey in November 2012, 60% of businesses said it’s too early to tell if their big data project was successful and produced a proper ROI. It seems that so much of the hype around big data is focused on the technologies you need to buy and the talent you need to acquire (data scientist is the latest fad title), and not on what's most important: what you can do with the data, what value you can extract, what business decisions you can speed up or improve with all that data.
With companies jumping on the big data bandwagon to the tune of $28 billion this year, it's time to discuss why it might be best to ignore the hype for now and focus on reaping insight from the data you already have. Here's why I'm not impressed with your big data:
You don't actually have big data.
The marketing hype can lead you to believe you have a "big data problem" when you really don't. Using the terminology incorrectly has the potential to harm your business, causing you to invest in unnecessary infrastructure when you may be able to leverage what you already have in place. Even Microsoft and Yahoo! have made this mistake. A recent investigation found that "misguided Hadoop installations" were processing small data sets of only 14GB – a total waste of resources. It's likely that you simply have "large data," which practically every company has. Large data – and even most big data right now – can be analyzed and visualized in real time using BI software like arcplan, without investing in in-memory appliances like SAP HANA, massive data warehouses like Teradata, NoSQL databases like Cassandra, or distributed processing frameworks like Hadoop. There are plenty of insights waiting to be uncovered in your data, large or small.
You don't actually need big data.
Is more always better? Some organizations have turned into data hoarders, eating up vast amounts of storage capacity with what may amount to useless information. If the big data you're collecting isn't relevant to any particular use case and doesn't enhance the information you're already analyzing (and achieving some benefit from), then it's probably not worth it. I hear a lot of people say "what if" and "we may need that data someday," and that's their justification for becoming slaves to their data, as Stephen Few, the IT innovator, educator, and consultant, has put it. We hosted a webinar last year called How to Be an Analyst because we see the lack of analytical skills as the real BI problem today. The problem isn't data volume; it's the inability of analysts to think critically, make connections in the data, spot trends, and act on them. We are facing a "walk before you run" dilemma when it comes to big data, and we have to master the small stuff before moving on to truly big data.
You're not maintaining data quality.
We've said it before on this blog, but you must invest in good data before big data. IT has spent so much time, energy, and money on a data governance regime, only to see it go out the window as new data sources are continually added. Structured and unstructured data has to be relevant and accurate from the beginning; otherwise you undermine the quality you've worked so hard to create, and you undermine the trust that users have in the data and therefore in the decisions made based on that data. Forgetting data quality is a surefire way to get ahead of yourself in the big data game.
Stephen Few recently wrote in his summer newsletter:
"Most BI companies want you to believe that your problems can be solved by collecting and storing more data. They don't encourage you to first understand and then use the data that you already have. Why? Because they only know how to build and sell you products for collecting and storing data and for accessing data at high speeds. They don't know how to help you make sense and use of data in meaningful ways. They want you to desire and buy what they sell, not to recognize and demand what you actually need."
I wanted to write this article because I truly believe the hype around big data has gotten out of control and is putting the cart before the horse. At arcplan, we work with our customers to help them master the small stuff first – extracting meaningful insights from their data, visualizing it in a way that makes sense to the right stakeholders, putting easy analysis in the hands of business users, and giving everyone 24/7 access to critical business metrics – all in the name of simplifying and accelerating decision-making. If you've ever heard our CEO Roland Hölscher speak, you've undoubtedly seen his passion for helping companies achieve their goals, big and small, regardless of whether their data is big or small. It's a passion we all share at arcplan. Despite our partnerships with big data technology players, we never advocate investing in these technologies unless a company is ready for them. Data is one of your company's most valuable assets, but only if you make it meaningful.