Big data is without a doubt one of the top five BI trends of 2012. The hype around big data has driven many companies to hoard massive amounts of structured and unstructured information in the hope of unearthing useful insight that will help them gain competitive advantage. Admittedly, there is significant value to be extracted from your company's growing vault of data; however, it is data quality – not necessarily quantity – that is your company's biggest asset. So here are three reasons why you should devote more of your IT budget to data quality:
1) Because good data quality sets the stage for sound business decisions.
Sensible business decisions should be based on accurate, timely information coupled with the necessary analysis. Decision-makers need to be equipped with facts in order to plan strategically and stay ahead of the competition – and facts are entirely based on having correct data. Though it’s not as "sexy" as big data, mobile BI, or cloud, data quality should be the foundation of all of these other initiatives.
Admittedly, achieving data quality is tough. Gartner analyst Bill Hostmann says, "Regardless of big data, old data, new data, little data, probably the biggest challenge in BI is data quality." It crosses department lines (both IT and business users must take responsibility), and processes that have multiple levels of responsibility often suffer from the "everyone and no one is responsible" conundrum. It's also a complex process that requires laying out common definitions (what is a customer, what are our conventions for company names – Inc. or no Inc. – for example), performing an initial data cleanse, and then keeping things tidy through ongoing data monitoring, ETL, and other technologies.
But ensuring that your data is timely, accurate, consistent, and complete means users will trust the data, and ultimately, that trust is the goal of the entire exercise. Trusting the data means being able to trust the decisions that are based on the data. Clean up the data you have in place first; then you can move on to a strategy that incorporates additional sources of big data.
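To make the "common definitions" step a little more concrete, here is a minimal Python sketch of one possible company-name convention: drop a trailing legal suffix so that "Acme, Inc." and "Acme" count as the same customer. The function name, the suffix list, and the sample names are all illustrative assumptions, not a standard; your own convention might just as reasonably keep the suffix.

```python
import re

# Hypothetical list of legal suffixes to strip; extend it to match your own convention.
_LEGAL_SUFFIX = re.compile(
    r"[,\s]+(inc|incorporated|corp|corporation|llc|ltd)\.?$",
    re.IGNORECASE,
)


def normalize_company_name(name: str) -> str:
    """Collapse stray whitespace and remove one trailing legal suffix."""
    collapsed = " ".join(name.split())
    return _LEGAL_SUFFIX.sub("", collapsed)


if __name__ == "__main__":
    # Made-up sample records that should all resolve to the same customer.
    for raw in ["Acme, Inc.", "Acme  Inc", "Acme Incorporated", "Acme"]:
        print(repr(raw), "->", repr(normalize_company_name(raw)))
```

A rule like this is only useful once IT and the business side have agreed on it, which is exactly the shared-responsibility problem described above.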
2) Because you have to.
So maybe reason #1 wasn't enough to convince you that, come budget time, you should put a little extra in the data quality column. What about the fact that poor data quality may be leaving you out of compliance with the law? Sarbanes-Oxley (SOX) mainly affects public companies, but private companies that may undergo a future merger, acquisition, or IPO should plan to comply as well. SOX requires that organizations maintain accurate information and prove it in regular audits by independent agents. Seen from this perspective, poor data can leave a company out of compliance and can result in fines, lawsuits, and worse.
3) Because you can't afford not to.
I know what you're expecting – a bunch of fear-mongering stories about companies discovering that data quality issues have cost them millions of dollars. We've already given you those stats in a previous post and you can read more stories in this oldie-but-goodie article from 2001. Reason #3 is something you have to figure out yourself.
Calculating your company’s COPDQ (Cost of Poor Data Quality) is difficult, but the factors to consider (according to Quality Control for Dummies) are the cost to find the errors, the cost to fix the errors, the cost of wasted effort due to bad data, and the lost productivity of workers fixing data problems rather than producing actual work. An easy way to estimate the cost is to go by the "rule of 10": a task costs 10 times more when the data isn’t perfect than when it is. So figure out what a particular task costs with clean data and multiply that by 10 to get a rough COPDQ. Sometimes that number is enough to convince an executive to make an investment in data quality, since you can prove ROI.
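If it helps to see the arithmetic, here is a rough Python sketch of both approaches: adding up the four cost factors, and the "rule of 10" shortcut. The class, the function, and every dollar figure are hypothetical placeholders; plug in numbers from your own organization.

```python
from dataclasses import dataclass


@dataclass
class PoorDataCosts:
    """The four cost factors listed above, in the same currency and period."""
    find_errors: float        # cost to find the errors
    fix_errors: float         # cost to fix the errors
    wasted_effort: float      # effort wasted because of bad data
    lost_productivity: float  # time spent fixing data instead of doing real work

    def total(self) -> float:
        return (self.find_errors + self.fix_errors
                + self.wasted_effort + self.lost_productivity)


def rule_of_10_estimate(clean_task_cost: float, multiplier: float = 10.0) -> float:
    """Rough COPDQ for one task: its cost with clean data times the multiplier."""
    return clean_task_cost * multiplier


if __name__ == "__main__":
    # Made-up example figures, for illustration only.
    itemized = PoorDataCosts(find_errors=4_000, fix_errors=6_000,
                             wasted_effort=3_000, lost_productivity=7_000)
    print("Itemized COPDQ:", itemized.total())
    print("Rule-of-10 COPDQ for a task costing 50:", rule_of_10_estimate(50))
```

The itemized total is the more defensible figure if you can gather the inputs; the rule-of-10 number is only a quick estimate for starting the budget conversation.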
Without trust in your current data, it makes no sense to try to tackle big data. So as you continue to hear the big data hype throughout 2012, remember that your time and resources are better spent worrying less about the size of your data and focusing more on continuously improving its quality.