Accuracy and reliability. Who can argue with that?
But the reality is that your business doesn't give a hoot whether we reach data purity if they don't believe it impacts them. I've seen a DQ "expert" lose all credibility for yelling data quality fire. That's why I would caution against throwing around high-level terms that risk implying we can merely make all our data reliable and accurate (whatever that is) and then our job is done. Of course accuracy and reliability are important concepts (IF you can measure them), but there are always data inaccuracies - that's life, sorry folks. Data is very much tied to human frailties.
The important question is - where are unreliability and inaccuracy impacting your business, and is that impact measurable in a trusted way? For example... what if 50% of my customer addresses have garbage in the state code and lack an accurate zip code, so those customers can't be contacted? You'd say that's awful. Is it? You'd tell your business about it, right? We need to do something about this!! What if they come back and tell you none of those customers have ever bought anything from us and they probably never will? Is it still a big problem?
What I'm getting at is this: if you want to make a difference in data quality, triage data based on impact to the business, then find the data quality issues that have a high probability of affecting the bottom line and/or decision making. That's not easy, I know. It's a lot harder than just running some data profiles and saying look - it's bad! And that's why data profiling is often where people stop. Oh, and a caution regarding the root cause of your data quality problems... it may not be a system but rather people. Data entry is most often compensated based on speed, not on accuracy.
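To make the triage idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Issue` class, the issue names, the record counts, and the revenue figures are invented for illustration, and "revenue at risk" is just one possible stand-in for business impact. Your business may care about a completely different measure.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """One data quality finding, e.g. from a profiling run (hypothetical structure)."""
    name: str
    affected_records: int      # records that fail the check
    revenue_per_record: float  # assumed yearly revenue tied to each affected record

    @property
    def revenue_at_risk(self) -> float:
        # Crude impact estimate: defect count weighted by business value.
        return self.affected_records * self.revenue_per_record


def triage(issues):
    """Rank issues by estimated business impact, highest first."""
    return sorted(issues, key=lambda issue: issue.revenue_at_risk, reverse=True)


# Made-up numbers: the biggest defect count belongs to records with no revenue.
issues = [
    Issue("bad state/zip on prospects who never purchased", 50_000, 0.0),
    Issue("wrong shipping address on active customers", 400, 120.0),
    Issue("duplicate accounts in loyalty program", 1_500, 8.0),
]

ranked = triage(issues)
for issue in ranked:
    print(f"{issue.name}: {issue.revenue_at_risk:,.0f} at risk")
```

With these invented numbers, the issue with by far the most bad records lands at the bottom of the list, because nobody buying anything is attached to it. A raw profile would have put it at the top.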
So yes, accuracy and reliability are super important, but so what? Can you measure them? Can you measure the impact to the business in a way they trust and care about? And even then, can you influence the solution to the root cause? Does your organization have a data governance working group where these problems can be effectively addressed?
If the answer to any of these questions is "no" then you may be wasting your time worrying about it. If the answer is no and your title has data quality in it, then it sucks to be you. Just get ready to run from the rocket launch pad when you know the heat shields were designed based on bad data.