Data Quality
Nov 21 2011
Data Quality 3: No More Data Corruption Excuses!
I’ve recently posted a couple of articles at this site on data quality; this is the final one in a series of three. Previous posts presented these ideas:
- Yes, there is a business case for improving data quality, and we’ve got business value examples. If you look for real money where you anecdotally know there are data quality problems, you’ll likely find it in high costs of data correction and rework, and savings related to business process improvements that reliable data enables.
- There are distinct things an organization can do to reap benefits of improved data management and data quality. (1) Get started in the first place, (2) find the tangible benefits, (3) cross the departmental silos that exist in every large organization, and (4) promote sound data management practices.
- Impacts of poor data quality can seem abstract in a large organization. They aren’t for a small business owner. Say you have a carpet cleaning business. What if you knew 10% of your customer bills were wrong, but you weren’t sure by how much or in which direction? First you’d panic. Then you’d rush to fix the problem.
Nov 11 2011
Data management success means overcoming key challenges
In my experience there are a few consistent themes that emerge in data management and data governance work. Despite diversity of industry, culture and size, our clients face four common challenges in efforts to establish effective data management.
To paraphrase the DAMA Guide to the Data Management Body of Knowledge (DMBOK), data management means understanding enterprise data needs; collecting, storing, and protecting data; continually improving data quality; maintaining data security; and maximizing effective use and value of data assets.
Challenge #1: Get started
Oct 11 2011
The business value of data quality
Imagine if you owned a restaurant, and you found out that about 10% of customer checks didn’t match up with orders placed through the kitchen. You’d quickly ask tough questions: Is someone stealing money? Are customers being cheated? What’s causing the errors? After a quick assessment you would take quick action to correct the problem and make sure it never happens again.
Strangely, that kind of awareness of data quality doesn’t seem to scale up to large organizations. When data management teams contact CapTech for help, they routinely recount challenges in funding data quality work. They ask for simple, direct examples showing tangible business benefit from improving data quality.
Here are three of our favorites:
Sep 15 2011
Quantity vs Quality: Reports that are gluttons for data
Interestingly enough, this thought comes to me as I'm devouring my dinner at a local buffet-style restaurant establishment. Maybe it's due to the influence of the Food Network, but it seems as if I have recently trended towards explaining technical concepts with food analogies.
Quality data fulfills the business's needs, whereas a mass quantity of data may only satisfy the end user's wants. Of course there are trade-offs!
Sure, while standing in the buffet line, if I'm asked what I want, I'll ask to have food items piled on to my plate overflowing onto my tray. However, when I sit to consume the food (read: querying data), it takes longer to load it all off of my plate and into my mouth (read: generating report), to the extent that I'm becoming groggy and sluggish (read: lag in rendering output).
Apr 25 2011
Data Quality – Everyone is a Stakeholder
I have been wondering when we as a society will realize our stake in the effective implementation of data quality assurance. We “the people” systematically learn to accept the consequences and fallacies of data and information mismanagement.
We have learned to live within this system of errors without thinking much about how we can complain about data errors, let alone demand that organizations take responsibility for them. The following scenario (which is still unresolved) puts our helplessness and/or stigma into perspective.
Four years ago, someone I know got calls from a collection agency two days in a row. When he answered, they said he had missed paying a hospital bill and that his credit would be affected if he did not pay immediately. He paid the amount they said he owed and was told it would not hurt his credit scores. He was given very little information at that time.
Mar 25 2011
Data quality and data governance lessons from national health care
Who would want to be a national health care administrator? Who would want the responsibility for managing health care and formulating health policy for tens or hundreds of millions of people? It seems obvious that such decisions would rely on quality data. A recent interview impressed upon me how much data managers can learn from a field where data recording millions of separate life and death decisions aggregates to support decisions on the future allocation of health care resources.
Heather Richards of the Canadian Institute for Health Information (CIHI) was recently interviewed by the Australian magazine Image and Data Manager on CIHI efforts to provide neutral, objective and unbiased information to those making health care allocation policy decisions. Ms. Richards also happens to be Director of Publicity for the International Association for Information and Data Quality (IAIDQ).
In a detailed, concise, and refreshingly buzzword-free conversation, Ms. Richards described CIHI’s approach to improving data quality. To me, that approach boils down to these three themes:
Mar 15 2011
Everyone’s Dirty Little Secret
Believe it or not, almost every company has some bad data somewhere in its databases. They probably have a report that has been ‘vetted’ as being absolutely correct, yet presents an incorrect picture of the truth. As a former developer for a major transportation company, and trainer/consultant for an international software company, I’ve seen some bad stuff out there. And these weren’t ‘Mom & Pop’ shops, either. Many were Fortune 500 companies (5 in the Fortune 25), federal departments and agencies, and the military. The widespread existence of ‘dirty data’ wasn’t too obvious to me at first. But, as time went by, I realized that everyone had bad data. Everyone.
True Stories
When trying to build an OLAP cube from a bank’s data on car loans, I learned first-hand that there was more than one way to spell ‘Chevrolet’. The database held values like ‘Chevrolt’, ‘Cheverolay’, ‘chvy’, ‘Chevolet’, and more.
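As a rough illustration of how variants like these might be folded back into one value, here is a minimal Python sketch using the standard library's difflib. The canonical list, the alias table, and the 0.7 similarity cutoff are assumptions made up for this example; they are not from the bank's data, and any real cleanup would need its own reference list and a reviewed threshold.

```python
import difflib

# Hypothetical reference data for this sketch only.
CANONICAL_MAKES = ["Chevrolet", "Ford", "Toyota"]
# Abbreviations are too short for fuzzy matching, so map them explicitly.
ALIASES = {"chvy": "Chevrolet", "chevy": "Chevrolet"}


def normalize_make(raw, cutoff=0.7):
    """Return the canonical make for a raw value, or None if nothing is close."""
    key = raw.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    lookup = {make.lower(): make for make in CANONICAL_MAKES}
    # get_close_matches ranks candidates by SequenceMatcher similarity.
    matches = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


if __name__ == "__main__":
    for value in ["Chevrolt", "Cheverolay", "chvy", "Chevolet", "Saab"]:
        print(value, "->", normalize_make(value))
```

Values that return None are the ones worth routing to a person for review rather than guessing at.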
Mar 03 2011
Data Profiling and Data Remediation Priority for Migrating Legacy Systems Part I
Before embarking on a large-scale data migration from a legacy system into a new system, it is important to establish a game plan. Part of the game plan is to profile the data that will be migrated. Unfortunately, too many times I see data analysts dive into the source database and begin writing extensive SQL queries that produce various permutations of data profiling reports before determining whether the data is even relevant to the new system. I believe that by having a game plan for data profiling, and subsequently establishing a remediation plan, the organization will save a lot of time and money.
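One way to picture "relevance first, profiling second" is a small Python sketch that profiles only the columns already mapped to the new system. The column names, the sample rows, and the choice of metrics (row count, nulls, distinct values) are assumptions for illustration; the post does not prescribe a specific tool or metric set.

```python
def profile_columns(rows, relevant_columns):
    """Profile only the columns that the new system will actually receive."""
    profile = {}
    for column in relevant_columns:
        values = [row.get(column) for row in rows]
        # Treat None and empty strings as missing for this sketch.
        populated = [v for v in values if v not in (None, "")]
        profile[column] = {
            "rows": len(values),
            "nulls": len(values) - len(populated),
            "distinct": len(set(populated)),
        }
    return profile


if __name__ == "__main__":
    # Hypothetical legacy rows; "fax" is not mapped, so it is never profiled.
    legacy_rows = [
        {"customer_id": 1, "state": "VA", "fax": None},
        {"customer_id": 2, "state": "", "fax": None},
        {"customer_id": 3, "state": "VA", "fax": "555-0100"},
    ]
    print(profile_columns(legacy_rows, ["customer_id", "state"]))
```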
Feb 21 2011
DM/BI Tech Tip: Nail Down The Data Details in Your Requirements
One of the things I’ve learned the hard way is to include a checklist of data-specific criteria in your Data Requirements document.
We have all heard the question come in from a developer asking about formatting of numeric values. “How many decimal places?” or “Does this value have a leading dollar sign?” The Data Requirements section of most Requirements Documents lists the name, description, data type and other characteristics of specific data elements. It defines whether each data element is a character (and its length) or numeric, and whether NULLs are allowed.
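To make the idea concrete, here is a hedged Python sketch of what one entry in such a checklist could look like if it were captured as a structure rather than prose. The field names (decimal_places, currency_symbol, and so on) and the format_value helper are hypothetical, invented for this example; a real requirements document would define its own attributes.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


@dataclass(frozen=True)
class DataElementSpec:
    """One data element as it might appear in a requirements checklist."""
    name: str
    description: str
    data_type: str                      # "character" or "numeric"
    max_length: Optional[int] = None    # character elements only
    nullable: bool = True
    decimal_places: Optional[int] = None
    currency_symbol: str = ""           # e.g. "$" for a leading dollar sign


def format_value(spec, value):
    """Render a value the way the spec says it should be displayed."""
    if value is None:
        if not spec.nullable:
            raise ValueError(f"{spec.name} does not allow NULL")
        return ""
    if spec.data_type == "numeric":
        # Build the rounding quantum, e.g. 2 decimal places -> Decimal('0.01').
        quantum = Decimal(1).scaleb(-(spec.decimal_places or 0))
        amount = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
        return f"{spec.currency_symbol}{amount}"
    text = str(value)
    if spec.max_length is not None and len(text) > spec.max_length:
        raise ValueError(f"{spec.name} exceeds {spec.max_length} characters")
    return text


if __name__ == "__main__":
    invoice_total = DataElementSpec(
        name="invoice_total",
        description="Total billed amount",
        data_type="numeric",
        nullable=False,
        decimal_places=2,
        currency_symbol="$",
    )
    print(format_value(invoice_total, 1234.5))
```

With the decimal places and the dollar sign written into the spec, the developer's questions are answered before they are asked.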
Jan 27 2011
Slowly Changing Dimensions – Special Attention Needed
Margaret, an average salesperson, moved in June from Washington, DC to Richmond, VA, a market one fifth the size. When the annual evaluations of sales performance were done in December, she was listed as the top performer in the Richmond market, and the company promoted her to Sales Director. The next two highest-ranked Richmond salespeople had been the consistent leaders for the last several years and had outperformed Margaret since she arrived in Richmond. Her very high sales numbers from the first six months of the year skewed her average, placing her above the rest of the Richmond area. In this example, if the decision makers had had correct information handy, and used it appropriately, would they have promoted Margaret over her new Richmond peers?
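One common way to keep this kind of history straight is to store each market assignment with effective dates (often called a Type 2 slowly changing dimension) and attribute every sale to the market that was in effect on the sale date. The Python sketch below assumes that approach; the dates and dollar amounts are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MarketAssignment:
    """One row of assignment history with an effective date range."""
    salesperson: str
    market: str
    valid_from: date
    valid_to: Optional[date]  # None marks the current assignment


def market_on(history, salesperson, sale_date):
    """Find the market a salesperson was assigned to on a given date."""
    for row in history:
        if (row.salesperson == salesperson
                and row.valid_from <= sale_date
                and (row.valid_to is None or sale_date < row.valid_to)):
            return row.market
    return None


def sales_by_market(history, sales):
    """Total sales per (salesperson, market), using the market at sale time."""
    totals = {}
    for salesperson, sale_date, amount in sales:
        key = (salesperson, market_on(history, salesperson, sale_date))
        totals[key] = totals.get(key, 0) + amount
    return totals


if __name__ == "__main__":
    # Made-up history and sales figures for the Margaret example.
    history = [
        MarketAssignment("Margaret", "Washington, DC", date(2010, 1, 1), date(2010, 6, 15)),
        MarketAssignment("Margaret", "Richmond, VA", date(2010, 6, 15), None),
    ]
    sales = [
        ("Margaret", date(2010, 3, 10), 50000),
        ("Margaret", date(2010, 9, 5), 8000),
    ]
    print(sales_by_market(history, sales))
```

Reported this way, Margaret's Washington sales stay with Washington, and her Richmond ranking reflects only what she sold after the move.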
Here is another example.