Mar 18 2011
Just one of the reasons I like working with CapTech is our technology-agnostic approach. Our Data Management and Business Intelligence Practice Area includes many consultants certified in their favorite technologies. At our internal meetings, we hear some variation of "my BI technology is better than your BI technology." Whether the topic is database platforms, data integration tools, or data presentation capabilities, well-informed suggestions are freely shared only to be countered by alternative viewpoints. That sort of friendly competition keeps us on our toes.
Over the years, I personally have had the good fortune to work with many database, ETL, and presentation tools. Each of those tools impressed me in some fashion and disappointed me in some other way. No tool was perfect. Most importantly, each tool was at the mercy of the design attempting to leverage the tool.
Lately I have focused on Microsoft's Business
Mar 16 2011
The recent tragedy in Japan got me thinking about the role of data in a disaster. How do we use data to help prepare us for these events? Are we using all the data available to us to understand a disaster's impact, and how are we using data after a disaster has occurred? Do we have the right data and the right tools available to save more lives, or are we doing all we can?
Mar 16 2011
Let’s play a word association game. When I say “Architect,” you say, what? “Buildings… Cities… Skyscrapers…?” How about “Data”?
More and more professionals who work with data are starting to see themselves as architects. A quick look in the dictionary shows that the title ‘architect’ is aligned with the design of physical structures. However, a closer look into what it takes to ‘design’ highlights the similar processes involved whether the medium is bricks and mortar or data.
Mar 15 2011
Believe it or not, almost every company has some bad data somewhere in its databases. They probably have a report that has been ‘vetted’ as being absolutely correct, yet presents an incorrect picture of the truth. As a former developer for a major transportation company, and trainer/consultant for an international software company, I’ve seen some bad stuff out there. And these weren’t ‘Mom & Pop’ shops, either. Many were Fortune 500 companies (5 in the Fortune 25), federal departments and agencies, and the military. The widespread existence of ‘dirty data’ wasn’t too obvious to me at first. But, as time went by, I realized that everyone had bad data. Everyone.
When trying to build an OLAP cube from a bank’s data on car loans, I learned first-hand that there was more than one way to spell ‘Chevrolet’. The database held values like ‘Chevrolt’, ‘Cheverolay’, ‘chvy’, ‘Chevolet’, and more.
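Misspellings like these can often be caught with simple fuzzy matching against a reference list of valid values. The sketch below is illustrative only, not how the bank's cube was actually cleaned; the canonical make list and the similarity cutoff are assumptions, and Python's standard-library `difflib` stands in for whatever cleansing tool a real project would use.

```python
from difflib import get_close_matches

# Hypothetical reference list of valid makes; real master data would be larger.
CANONICAL_MAKES = ["Chevrolet", "Ford", "Toyota"]

def standardize_make(raw, cutoff=0.6):
    """Map a raw, possibly misspelled make to a canonical value, or None.

    A variant close enough to a canonical spelling (by difflib's similarity
    ratio) is mapped to it; anything below the cutoff is flagged as None
    for manual review.
    """
    matches = get_close_matches(raw.title(), CANONICAL_MAKES, n=1, cutoff=cutoff)
    return matches[0] if matches else None

dirty = ["Chevrolt", "Cheverolay", "chvy", "Chevolet"]
cleaned = {value: standardize_make(value) for value in dirty}
# 'Chevrolt', 'Cheverolay', and 'Chevolet' map to 'Chevrolet';
# 'chvy' falls below the cutoff and is left for a human to resolve.
```

Note that heavily abbreviated values like 'chvy' typically need a hand-built synonym table rather than string similarity, which is one reason data cleansing rarely ends up fully automated.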
Mar 07 2011
In Part I of this blog, I stated that before embarking on a large-scale data migration from a legacy system into a new system, it is important to establish a game plan. Part II covers the next step: once you have the game plan in place, you need to set priorities for the Data Remediation Plan.
To build a Data Remediation Plan, you need to establish metrics by which to measure and prioritize the effort. This will be very helpful when you have conflicting resource constraints and need to know where to begin.
In Part I, I demonstrated the importance of setting priorities for data by Mandatory fields, Data Retention Policy, Data Load (null records), Data Loss (truncation and duplication of records), and Data Quality. Once all of these parameters have been determined, you are left with the records that need to be remediated. I will use a fictitious metrics table to provide examples of how to set the priority for tackling each one.