Data Management

Jan 22 2012

A QlikView QuickStart: first steps for learning QlikView desktop

QlikTech’s QlikView reporting and analysis tool is among a new class of Business Intelligence (BI) software tools. As Ben Harden reported in a recent blog post, BI vendors like SAP, Microsoft, and IBM have traditionally sold “to the IT enterprise, but companies like QlikTech and Tableau are targeting the business and bypassing IT. Their tools are quicker to stand up, more intuitive and don’t need the configuration, support, and hardware that the bigger players require.”

A Quick Overview

At first look QlikView is fairly accessible to those experienced with BI tools. A “.qvw” QlikView file contains three classes of user-facing components: a script-based data integration language that runs when the user requests a “reload”, a data modeling component that looks deceptively like a relational data modeling tool, and a familiar array of data visualizations: graphics, charts, lists, etc.

Read More

Dec 14 2011

Project reviews, deliverables’ reviews and constructive criticisms

Project managers often face tough times when they need to take actions that may make others uncomfortable. These mainly include conducting project reviews to balance the triple constraints, facilitating reviews of deliverables for quality or compliance, and providing constructive criticism to team members for corrective or preventive actions.

Everyone likes to hear or deliver good news; however, good project managers are expected to be experts at relaying the news when things don’t go as planned, typically coupled with a proactive solution.

The following are the most successful approaches good project managers often practice.

Read More

Nov 21 2011

Data Quality 3: No More Data Corruption Excuses!

I’ve recently posted a couple of articles at this site on data quality; this is the final one in a series of three.  Previous posts presented these ideas:

  • Yes, there is a business case for improving data quality, and we’ve got business value examples. If you look for real money where you anecdotally know there are data quality problems, you’ll likely find it in high costs of data correction and rework, and savings related to business process improvements that reliable data enables.
  • There are distinct things an organization can do to reap benefits of improved data management and data quality.  (1) Get started in the first place, (2) find the tangible benefits, (3) cross the departmental silos that exist in every large organization, and (4) promote sound data management practices.
  • Impacts of poor data quality can seem abstract in a large organization. They aren’t for a small business owner.  Say you have a carpet cleaning business.  What if you knew 10% of your customer bills were wrong, but you weren’t sure by how much or in which direction?  First you’d panic. Then you’d rush to fix the problem.

Read More

Nov 11 2011

Data management success means overcoming key challenges

In my experience there are a few consistent themes that emerge in data management and data governance work. Despite diversity of industry, culture and size, our clients face four common challenges in efforts to establish effective data management.

To paraphrase the DAMA Guide to the Data Management Body of Knowledge (DMBOK), data management means understanding enterprise data needs; collecting, storing, and protecting data; continually improving data quality; maintaining data security; and maximizing effective use and value of data assets.

Challenge #1: Get started

Read More

Oct 03 2011

2011 Teradata Partners Conference Day 1

Day 1 of the 2011 Teradata Partners User Group Conference and Expo in San Diego was very educational, and I had the opportunity to listen to some very good sessions. During the morning session, I attended a workshop on “Teradata Live”, an end-to-end demo by Lance Miller from Teradata. The demo was done on an 8 GB memory, 2 TB, TD 13.10, 4-AMP configuration. Lance showed a logical data model for retail customers and the data distribution for a 150-million-row sales table. He pulled up the Teradata Administrator view and showed how the data was evenly distributed across the 4 AMPs. Lance also demonstrated Teradata Miner 5.0, which is used for data profiling. During this session I asked whether they have any data to support benchmarks against competitors, either IBM's Netezza or Oracle Exadata.

Read More

Jun 21 2011

Health care data security: how bad is it?

It is really bad, according to a recent survey by the Ponemon Institute (available here with registration). The white paper, entitled Health Data at Risk in Development: A Call for Data Masking, presents the results of a survey of 492 health care IT professionals on their companies’ practices regarding use of live personal health care data in application testing.

It makes a scary read.  Here are the lowlights:

Read More

May 31 2011

Thoughts after agile training: strengthening values, reducing the cost of honesty, and growing apps

I recently completed ScrumMaster training ably presented by Lyssa Adkins. Throughout the two-day class we appreciated Lyssa’s Zen-like, enabling style. If her name is familiar, it’s because Ms. Adkins is the author of the book Coaching Agile Teams, one of the leading texts on the subject.

I’ve participated on agile projects, but so far only in a piggish/chickenish role, once in a three-week stint as a consulting architect and twice as the project manager serving as interface to the non-agile organization. To me Ms. Adkins rocks at making students very introspective and critical of their past project experiences.  These lessons stand out:

Read More

Mar 25 2011

Data quality and data governance lessons from national health care

Who would want to be a national health care administrator?  Who would want the responsibility for managing health care and formulating health policy for tens or hundreds of millions of people?  It seems obvious that such decisions would rely on quality data.  A recent interview impressed upon me how much data managers can learn from a field where data recording millions of separate life and death decisions aggregates to support decisions on the future allocation of health care resources.

Heather Richards of the Canadian Institute for Health Information (CIHI) was recently interviewed by the Australian magazine Image and Data Manager on CIHI efforts to provide neutral, objective and unbiased information to those making health care allocation policy decisions. Ms. Richards also happens to be Director of Publicity for the International Association for Information and Data Quality (IAIDQ).

In a detailed, concise, and refreshingly buzzword-free conversation, Ms. Richards described CIHI’s approach to improving data quality.  To me, that approach boils down to these three themes:

Read More

Jan 19 2011

Informatica Cloud Express - the Data Integration Software as a Service (SaaS) on Cloud

At first glance, hosting applications in the cloud seems to require less setup and operational cost, which accounts for its great appeal. That said, the world is still trying to make sense of how cloud computing can really fit into the enterprise computing puzzle amid the many complex challenges we face, like the ever-increasing need for security, continuing dependency on age-old legacy systems, growing uncertainty in the current troubling global economic situation, and more. Nevertheless, one thing that is quite clear to me is that cloud computing during this economic uncertainty does offer great new hope for organizations, if only for its “pay as you go” approach.

Read More

Jan 12 2011

DMBI Tech Tip: Include Audit Columns on all Tables

When I design a database, and local standards permit, I include both a surrogate primary key and audit columns on every table. 

A surrogate primary key is a system-generated integer that increments by one with each new row inserted.  Most DBMSs make it easy to add surrogate keys.  Oracle uses a construct called a Sequence; SQL Server calls its variant an Identity.

The audit columns I like to include are these, shown with their SQL Server datatypes:  CreationDatetime (datetime), UpdateDatetime (datetime), CreationUserId (varchar (30)), and UpdateUserid (varchar (30)).  Most DBMSs offer the ability to update columns like these with either default values or triggers.
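As a rough illustration of the pattern, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle or SQL Server. The Customer table, its CustomerName column, and the helper function names are hypothetical, SQLite's TEXT type stands in for datetime and varchar (30), and the user ID is passed in by the application because SQLite has no notion of a logged-in database user. Default values fill the creation columns, and a trigger refreshes UpdateDatetime.

```python
import sqlite3

# Hypothetical table for illustration only; the audit column names follow the post.
SCHEMA = """
CREATE TABLE Customer (
    CustomerId       INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    CustomerName     TEXT NOT NULL,
    CreationDatetime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UpdateDatetime   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CreationUserId   TEXT NOT NULL,
    UpdateUserid     TEXT NOT NULL
);

-- Refresh UpdateDatetime whenever the business data or the updating user changes.
-- The trigger does not list UpdateDatetime, so its own UPDATE does not re-fire it.
CREATE TRIGGER Customer_SetUpdateDatetime
AFTER UPDATE OF CustomerName, UpdateUserid ON Customer
FOR EACH ROW
BEGIN
    UPDATE Customer
       SET UpdateDatetime = CURRENT_TIMESTAMP
     WHERE CustomerId = NEW.CustomerId;
END;
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the example table and its audit trigger."""
    conn.executescript(SCHEMA)


def insert_customer(conn: sqlite3.Connection, name: str, user_id: str) -> int:
    """Insert a row; the datetime audit columns are filled by their defaults."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO Customer (CustomerName, CreationUserId, UpdateUserid) "
            "VALUES (?, ?, ?)",
            (name, user_id, user_id),
        )
    return cur.lastrowid


def rename_customer(conn: sqlite3.Connection, customer_id: int,
                    new_name: str, user_id: str) -> None:
    """Update a row; the trigger refreshes UpdateDatetime."""
    with conn:
        conn.execute(
            "UPDATE Customer SET CustomerName = ?, UpdateUserid = ? "
            "WHERE CustomerId = ?",
            (new_name, user_id, customer_id),
        )


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    new_id = insert_customer(connection, "Acme Carpet Cleaning", "alice")
    rename_customer(connection, new_id, "Acme Carpet Care", "bob")
    print(connection.execute("SELECT * FROM Customer").fetchone())
```

In a DBMS that knows the connected user, the user ID columns could likely be filled by defaults or triggers as well, so the application would not need to pass them in.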

Read More

 

Disclaimer

The words and opinions expressed here are those of each article's respective author, and do not necessarily represent the views of CapTech Ventures.