Database Design
Jan 22 2012
A QlikView QuickStart: first steps for learning QlikView desktop
QlikTech’s QlikView reporting and analysis tool is among a new class of Business Intelligence (BI) software tools. As Ben Harden reported in a recent blog post, BI vendors like SAP, Microsoft, and IBM have traditionally sold “to the IT enterprise, but companies like QlikTech and Tableau are targeting the business and bypassing IT. Their tools are quicker to stand up, more intuitive and don’t need the configuration, support, and hardware that the bigger players require.”
A Quick Overview
At first look QlikView is fairly accessible to those experienced with BI tools. A “.qvw” QlikView file contains three classes of user-facing components: a script-based data integration language that runs when the user requests a “reload”, a data modeling component that looks deceptively like a relational data modeling tool, and a familiar array of data visualizations: graphics, charts, lists, etc.
Aug 30 2011
Abstracting and recombining all the way to the bank
In the past I've never understood what people really mean they say "think outside the box" but Jim Harris, in a recent OCDQ blog post, helped me figure it out.
Mr. Harris ends with this provocative line: "the bottom line is Google and Facebook have socialized data in order to capitalize data as a true corporate asset." The post starts with a cold war analogy and proceeds to describe how Facebook and Google have made big money as "internet advertising agencies:" offering free services with which users (like us) serve up personal data in return for use of the service, then selling advertising space based on our data (hopefully anonymized).
Jul 25 2011
Get an early start for on-time data modeling
I’m a data modeler, so I enjoyed Jonathon Geiger’s recent article entitled “Why Does Data Modeling Take So Long”. But why does he say it like it’s a bad thing?
Mr. Geiger’s bottom line is exactly right: “Most of the time spent developing data models is consumed developing or clarifying the requirements and business rules and ensuring that the data structure can be populated by the existing data sources." On the projects he describes, no one took time before modeling to determine available data sources and identify business entities of interest, relationships among them, and attributes that describe them before database design started, so the data modeler had to do it.
Jan 27 2011
Cost of Convenience
A colleague introduced the term “convenience view” to me and that term resonated with me ever since. A convenience view is one of those database objects intended to make life easier for people to access data without actually understanding the nuances and relationships of those data. Convenience views frequently join multiple tables together so data users will not need to code or optimize those joins. Convenience views may also include business logic which transforms data so end users will not need to code or argue those transformations. The concept seems noble enough. Who opposes simple data access, optimized joins and centralized business rules? Just like that store that sells everything right off the interstate, convenience comes at a cost.
One obvious cost is all of those optimized joins. Sure each individual join may be optimized, but the database engine has to collectively consider all available join options before an execution
Jan 12 2011
DMBI Tech Tip: Include Audit Columns on all Tables
When I design a database, and local standards permit, I include both a surrogate primary key and audit columns on every table.
A surrogate primary key is a system-generated integer that increments by one with each new row inserted. Most DBMSs make it easy to add surrogate keys. Oracle uses a construct called a Sequence, SQL Server calls its variant an Identity.
The audit columns I like to include are these, shown with their SQL Server datatypes: CreationDatetime (datetime), UpdateDatetime (datetime), CreationUserId (varchar (30)), and UpdateUserid (varchar (30)). Most DBMS’s offer the ability to update columns like these with either default values or triggers.
Aug 19 2009
Intial thoughts on cloud databases
Recently there’s been a lot of discussion about alternatives to the standard relational DBMS (note this widely read article). Some of the feedback comes from OO developers. There’s a fundamental dissonance between OO and relational approaches, requiring an intermediate object/relational mapping (ORM) layer for OO systems to operate effectively with relational DBMSs. Others decry the overhead imposed by the relational DBMS, seeking a way to store and use massive datasets for applications needing fast response. In response a number of vendors are now providing SaaS-like database services in the “cloud”: open source, lightly structured data services provided via the internet, capable of storing and delivering large data stores for high availability, fast response applications.
Vendors have only recently arrived on the scene with these offerings, but according to Kevin Hazzard, CapTech architect, this was inevitable: “In my own writings and talks, I sometimes call databases an unfortunate consequence of history. This sentiment isn’t borne of any sort of animosity toward databases. On the contrary, I can’t imagine doing what I do without the services that databases provide. However, I mustered just enough intellectual honesty to admit that if I were able to give a handful of 4Gb DIMMs to the creators of the ENIAC computer at University of Pittsburgh in 1948 and if I were smart enough to show them how to use that memory effectively, the database as we know it today would simply never have evolved. Instead, databases would have become an integral part of the software development model. The database would be the file system with some automatic hashing and locking capabilities.
“Well, that sounds an awful lot like Amazon’s SimpleDB, Google’s BigTable (and Hypertable), Microsoft SQL Data Services (in the cloud), and other emerging DB platforms. The ORM trend is also pushing developers toward these models by providing language extensions for query comprehension, query optimization and parallelism in the application domain. Is there a chance that pre-relational or post-relational database engines will eclipse relational databases in the marketplace in the near-term? No way. “
Still, Data Management professionals should be on the lookout for bringing competitive edge to their BI environment with these new approaches – but carefully. According to CapTech DM/BI specialist Ashwini Kumar, these new database approaches are “a good fit for simple data structures, structured data, and applications with queries using procedural codes. Also, they may be good fit for customized applications. However, Business Intelligence/Enterprise Information Management type applications are more dynamic in nature, especially in exploring or mining data.”
So where to from here? Over the coming months we’ll watch as these new database technologies emerge into wider commercial application and lessons-learned emerge.