Teradata

Dec 07 2011

Addressing Slowly Changing Dimensions with Teradata v13

In an earlier blog post, Slowly Changing Dimensions – Special Attention Needed, I touched on the need to pay special attention to slowly changing dimensions. Organizations typically choose among three variants when implementing solutions for slowly changing dimensions.


Type 1: in these implementations, only the latest data is retained. This type is appropriate when there is no need for historical analysis. For example, an online transactional system that needs to display the latest list of values in its pull-down pick lists may use this type.


Type 2: in these implementations, the history of changes, together with the validity period of each change, is persisted.
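
To make the difference between the first two types concrete, here is a minimal sketch in plain Python rather than Teradata syntax. The names `DimRow`, `apply_type1` and `apply_type2` are hypothetical and chosen only for illustration; the sketch assumes a single tracked attribute per natural key and uses an open-ended `valid_to` to mark the current version.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DimRow:
    key: str                          # natural key, e.g. a customer id
    value: str                        # the tracked attribute
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version


def apply_type1(dim: Dict[str, str], key: str, value: str) -> None:
    """Type 1: overwrite in place, so only the latest value survives."""
    dim[key] = value


def apply_type2(dim: List[DimRow], key: str, value: str, as_of: date) -> None:
    """Type 2: close the current version and append a new one."""
    for row in dim:
        if row.key == key and row.valid_to is None:
            if row.value == value:
                return                # nothing changed, keep the open version
            row.valid_to = as_of      # end the validity period of the old version
    dim.append(DimRow(key, value, valid_from=as_of))


# Usage: the same two changes applied to each kind of dimension.
type1_dim: Dict[str, str] = {}
apply_type1(type1_dim, "C1", "Richmond")
apply_type1(type1_dim, "C1", "Atlanta")      # "Richmond" is lost

type2_dim: List[DimRow] = []
apply_type2(type2_dim, "C1", "Richmond", date(2011, 1, 1))
apply_type2(type2_dim, "C1", "Atlanta", date(2011, 6, 1))  # both versions kept
```

After these calls the Type 1 dimension holds one value per key, while the Type 2 dimension holds one row per version, each with its own validity period.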

Oct 04 2011

2011 Teradata Partners Conference Day 2

The second day of the 2011 Teradata Partners Conference started with Teradata business leaders talking about how the company is growing and investing. Mike Koehler, Teradata's CEO, described the new line of products, in particular ClaraView, Aprimo and Aster Data. Mike showed some quick statistics on the explosion in digital data from mobile, social, and sensor sources. The trends that really drove his point home were the growth of Teradata’s Petabyte Club and the 2011 IDC Digital Universe Study (sponsored by EMC), which showed that 1.8 zettabytes (1.8 trillion gigabytes) of information will be created and replicated in 2011.

This was closely followed by Steven Levitt's session. The author of the popular books "Freakonomics" and "Superfreakonomics" described his moments of truth in learning calculus and other interesting facts.

Oct 03 2011

2011 Teradata Partners Conference Day 1

Day 1 of the 2011 Teradata Partners User Group Conference and Expo in San Diego was very educational, and I had the opportunity to listen to some very good sessions. During the morning session, I attended a workshop on “Teradata Live”, an end-to-end demo by Lance Miller from Teradata. The demo ran on a configuration with 8 GB of memory, 2 TB of storage, TD 13.10, and 4 AMPs. Lance showed a logical data model for retail customers and the data distribution for a 150-million-row sales table. He pulled up the Teradata Administrator view and showed how the data was evenly distributed across the 4 AMPs. Lance also demonstrated Teradata Miner 5.0, which is used for data profiling. During this session I asked whether they have any data to support benchmarks against competitors such as IBM's Netezza or Oracle Exadata.

Apr 07 2011

Teradata Indexing

One of the challenges design teams face when implementing Teradata design solutions is choosing indexes that improve performance. What happens if the primary index is not the same as the primary key, and how does one validate that the design decision is correct?

The primary key of a relational table uniquely identifies each record in that table. A logical data model should specify the primary key (PK). If this specification is absent or unclear, the data modeler must provide this information. Indexes are physical design concepts and relate to database performance. PKs are used for referential integrity and for logical correctness within the data model, and may not be NULL. Primary indexes (PIs), on the other hand, are a physical mechanism for storage: they are defined in the SQL CREATE TABLE statement, may be unique or non-unique, and can allow for changes in the data.
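
One way to reason about whether a candidate primary index is a sound choice is to look at how evenly its values would spread rows across AMPs. The sketch below is a rough simulation in Python, not Teradata's actual behaviour: it assumes an MD5-based hash as a stand-in for Teradata's row hash, a 4-AMP system, and hypothetical helper names (`amp_for`, `rows_per_amp`, `skew`).

```python
import hashlib
from collections import Counter
from typing import Iterable, List


def amp_for(value: object, n_amps: int = 4) -> int:
    """Map a PI value to an AMP number (MD5 is an assumed stand-in hash)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_amps


def rows_per_amp(pi_values: Iterable[object], n_amps: int = 4) -> List[int]:
    """Count how many rows would land on each AMP for the given PI values."""
    counts = Counter(amp_for(v, n_amps) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(n_amps)]


def skew(counts: List[int]) -> float:
    """Busiest AMP's row count relative to a perfectly even split (1.0 is ideal)."""
    total = sum(counts)
    return max(counts) * len(counts) / total if total else 0.0


# Usage: compare a unique candidate PI with a low-cardinality one.
order_ids = range(10_000)                  # unique values, like a primary key
statuses = ["OPEN", "CLOSED"] * 5_000      # only two distinct values

print(rows_per_amp(order_ids), skew(rows_per_amp(order_ids)))
print(rows_per_amp(statuses), skew(rows_per_amp(statuses)))
```

A unique column tends to spread rows almost evenly, while a column with only a few distinct values concentrates rows on a few AMPs. That trade-off between even distribution and access path is the kind of evidence a design team can use when the PI differs from the PK.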

Feb 06 2011

What to Make of the Latest Gartner Magic Quadrant for Business Intelligence Platforms?

Gartner recently released an updated version of the Magic Quadrant for Business Intelligence Platforms. One quick glance at the new Magic Quadrant and you’ll notice there are a whopping 19 entries this year, with 8 companies making the prestigious Leaders quadrant. Compared with the 4 leaders in the Data Integration Magic Quadrant and the 6 in the data warehouse space, it is clear there is a lot happening in the world of Business Intelligence.

Jan 27 2011

Cost of Convenience

A colleague introduced the term “convenience view” to me, and that term has resonated with me ever since. A convenience view is one of those database objects intended to make it easier for people to access data without actually understanding the nuances and relationships of those data. Convenience views frequently join multiple tables together so data users will not need to code or optimize those joins. Convenience views may also include business logic that transforms data so end users will not need to code or argue about those transformations. The concept seems noble enough. Who opposes simple data access, optimized joins and centralized business rules? But just as with that store right off the interstate that sells everything, convenience comes at a cost.

One obvious cost is all of those optimized joins. Sure, each individual join may be optimized, but the database engine has to collectively consider all available join options before an execution

Feb 15 2010

Using the SQL RANK function to solve an unusual sorting problem in obtaining data for performance tests

Problem:

Our team needed to create a data set listing retail stores belonging to different territories for performance tests that involved store selection. The data set was to be used by virtual users signing in as different territory owners. The challenge was that the territory and store numbers in the data set should be as random as possible. For example, if there are 20 territories and each territory has 10 stores, the data set should contain the 20 territories with their first stores, followed by the same 20 territories with their second stores, and so on.
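
The ordering described in the example can be sketched as follows. This is an illustration in Python of the ranking idea rather than the SQL from the article: it assumes each store is ranked within its own territory (the role a SQL RANK partitioned by territory would play) and that the result is then sorted by rank first and territory second. The function name `interleave_by_rank` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def interleave_by_rank(pairs: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Order (territory, store) pairs so that every territory's first store
    comes first, then every territory's second store, and so on."""
    by_territory: Dict[str, List[int]] = defaultdict(list)
    for territory, store in pairs:
        by_territory[territory].append(store)

    ranked = []
    for territory, stores in by_territory.items():
        # Rank stores within the territory; with distinct store numbers this
        # matches what RANK would assign when ordering by store number.
        for rank, store in enumerate(sorted(stores), start=1):
            ranked.append((rank, territory, store))

    ranked.sort()  # rank first, then territory, then store
    return [(territory, store) for _, territory, store in ranked]


# Usage: two territories with two stores each.
pairs = [("T1", 101), ("T1", 102), ("T2", 201), ("T2", 202)]
ordered = interleave_by_rank(pairs)
```

Sorting on the rank before the territory is what interleaves the territories, so consecutive rows in the data set belong to different territory owners.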

 

Disclaimer

The words and opinions expressed here are those of each article's respective author, and do not necessarily represent the views of CapTech Ventures.