Unpivot in SQL Server Integration Services (SSIS)

Jan 26 2010

 

When importing data from a flat file into a database it is not uncommon to be presented with a source that has repeating data sets.  This data isn't normalized and usually needs to be split into a single row for each repeating group.  Moving multiple matching sets of columns into rows can easily be done using the Unpivot transformation in SQL Server Integration Services (SSIS).  For example, the following data is presented containing 2 columns that apply to every field (Date and Type), followed by 5 groups of 3 columns (Salesperson, Sales and Units).

  

  

Below is a truncated version of the source file.  The entire file can be found here.


 

 
     

The data is to be placed into a normalized table containing Date, Type, Salesperson, Sales, and Units.  Additionally, to identify from which source column set each new target rows is derived, we will include a sequence number in the target.  For this example I’ll be using a comma delimited flat file as the data source.  On the Error Output tab of the Source Editor I’ve also set Truncation to Ignore failure.

  

  

 

 

  

Drag the Unpivot transformation from the tool box into the work area and double click it to open the transformation editor.  Select each of the columns that will be pivoted as an input column, leaving the columns Type and Date as Pass Through Columns.

 

 

 

  

Fill in each destination column field with the column that the data will be pivoted into.  The key to correctly pivoting sets of columns is to properly populate the Pivot Key Value field.  Each set must be assigned its own unique value.  It is also advisable to assign that field a name in the Pivot key value column name field.

  

 

  

You can see that after the Unpivot transformation we no longer have 17 columns, but have reduced to 6; the 2 standard columns, the sequence number, and the 3 unpivoted columns.  Map them to the appropriate Destination Column.

  

 

 


  

The resulting data is now in our normalized format.

  

 

Type

SEQ

Salesperson

Units

Sales

1/1/2009

Software

1

Clark Kent

10

1

1/1/2009

Software

2

Selina Kyle

1000

100

1/1/2009

Software

3

Tony Stark

100000

10000

1/1/2009

Software

4

Peter Parker

1000000

100000

1/1/2009

Software

5

Barry Allen

100000000

10000000

1/1/2009

Processor

1

Clark Kent

12

2

1/1/2009

Processor

2

Selina Kyle

1002

102

1/1/2009

Processor

3

Tony Stark

100002

10002

1/1/2009

Processor

4

Peter Parker

1000002

100002

1/1/2009

Software

5

Oliver Queen

100000006

10000006

1/1/2009

Processor

1

Bruce Wayne

18

5

1/1/2009

Processor

2

Diana Prince

1008

108

1/1/2009

Processor

3

Simon Templar

100008

10008

1/1/2009

Processor

4

Arthur Curry

1000008

100008

1/1/2009

Processor

5

Oliver Queen

100000008

10000008

1/1/2009

Peripheral

1

Bruce Wayne

20

6

1/1/2009

Peripheral

2

Diana Prince

1010

110

1/1/2009

Peripheral

3

Simon Templar

100010

10010

1/1/2009

Peripheral

4

Arthur Curry

1000010

100010

1/1/2009

Peripheral

5

Oliver Queen

100000010

10000010

1/1/2009

Software

1

Barbara Gordon

22

7

1/1/2009

Peripheral

2

Tony Stark

1016

116

1/1/2009

Peripheral

3

Hal Jordan

100016

10016

1/1/2009

Peripheral

4

Bruce Banner

1000016

100016

1/1/2009

Peripheral

5

John Jones

100000016

10000016

2/1/2009

Software

1

Clark Kent

28

10

2/1/2009

Software

2

Selina Kyle

1018

118

2/1/2009

Software

3

Tony Stark

100018

10018

2/1/2009

Software

4

Peter Parker

1000018

100018

2/1/2009

Software

5

Barry Allen

100000018

10000018

  

In a future post, I’ll cover some of the reasons for using an Unpivot transformation instead of a Script Task or a Multicast into a Union All which could yield the same results.

 

 

AttachmentSize
sample.txt1.54 KB

About the Author

Disclaimer

The words and opinions expressed here are those of each article's respective author, and do not necessarily represent the views of CapTech Ventures.