Thursday 4 December 2014

Unpivot Transformation



        Unpivot is the opposite to Pivot transformation. Using Unpivot transformation we can interchange the columns into rows.

Example:
Source table:


CREATE TABLE UNPIVOT_SOURCE
(
YEARS INT,
SOAP NUMERIC(12,4),
BRUSH NUMERIC(12,4),
PASTE NUMERIC(12,4)
)

INSERT INTO UNPIVOT_SOURCE VALUES
(2010,3500.25,3201.45,2100.61)
,(2011,2842.15,3654.75,2236.24)
,(2012,2689.26,3214.23,2432.55)

 
Target table:
CREATE TABLE UNPIVOT_TARGET
(
YEARS INT,
PRODUCT NVARCHAR(30),
TOTAL_PRICE NUMERIC(12,4) 
)


SSIS Package:
  • Connect to the SQL Server and create the above tables.
  • Open SSIS Package, In 'Control Flow' drag and drop 'Data flow task' from Toolbox.
  • Go to Data flow and drag and drop 'OLE DB Source' and configure with source table.
  • From Toolbox drag and drop 'Unpivot Transformation', and connect 'OLE DB Source' green arrow.
  •  Edit the 'Unpiot transformation'. Select the columns check box, which we are going to convert as rows (Soap, Brush and Paste).
  • In the above screen we can see the 'Input Column', 'Destination Column' and 'Pivot key value'. In the 'Destination Column' section we need to give destination column name(TOTAL_PRICE). The source columns 'Soap, Brush and Paste are having total cost information, this information is related to 'TOTAL_PRICE' (destination column).
  • In the bottom of the above screen we can see the 'Pivot key value column name' text box. There we need to give the 'Name' for the new Unpivot column (PRODUCT).
O/P:

Tuesday 2 December 2014

Pivot Transformation



        Using Pivot transformation we can interchange the rows into columns.

Example:
Source table:
CREATE TABLE PIVOT_SRC
(
YEARS INT,
PRODUCT VARCHAR(30),
TOTAL_PRICE NUMERIC(12,4)
)

INSERT INTO PIVOT_SRC VALUES
(2010,'SOAP',3500.25)
,(2010,'PASTE',2100.61)
,(2010,'BRESH',3201.45)
,(2011,'SOAP',2842.15)
,(2011,'PASTE',2236.24)
,(2011,'BRESH',3654.75)
,(2012,'SOAP',2689.26)
,(2012,'PASTE',2432.55)
,(2012,'BRESH',3214.23)
Target table:


CREATE TABLE PIVOT_TARGET
(
YEARS INT,
SOAP NUMERIC(12,4),
BRESH NUMERIC(12,4),
PASTE NUMERIC(12,4)
)


SSIS Package:
  • Connect to the SQL Server and create the above tables.
  • Open SSIS Package, In 'Control Flow' drag and drop 'Data flow task' from Toolbox.
  • Go to Data flow and drag and drop 'OLE DB Source' and configure with source table.
  • From Toolbox drag and drop 'Pivot Transformation', and connect 'OLE DB Source' green arrow.
  • Edit the 'Pivot Transformation'.
  • Go to 'Input Columns', Select required 'Input Columns' check boxes.
  • Go to 'Input and Output Properties'
  • Expand the 'Pivot Default Input' and 'Pivot Default Output', Using '+' sign.
  • In the above screen shot we can see the input columns YEARS,PRODUCT AND TOTAL_PRICE. For these columns we need to set the 'PivotUsage' property value based on column usage.



Pivot
Usage
key
Description
Sample Source
columns
0
The column will passed through unaffected

1
The column values become the rows of the pivot
YEARS
2
The column values become the column names of the pivot
PRODUCT
3
The column values that are pivoted in the pivot
TOTAL_PRICE
 
  • Based on Source column usage, i have assigned 'PivotUsage' values. (Years-1,Product-2 and Total_Price-3)
  • Click on 'Output Columns' and add New columns using 'Add Column' Button. Rename the New columns.
  • After creating output columns, We need to assign 'PivotKeyValue and SourceClumn' Property values for each 'Pivot Default Output' columns. Then only mapping will create from input columns to the output columns.
  • 'YEARS' output column values should come from 'YEARS' input column. So We need to assign the input column 'LineageID' values to the output column "SourceClumn property", and output column name to the output column 'PivoteKeyValue'.
  • Same way we need to map remaining columns.


Output Column name
Mapping Input column
Years
PioveteKeyValue: Name of the Output Column (Years)
SourceColumn: LinageID of the 'YEARS' column
Soap
PioveteKeyValue: Name of the Output Column (Soap)
SourceColumn: LinageID of the 'TOTAL_PRICE' column
Paste
PioveteKeyValue: Name of the Output Column (Paste)
SourceColumn: LinageID of the 'TOTAL_PRICE' column
Brush
PioveteKeyValue: Name of the Output Column (Brush)
SourceColumn: LinageID of the 'TOTAL_PRICE' column


O/P:





Monday 1 December 2014

Script Task




Source file existence check using Script task:
         If the source file is not available then ETL execution will fail. For avoiding this failure, we can implement a task to check the source file existence, based on source file availability we can take further action.

Example:     Step 1:
  • Create the below three variables
    • Name: SourceDirectory     Data type: String     Value: D:\Source\ (Source directory)
    • Name: SourceFileName     Data type: String     Value: Customer.txt (Source file name)
    • Name: FileExistence           Data type: String     Value: False
      Step 2:
  • From Toolbox drag and drop 'Script task'.
  • Edit the script task and select ScriptLanguage as "Microsoft Visual Basic". Select 'SourceDirecotry' and 'SourceFileName' variables in the "ReadOnlyVariable" list. Select 'FileExistence'  variable from the "ReadWriteVariables" list.
  • Click on "Edit Script..." button. Vb script will open, used the below script to check source file availability.


        dim FullPath = Dts.Variables("User::SourceDirectory").Value.ToString() + Dts.Variables("User::SourceFileName").Value.ToString()
        If System.IO.File.Exists(FullPath) Then
            Dts.Variables("User::FileExistence").Value = "True"
        Else
            Dts.Variables("User::FileExistence").Value = "False"
        End If

 
  • Save the script. Close the 'Integration Service Script Task' window.
  • Click on 'Script Task Editor' window 'OK' button.
  • From Toolbox drag and drop two 'Data flow tasks'. 1st task will run if source file is present. If source file is not present second task will run. For this process i am going to write expressions.
  • Connect the Script task green arrows (precedence constraint) to the Data flow tasks.
 
      Step 3:
  • Right click on the first 'Precedence Constraint' and select 'Edit' option.
  • 'Precedence constraint Editor' window will open.
  • Select 'Expression and Constraint' from the 'Evaluation operation' drop-down list.
  • Writhe the expression @[User::FileExistence] == "True"  in the 'Expression' text box.
 
  •  Same way write expression (@[User::FileExistence] == "False" ) for second 'Precedence constraint'
 
      Step 4:
  • Now the Package is ready. Source file is present, so if i run the package, first 'Data flow task' should run.
  • In the above screen we can see that the first 'Data flow task' is executed.
  • Now i have removed the source file. If i run the ETL second 'Data flow task' should run.
  • In the above screen we can see that the second Data flow task is executed. So source file is not present.