Learn MSBI Part 1:- Creating SQL tables, Practicals of SSIS-ETL(Extraction, Transformation & Loading)

Ahteshamax
Posted by in MSBI (SSIS, SSRS, SSAS) category on for Beginner level | Points: 250 | Views : 7789 red flag
Rating: 5 out of 5  
 1 vote(s)

In this article we will learn creation of SQL tables and will do practicals of SSIS to achieve ETL(Extraction, Transformation & Loading).

Links to other Labs of Learn MSBI Step by Step.

Learn MSBI Part 2:- Performing Validation on Records

Learn MSBI Part 3:- SCD (Slow Changing Data)

Learn MSBI Part 4:- Fact and Dimension.

Learn MSBI Part 5:- CUBE Creation

Learn MSBI Part 6:- Displaying reports using SSRS

Learn MSBI Part 7:- DQS (Data Quality Services)

Learn MSBI Part 8:- Data Profiling Task

Learn MSBI Part 9:- Pivot, UnPivot and Aggregation

Learn MSBI Part 10:- Execute SQL Task component (SSIS)

Learn MSBI Part 11:- Send Mail Task & Script Task Component

Learn MSBI Part 12:- CDC Concept using SSIS

Learn MSBI Part 13:- Asynchronous, Synchronous, Full-blocking, Semi-blocking & No-blocking

Learn MSBI Part 14:- Understanding Page Split and maintaining SSIS Performance

Learn MSBI Part 15:- Debugging in deployment – Instrumentation using Data Taps

Learn MSBI Part 16:- Using Multi-threading and its performance in SSIS program


This article created considering that the reader has already installed SQL Server 2014 and SSDT with VS 2013 in his/her system.

Changing article track

In case if you are very new to installations and from where to do download of software which is reliable etc questions which are arising in your mind do not worry it is a very but obvious question as a new comer, now just relax down because we are going to give a good news that we have already created an article which depicts steps by steps download of SQL Server and SSDT -VS 2013 along with installation steps.

Getting back to topic

In this article we will be concentrating on creating SQL tables into SQL Server where we will load values of CSV files into database table using SSIS(SQL Server Integration Services) through its process of ETL looking into detailed each step by step practicals of ETL(Extraction, Transformation & Loading) and forming a simple project with SSIS.

Before we proceed with creation of SSIS project let us first quickly go through about theory of SSIS.

SSIS(SQL Server Integration Service): Under SSIS, raw data or raw information is taken from CSV form, excel or Microsoft access and stored in Dataware house. SSIS takes care of raw data (data source) and stores it in Dataware house which is not very easy it has to go through ETL(Extraction, Transformation and Loading) process.


While going through the journey of process called ETL: –

  1. Extraction from raw data is done it first identifies whether raw data is CSV file, binary, SQL file or Microsoft Access. Once the identification of data source is done then it’s Transformation is done.
  2. In the Transformation process various data type used in raw data are brought into common data form in order to avoid confusion.
  3. Once the Extraction and Transformation process is done then loading of that data source is done into DataWare house.

Good…! Now we are familiar with SSIS which is all about ETL (Extraction, Transformation & Loading) so next we will start its practical.

Simple project with SSIS

With SSIS project we will upload CSV file into our Data ware house through ETL process: –


Before we proceed with ETL process in SSDT we will first get familiar with creating tables of SQL Server so below are the steps which will include creation of table in SQL Server.

Creating Tables on SQL Server

Create Tables on SQL Server under Databases. Open SQL Server Management studio using Start Menu program as you see in the following image snapshot.


Once SQL Server Management Studio is opened, on left side you will see Object Explorer which is currently empty because we will still not connected to the server.

And it will have a window which will ask you to connect server.

Please Note: – For now keep settings set to its default values.

  • Server type: Database Engine
  • Server Name: it will take your system name or you can set new name while installing SQL Server.
  • Authentication: By default it is Windows Authentication so keep it as it is.

After fields are set from dropdown and Server Name is given then click on “Connect” in order to connect to SQL Server.


As soon as Management Studio is connected to SQL Server under Object Explorer you will see Server with hierarchy which contains many folder and essentials SQL database file entities as you see in the following image: –


Now do right click on Database and create New Database


You will see following screen to create New Database under which give a suitable to name to create New Database here we have given name as “CustomerDataWareHouse” and then click ok.


Once Database is created with name “CustomerDataWareHouse” now create a New Table by expanding Tables folder as shown in the following image


After table is opened on the right side create Column Name of the table as per the input CSV file and set Data Types as per Columns

CustomerCode –> nvarchar(50) (so that CustomerCode can be set with alphanumeric & numeric)

CustomerName –> nvarchar(50)

CustomerAmount –> money (so that CustomerAmount can be set to currency)

SalesDate –> datetime (so that SalesDate can be set according to date and time)


For more clarity on datatypes refer to this article.

Please Note: Also set CustomerCode field as primary key by doing right click on CustomeCode

Later we will discuss about Primary key in detail. For time being just note concept of Primary key is set to have uniqueness in the table and null values cannot be passed through it.


And finally save table as “tblCustomer” as shown in the following image.

For the very first time on your installed SQL Server if you go to save the table you will see a window prompted as you see in the following in the image which means that table has to dropped and get it re-created again which can be dangerous to existing data present.


So in order to get rid or prevent saving such changes that requires re-creation of the table we have to certain following changes.

Go to Tools –> then click on Options

Once option window gets opened, then on the left side click on Designer and then on the right side you will see a default check box which is by default checked. So just set it to unchecked so that prevention of saving such changes that requires re-creation of the table is avoided.


All the coding for SSIS project will be done in SSDT with VS 2013

Working with SSDT with VS 2013

Step 1: In order to start SSIS project open SSDT with VS 2013 from the start menu of your local computer system. Please note if you are opening SSDT with VS 2013 for the first time in your system then it may take time to load it depending on the hardware configuration of your system’s motherboard. There after it also ask to set the working environment for your VS 2013. As a new learner we suggest select it to its default setting which is General then later you can change it to your project requirement etc.


Once Start Page is opened click on New Project then click on to Integration Service here you will find two options: – Integration Services Import Project Wizard where you will get readymade template to build SSIS project. And other we have is Integration Services Project in order to build project right from the basic or scratch. Once you select type of project you want to build then give appropriate name to the project here we have given the name as “SSISCustomer” and select location to save it on your local system here we have saved it to D drive location under Customer Information System.


Once project is created you will see at the right hand side Solution Explorer at top followed by the “SSISCustomer” project which we have created and under that you will find project’s other important file like Project.params, Connection Manager, SSIS packages and Miscellaneous.


On Package.dtsx[Design] page if you do not see SSIS Toolbox then enable it from File Menu go under SSIS –> SSIS Toolbox

In this will in turn enable SSIS Toolbox on the Package.dtsx[Design] page as you see in the following image snapshot.


So once SSIS Toolbox is opened in front, then just drag and drop Data Flow Task from SSIS Toolbox on Control Flow of the Package.dtsx[Design] page


One more thing we would like to bring in notice that SSIS Toolbox looks different on Control Flow and different on Data Flow tab following is the image snapshot which depicts the same.

You will also notice that Data Flow tab consist of complete process of ETL.


Pull component on Data Flow: Now under Data Flow drag and drop Flat File Source on Data Flow Design as shown in the following image. Rename the component to make it more readable as we will name it to “read from CSV”. Once renaming is done now either double click or do right click component to configure it.

Please Note: You will see red cross mark symbol on the Flat File Source component which will be present on it until the component is configured.


Extraction

Connect component to actual data(CSV file): Once you click on Edit you will see following Flat File Source Editor where you will get Connect Manager with its help after configuring it you will be able to connect to “read from CSV” component to actual data CSV file.

1. By doing right click on Edit “read from CSV” component.

2. Its Editor will open and under it click on “Connection Manager” in order to configure Connection Manager.

3. Click on “New” to open Connection Manager Editor.

4. Once Connection Manager Editor is opened give a suitable name and add description to it.

5. Now click on General tab.

6. Under General tab, browse for the actual file i.e. CSV file from your local system. Once it is located then select it.

7. Select Format for the data chosen here we have CSV file so we have selected delimited from the drop down.

8. Tick mark the check box as our first row contains names for the Columns.


9. Now go to Columns tab and under select the following from drop down as shown in the image.

10. For Row delimiter select à {CR}{LF}, which means it is end of the row and thereafter it has enter and new row should begin.

11. For Column delimiter select à Comma{,} which means after comma a new column will start.

12. And you will get to see an exact Preview of the CSV file which you see as output.

13. Finally click on OK.


Now on Flat File Source Editor under Connection Manager you will see CSV connection and just below to that you will see another check box which states that it will not skip and retain null values if present in it that if you tick mark it.

Once you are done with it then you can take a final preview of the CSV data which will can be seen just by clicking on Preview button as you see in the following image.

After looking at the Preview close the Data View window and finally also click on the OK to close Flat File Source Editor window.


Now refer to following image, under Data Flow on “read from CSV” component you will find that red marked sign has gone which means that component has been configured. And on “read from CSV” component you will see Red and Blue arrow where Red arrow will give error as output and Blue arrow will give output as data which will act as an input to next component.

Under Connection Managers you will see CSV Connection as a New Connection Manager created.


Transformation

Now we have completed Extraction part we will do Transformation of the data. Goal here is to do upper case the letter of CustomerName column as you can see in the following image


Steps to perform Transformation in SSIS

  1. From SSIS Toolbox drag and drop “Derived Column” which is available under Common Transformation on Data Flow.
  2. Once Derived Column component is on the Data Flow, rename component to “Capital Letter”.
  3. Now drag blue color arrow output data and connect it as input to “Capital Letter” component.
  4. Now configure Capital Letter component just by doing right click or double click it as you see in the following image.


Once you click on Edit “Capital Letter” component it will open Transformation Editor

  1. Give a suitable name, here we have given “UpperCustomerName” to the new Derived Column.
  2. Add new Derived Column as new column “<<add as new column>>”.
  3. Now drag and drop UPPER formula which is available under String Functions on Expression.
  4. Now drag and drop the column name from which new derived column is to be created here we will put “CustomerName” on the Expression as you in the following image.


Loading

We have now configured “Capital Letter” component which is part of Transformation of Data. Now it is the time for final step of doing Loading from ETL process.

Below are the steps to do Loading of Data: –

  1. From SSIS Toolbox under Other Destination drag and drop ADO.NET connection component on Data Flow.
  2. Now rename the new ADO.NET connection component to “Customer DataWare House”.
  3. Drag and drop Blue arrow output data from “Capital Letter” and give it as input to “Customer DataWare House” component.
  4. Finally Edit and do the configuration of “Customer DataWare House” component.


After opening Editor now go to Connection Manager and click on New which will in turn open Configure ADO.NET Connection Manager. Click on “New” to add Data connection.


After you click on New it will open another window where you have to give Server Name.

1. Add Server Name.

Please Note: If you are very new and you do not know Server Name and want to get Server Name then click on File –> Connect Object Explorer on Microsoft SQL Server Management Studio.

Now Connect Object Explorer window will open, under this window you will get to see Server Name here we have Server Name as “KHADAK-PC”.

Once you found the Server Name now again come back to Connection Manager window and put the Server Name as “KHADAK-PC”.

2. Now select database name in order to connect to database. Click on the dropdown under which you will find the database name here we have database name called as “CustomerDataWareHouse”.

3. Before you close the window by clicking OK, click on Test Connection. Once it is successful you can proceed further as you see in the following image screen.


After you click OK now you will see Server Name along with database name added under Configure ADO.NET Connection Manager window as you see in the following image. Now click OK and complete configuration of ADO.NET Connection Manager.


Once OK is pressed you will again come to Editor window, where you will see under Connection manger field Server Name along with database name. Now give the name of the table which can be done by selecting it from dropdown where you will find all the name of the tables from database. Here we will select “dbo”.”tblCustomer”.

Now we will go to Mappings tab, in order to do mappings and set correct mappings of each Columns. See the following image you will find all mappings “Available Input Columns” and “Available Destination” are correct except for “CustomerName” where we have to select and map “UpperCustomerName” from Input Column to Destination Column “CustomerName”.

Once mappings are set to correct Columns the finally click on OK.


On Data Flow you will find that Loading component has also been configured as you find no red mark cross sign on the component.

Now it is the time to check component on Data Flow, its done configuration are working fine. With the help of Debugging feature we will now see working of complete package.

In order to run package in debug mode click on the Start as shown in the following image snapshot or click on ctrl+F5.

Under debug mode you will find package running as shown in the below where you will see on top right side with green mark correct symbol which means that component is working and configured correctly. And on the output data Blue arrow of each component will show 8 rows are passing through it.


Now the package is running fine and successfully. Go to SQL Server Management Studio and check database “CustomerDataWareHouse” table “dbo”.”tblCustomer” under Object Explorer as shown in the following image: –

Under Object Explorer open the hierarchy

Databases –> CustomerDataWareHouse –> Tables à dbo.tblCustomer

And on dbo.tblCustomer do right click and click on “Select Top 1000 Rows” on the right hand side you will see the query written to select Top 1000 Rows and just below that you will see actual data of database table “dbo.tblCustomer” with all 8 rows updated in it and ColumnName data transformed to Upper Case.


Hope that this article helped you to understand and learn SSIS with each step shown above involved to demonstrate it in practical of Extraction, Transformation and Loading process in detail.

Next we will come up with more practical labs in our new article.

Sharing with you one of the promising video from project series from MSBI(SSIS, SSAS and SSRS) hoping that you will like it

Recommendation
Read Learn MSBI Part 2:- Performing Validation on Records after this article.
Page copy protected against web site content infringement by Copyscape

About the Author

Ahteshamax
Full Name: Ahtesham Shaikh
Member Level: Bronze
Member Status: Member
Member Since: 10/3/2016 2:36:04 AM
Country: India
Ahtesham Shaikh
http://www.learnmsbitutorials.net/
Hey Friends, Myself Ahtesham a coder, writer & technical supporter for IT world and for my company Questpond. Having 10+ years of experience in microsoft technologies like WCF, WPF, MVC, Sharepoint, AngularJS, MSBI, Excel 2000 - 2013, Powerpoint, C#, SQL Server and so on. I just find myself happy and satisfy by doing this job. Apart from profession love reading novels and other digital network books. I'm a faculty at Questpond and do take trainings on c#, object oriented programming, msbi, sharepoint, mvc, mvc, angularJS and so on. Feel free get in touch me for one-to-one trainings, offline trainings @Andheri Mumbai, Online training via microsoft live meeting.

Login to vote for this post.

Comments or Responses

Login to post response

Comment using Facebook(Author doesn't get notification)