Learn MSBI Part 4:- Fact and Dimension.

Ahteshamax
Posted by in MSBI (SSIS, SSRS, SSAS) category on for Beginner level | Points: 250 | Views : 3741 red flag

In this article we will explain how to create new Dimension Table on SQL Server.

Links to other Labs of Learn MSBI Step by Step.

Learn MSBI Part 1:- Creating SQL tables, Practicals of SSIS-ETL(Extraction, Transformation & Loading)

Learn MSBI Part 2:- Performing Validation on Records

Learn MSBI Part 3:- SCD (Slow Changing Data)

Learn MSBI Part 5:- CUBE Creation

Learn MSBI Part 6:- Displaying reports using SSRS

Learn MSBI Part 7:- DQS (Data Quality Services)

Learn MSBI Part 8:- Data Profiling Task

Learn MSBI Part 9:- Pivot, UnPivot and Aggregation

Learn MSBI Part 10:- Execute SQL Task component (SSIS)

Learn MSBI Part 11:- Send Mail Task & Script Task Component

Learn MSBI Part 12:- CDC Concept using SSIS

Learn MSBI Part 13:- Asynchronous, Synchronous, Full-blocking, Semi-blocking & No-blocking

Learn MSBI Part 14:- Understanding Page Split and maintaining SSIS Performance

Learn MSBI Part 15:- Debugging in deployment – Instrumentation using Data Taps

Learn MSBI Part 16:- Using Multi-threading and its performance in SSIS program


Create new Dimension Table on SQL Server, for that go to SQL Server Management Studio create “New Table” by doing right click on Tables as shown in the following image snapshot.


Once table is opened for creation then add the following to it: –

  • Column Name as “CountryID” with Data Type as “int” and set it as primary key and uncheck Allow Nulls.
  • Column Name as “CountryName” with Data Type as “nvarchar(50)” and uncheck Allow Nulls.

and finally save the table with DimCountry name.


Now we will create other three Dimensions tables named DimProduct, DimSalesPerson & DimStates- similarly in the same way which we did for DimCountry.

Following is the table created with name DimProduct and field name created in it “ProductID” and “ProductName“


Below is the third dimension table created with name DimSalesPerson and SalesPersonID along with SalesPersonName field created in it.


And finally create a dimension table with name DimStates, StatesID and StatesName field created in the table as shown in the image below.


Once Dimensions tables are created now create Measures or Fact table with following field created in it

ColumnName ColumnName
CustomerCode nvarchar(50)
CustomerName nvarchar(50)
CustomerAmount money
SalesDate datetime

Also adding field of foreign key to Fact table to create relation with other Dimension Tables. Here we will add Dimension table name like “Country” with “ID_fk” added to it with DataType as “int” to it. Similarly adding other dimension table names followed by “ID_fk” to it to complete Fact table field as shown below.

ColumnName DataType
CountryID_fk int
StatesID_fk int
ProductID_fk int
SalesPersonID_fk int

Once done with it save the Measure of Fact table with name FactCustomer.


For establishing relationship between fact or measure table and dimension table create database diagram. In order to create Database diagram go to SQL Server Management studio in that go to Object Explorer and click on Database Diagram and do right click on New Database Diagram as shown in the below image: –


Once you click New Database Diagram it will ask to Add Table just select all dimensions table and fact table as shown in the below image: –



Similarly we will add relationship between

  • DimProduct and FactCustomer
  • DimSalesPerson and FactCustomer
  • DimStates and FactCustomer

And then after adding all relationship we will see following image and established relation between Fact and Dimension


After you create relation between Fact and Dimension table very important thing to remember is that to save the Database diagram, for that click save as which will prompt you to save with customized name here we have given name for it as “CustomerDiagram”.


Once diagram is saved then all relations will be retained for future use. At centre Fact table connected to four Dimension tables with foreign-primary relationships saved diagram is termed as Star Schema.

Similarly further to Star Schema diagram if there are relations between Dimension tables established then such diagram will be termed as Snowflake Schema.

Now go to SSDT Visual Studio and under its Solution Explorer create new SSIS package


We will create new and follow packages sequence wise first it will start our package by loading Main.dtsx à then all Master data like Load Country Master à Load States Master à finally loading of Fact Customer table would be done.


Once New SSIS package is added rename that package as “FactCustomer”. Similarly we will be create two more new packages “States” and other as “Main” under SSIS Packages.

1) Starting with a single click and do select on Main.dtsx package.

2) From SSIS Toolbox drag “Execute Package Task” available under Common on Control Flow pane.

3) Edit and rename package as “Country Master”.

4) From SSIS Toolbox drag and drop one more “Execute Package Task” on Control Flow pane.

5) Edit and rename package as “Fact Customer”.

6) Drag and drop output of State Master as input to Fact Customer as shown in the below image.


Please Note: In our project now we have 3 packages now to make “Main.dtsx” as first startup running package for that right click on “Main.dtsx” and click on “Set as StartUp Object”. Once that is done now when project will start, “Main.dtsx” will execute first then later other packages will run.


Now assign package to the component “Country Master” and in order to do that to select that package name which will be executed.

  • Do a double click “Country Master” component, open task editor.
  • On left side click “Package”.
  • now on right side under “ProjectNameFromProjectReference” select “Country.dtsx” from drop down list and
  • Finally click OK as shown in the below image snapshot.

Similarly we will do assigning of package to the component “Fact Customer” for that just follow above steps.

Once packages are assigned to the component then save the project.


Now as we are all set with Main.dtsx package we will load data in Country.dtsx package by pulling in Data Flow Task on Control Flow pane and rename it to “Load Country”.



After input and output component are selected and renamed it need to be configured and for that

  • Do double click or do right click to open editor to configure it.
  • On Flat File Editor, click on Connection Manager.
  • Click on New which will open Flat File Connection Manager Editor.
  • Click on General and browse for file, named “Country.txt”.


  • Next click on Columns on left side and do preview of country.csv file.
  • Finally click on OK.


Next we will be configuring ADO.NET Destination but in this lab we will not configure which we did in the earlier lab but here will learn a concept of global connection manager. This will help us to make one connection manager created at project level that can be consumed by other packages (States.dtsx, SalesPerson.dtsx and Product.dtsx) too as all our packages we will be using and connecting to same server.

In order to set global connection manager on project level below are the practical steps included.

  • Do right click on Connection Manager and click on New Connection Manager.
  • It will open a new window under it select “ADO.NET” from various types of connection and set it as SSIS Connection Manager type followed by clicking Add to it.
  • Now you will see Configuring ADO.NET Connection Manager window open under this selection Date connections select server and database name, here in the following image name is KHADAK-PC.CustomerDataWareHouse.
  • Finally click OK.


Under Solution Explorer you will see KHADAK-PC.CustomerDataWareHouse added as global connection manager under Connection Managers as shown in the below image.


After adding Connection Manager as global now go and configure ADO.NET Destination component on Data Flow pane by doing right click Edit or double clicking the component

Under ADO.NET Destination Editor click on Connection Manager on the left side and on right side you will find that automatically connection manager is available just below to select table name “dim.DimCountry” from the table dropdown list available.

Now click on Mappings just check whether Input Column is properly mapped to Destination Column. Once found Mapping is correct then click on OK.


As now everything is set and configured now click on “Start” button available on top to run or execute “Country.dtsx” package. From the following image snapshot at the top right side of component green colour check mark represents that package has executed successfully.

3 new rows has been successfully inserted in to database table from input Country.txt file.


Now coming to SQL Server Management Studio under Object Explorer inside database “CustomerDataWareHouse” do right click on “dbo.DimCountry” and click Edit Top 200 Rows which will display values of tables and 3 new inserted values which was done by executing the package.


Exactly now we will replicate the all of the above steps for our next Dimension “DimProduct”. On Visual Studio under Solution Explorer à SSIS Packages à do right click and click on “New SSIS Package” and name package as “Product.dtsx”.


On Control Flow pane drag and drop “Data Flow Task” from SSIS Toolbox and rename this component to “Load Product” as shown in the below image figure.

Next is double click the component which will take it to Data Flow pane, here drag and drop component from SSIS Toolbox

  • From Other Sources “Flat File Source” available under SSIS Toolbox drag and drop component on Control Flow Pane and rename it to “Product txt”.
  • From Other Destinations drag “ADO.NET Destination” and drop on Control Flow Pane.


Now once all required component are pulled we will open its Editor and configure it as shown in the image below.

  • Double click the component “Product txt” and click Connection Manager on left side under Editor.
  • Here click on “New” which will in turn open up Connection Manager Editor under this Editor click on “General” available at the left side now browse and select the file named “Product.txt” available on local system.
  • Next click on “Columns” at the left and do final preview of “Product.txt” file and then click OK.


Next is to configure ADO.NET Destination just do double click on component which will open Editor over here click on Connection Manager which will show global connection manager below to that there is a field to select table click on the dropdown and from that select from available table named “dbo.DimProduct”.


Next is click on Mappings on the left to view Input Column and Output Column mappings once found everything is correct just click on OK.


Click Start button on the top to run or execute the package of “Product.dtsx” it will show executed successfully package with green mark on it. It has now inserted 3 new rows in the table.


After package successful execution we will check all values are correctly inserted into database table. For that follow same steps which we did for earlier “DimCountry” table. Do right click on “DimProduct” and click on “Edit Top 200 Rows”, so here is the table with final output as of input Product.txt notepad.


Similarly we will create remaining other packages by doing right click on SSIS Packages under Solution Explorer of SSDT VS 2013 with name “SalesPerson.dtsx” and “States.dtsx” also loads its dimensions values. For that exactly just follow all of the above mentioned steps which has been demonstrated for “Country.dtsx” and “Product.dtsx” package till running SQL table.

This would be a homework for the reader to do the practice on their own to fill in the package and get its values on SQL table.


Once all the dimensions are created then on Main.dtsx package page do create the following: –

Drag and drop “Execute Package Task” from SSIS Toolbox on Main.dtsx package à Control Flow pane. For each package which we have “Country.dtsx”, “Product.dtsx”, “States.dtsx”, “SalesPerson” and one for “FactCustomer.dtsx”drag and drop total five “Execute Package Task” component from SSIS Toolbox .

Once all the “Execute Package Task” component are on Main.dtsx à Control Flow pane now rename it for ease of understanding and to avoid confusion so if naming is done accordingly to the package name then life would much more simpler as listed below.

Package Name Execute Package Task Name
Country,dtsx Country Master
States.dtsx States
SalesPerson.dtsx Sales Person
Product.dtsx Product
FactCustomer.dtsx Fact Customer

Once the naming is done now each component needs to be connected in the same manner where one component output is given as input to next component exactly shown in the below image: –

Country Master –> States –> Sales Person –> Product –> Fact Customer.


Next thing is to remove red cross mark for that need to configure by giving project reference to it to each of these “Execute Package Task” component which has been renamed and connected to each component. First start with “Country Master” just do double click on it which will open Task Editor now click on Package available at the left and then on the right side select the package from available dropdown PackageNameFromProjectReference –> Country/Package.dtsx and click on OK.

Same configuring step do follow for States component by doing double click on it and select for it “States.dtsx”. So it will be PackageNameFromProjectReference –> States.dtsx and click on OK.


Followed by doing setting project reference to other remainings “Executing Package Task”

PackageNameFromProjectReference for “Sales Person” –> SalesPerson.dtsx

PackageNameFromProjectReference for “Product” –> Product.dtsx

PackageNameFromProjectReference for “Fact Customer” –> FactCustomer.dtsx

Please Note: – For time being deactivate Fact Customer component.


And save “Main.dtsx” package then finally run and execute the package by clicking on “Start” available option at the top. Once done that will load package values in the SQL Server table.

Following image shows complete dimension loaded with package running successful and values loaded of each dimension into SQL Server table.


Creating & Executing Fact table

Now next remaining is loading of “FactCustomer” package and then its values loading it into SQL Server table.

Below are the details of CustomerFact text file will be loaded into the package.


On “FactCustomer” package first drag and drop Data Flow Task on Control and rename the component to Load FactCustomer. Then drag and drop “Flat File Source” from SSIS Toolbox on and do rename it to “FactCustomer txt” as user friendly name for reading simplicity.

Along with it also drag and drop “ADO.NET Destination” from SSIS Toolbox on Data Flow pane.

Now do a double click on “CustomerFact txt” to configure it and click General tab available at the left side, here give the text file name “CustomerFact.txt” which is available on the local system.

Then click on “Columns” tab available on left side and take a final preview look and click on OK.


Once preview shown is perfect then click OK. And here it is component with yellow mark which is due to SSIS component out of sync from Connection Manager. For do right click and Edit the component and click on Column on the left and make sure all Columns are seen the click OK


Once all Columns are available and SSIS component is also in sync with the Connection Manager the yellow sign on input source has gone away.

Next is to configure ADO.NET Destination and remove red mark from it for that double click and open its Editor and now click on Connection Manager available at the left and then select “dbo.FactCustomer” and click OK.


Now execute the package by clicking on Start which will give a green symbol on the component which means that package has been successfully executed and values has been loaded properly.


The same we will check into database whether table is filled with correct values.


Tables are filled with correct values as shown in the table. Need here is to fill external dimension table which is having references foreign key relation with Fact table as we have done mappings only for first four fields and not fields with foreign keys references i.e. populate these fields too.

Loading these fields is not straight as such we did for other fields of Fact table by doing mappings in it. This can be achieved by introducing LOOPUP in it where it will use ID as reference for the name of country, states, sales person or product and will display that in the foreign key field.


In order to achieve the same on SSIS we have a Toolbox under Common à LOOKUP. For that drag and drop the LOOKUP component between source and destination component as shown in the image below and double click on it, to configure.


LookupCountry

1) It will open Editor screen to configure LOOKUP component.

2) On Transformation Editor screen click on “Connection“.

3) Click “New” to add connection manager.

4) Click on the available database name here “Khadak-PC.CustomerDataWareHouse”.

5) Click OK.

6) Now select the table, here we will have lookup on “dbo.DimCountry”.


7) Now click on Columns to do equi-join between input (i.e. text) and the reference data of lookup columns.

8) On Available Lookup Columns tick on CountryID.

9) drag and drop Available Input Column “CountryName” on Available Lookup Columns which is “CountryName”.

Once drag and drop is done instead of doing mapping it will give an error.


As per the error given by Visual Studio it is mismatching of datatype which is non-unicode coming from the text file and Unicode datatype available in the SQL Server.


In order to get rid of this conversion error which will be resolved by introducing one more component on Data Flow pane from Common of SSIS Toolbox between “CustomerFact txt” and “LookupCountry” as shown in the below image. Next is drag and drop output of “CustomerFact txt” on as input to “Data Conversion”. Double click the Data Conversion component to open Editor and configure it.


Now select four Input columns “CountryName”, “StatesName”, “ProductName” and “SalesPersonName” on these inputs datatype conversion will be done from non-unicode to unicode by setting datatype to “Unicode string [DT_WSTR]”. No here give new name to changed input with Unicode data type as follows: –

old with non-unicode new with unicode
CountryName UCountryName
StatesName UStatesName
ProductName UProductName
SalesPersonName USalesPersonName

And then click on OK.


Next is drag “Data Conversion” output as input to “LookupCountry” and configure it by opening its Editor. Here drag new input “UCountryName” on Lookup Column “CountryName”. Rest other necessary things are set fine and then click on OK.


LookupStates

Next again, 1) drag and drop new Lookup from the Common of SSIS Toolbox on Data Flow.

2) Rename newly dragged and dropped component as “LookupStates”.

3) Drag and drop output of “LookupCountry” as input to “LookupStates” which will in turn pop up a dual selection output screen, here select output from dropdown as “Lookup Match Output”.

4) Click OK.


Once input is given to “LookupStates” now configure it by double clicking on this component which will open Editor screen.


Once Transformation Editor screen is open,

1) Click on “Connection” on the left.

2) Now OLE DB connection manager is set with default connection as it was set earlier for LookupCountry, so keep it set to default connection manager. Set table name available from the dropdown list as “dbo.DimStates”.

3) Moving ahead on left side click on “Column”.

4) On Lookup Column do a tick mark on “StatesID” as this will be used for lookup.

5) Drag “UStatesName” from Input Column and drop it on “StatesName”.

6) Click on OK.


LookupProduct

Again drag and drop new lookup from Common available under SSIS Toolbox on Data Flow and rename the same as “LookupProducts”. Next again drag output of “LookupStates” and drop it as input to newly created “LookupProducts” which will open new screen to set output. Select the same output which we did earlier by setting it to “Match Output”. And then click on OK.


To configure “LookupProducts”, 1) double click and open its Editor.

2) Click on “Connection” available option on the left.

3) Here set table to “DimProducts” as default set connection manager will be used.

4) Next is to click on “Column” available option on the left.

5) Do a tick mark on “ProductID” on Lookup Columns.

6) Drag “UProductName” and drop it on “ProductName” of Lookup Columns as shown in the image below.

7) Once all things are set on Editor click on OK to close the editor screen.


LookupSalesPerson

Next configuring of following LOOPUP component will be done by dragging and dropping it on Data Flow pane. Rename it to “SalesPerson”. Now connect it to the output coming from “LookupProducts” as shown in the image below. It will again prompt with screen to set output, set it to “Lookup Match Output”. Finally click OK it automatically connects its output as input to “LookupSalesPerson”.


Next we will configure by opening its Editor and under “Connection” set its table value to “DimSalesPerson”.

Next on “Column” option, do a tick mark on “SalesPersonID”. Also drag and drop “USalesPerson” new Unicode input value on “SalesPerson” as shown in the following image. Finally click OK button.


Drag output of LookupProduct component as input to ADO.NET Destination where again it will ask to select type of output which need to be used so following the similar fashion which we did earlier to all lookup outputs here will do the same and select “Match Output” and click on OK.


Mapping Columns in ADO.NET Destination

Once components are connected then open ADO.NET Destination component Editor to configure it.


Once Editor is open click on the Mappings tab available on the left side. Here do the mappings of the ID’s as follows:

Input Column Destination Column
CountryID CountryID_fk
StatesID StatesID_fk
ProductID ProductID_fk
SalesPersonID SalesPersonID_fk

After finishing with the mapping just do a final recheck on all the columns and then click on OK to close the screen of the Editor.


Since all things are set now rebuild your complete solution by clicking on Build à Rebuilt Solution on Visual Studio as shown in the below screen.


Execute FactTable in SSDT Vs 2013

Now run or Execute package by clicking on Start button available on the top, once running of package is successfully done you will see following screen on Visual Studio with green arrow mark on all the components.


Output in SQL Server Management Studio

It is the time to see final output value gets loaded into the table. For that go to SQL Server Management Studio do right click on “dbo.FactCustomer” and click on “Edit Top 200 Rows” to see the ID_fk fields are filled with ID values from their respective dimension table.

Once the table is open now all fields are filled with values including ID’s from dimension table into Fact table.


Also see our below Learn MSBI Step By Step project series videos: -

Page copy protected against web site content infringement by Copyscape

About the Author

Ahteshamax
Full Name: Ahtesham Shaikh
Member Level: Bronze
Member Status: Member
Member Since: 10/3/2016 2:36:04 AM
Country: India
Ahtesham Shaikh
http://www.learnmsbitutorials.net/
Hey Friends, Myself Ahtesham a coder, writer & technical supporter for IT world and for my company Questpond. Having 10+ years of experience in microsoft technologies like WCF, WPF, MVC, Sharepoint, AngularJS, MSBI, Excel 2000 - 2013, Powerpoint, C#, SQL Server and so on. I just find myself happy and satisfy by doing this job. Apart from profession love reading novels and other digital network books. I'm a faculty at Questpond and do take trainings on c#, object oriented programming, msbi, sharepoint, mvc, mvc, angularJS and so on. Feel free get in touch me for one-to-one trainings, offline trainings @Andheri Mumbai, Online training via microsoft live meeting.

Login to vote for this post.

Comments or Responses

Login to post response

Comment using Facebook(Author doesn't get notification)