In this article we will know about Asynchronous, Synchronous. Full blocking, Semi Blocking, and No - Blocking Keywords.
Links to other Labs of Learn MSBI Step by Step.
Learn MSBI Part 1:- Creating SQL tables, Practicals of SSIS-ETL(Extraction, Transformation & Loading)
Learn MSBI Part 2:- Performing Validation on Records
Learn MSBI Part 3:- SCD (Slow Changing Data)
Learn MSBI Part 4:- Fact and Dimension.
Learn MSBI Part 5:- CUBE Creation
Learn MSBI Part 6:- Displaying reports using SSRS
Learn MSBI Part 7:- DQS (Data Quality Services)
Learn MSBI Part 8:- Data Profiling Task
Learn MSBI Part 9:- Pivot, UnPivot and Aggregation
Learn MSBI Part 10:- Execute SQL Task component (SSIS)
Learn MSBI Part 11:- Send Mail Task & Script Task Component
Learn MSBI Part 12:- CDC Concept using SSIS
Learn MSBI Part 14:- Understanding Page Split and maintaining SSIS Performance
Learn MSBI Part 15:- Debugging in deployment – Instrumentation using Data Taps
Learn MSBI Part 16:- Using Multi-threading and its performance in SSIS program
Till now we have seen that how package is executed and its component run successful or it fails depending how it has been configured and so on. But in actual if you want to became a good SSIS developer then knowing each SSIS component execution behavior is equivalently important, how are they consuming memory, are they running it in multi-threaded manner or not and so on.
For that knowing and learning of these following 5 keywords are necessary essentials: –
- Semi-blocking &
Before discussing each component in detail first thing would like to share is that data instead of moving row wise from one component to other component within the package it is send in buffers. For each buffer SSIS package has allocated size to the buffer and within this buffer size specific number of rows will move in it.
Starting with first two keywords, Asynchronous and Synchronous
Asynchronous: When any component creates new buffers and do not use existing buffers coming from the output of preceding component to which it is connected then that component is termed as Asynchronous.
Synchronous: Similarly when component uses existing buffer and do not create new buffer to send data it is termed as Synchronous components.
Following is the pictorial image for the same which shows flow of Asynchronous and Synchronous.
Asynchronous component are slow as compared to synchronous component. More the Asynchronous component within the package it will led to slowdown and performance will degrade.
Identifying the component: –
In order to know that whether the component is Asynchronous or Synchronous there is a need to enable logging, for that go to Control Flow pane and do right click on the pane and click on Logging as shown in the following image.
It will open configure SSIS Logs package window and then follow below steps: –
Step 1: Mark on the checkbox of both “Package” and “Load” option available on the left side.
Step 2: Set Provider type as “SSIS log provider for Text files”, it can be set from the option available under dropdown & then click on Add.
Step 3: Select the logs to be used for the container for that mark on the checkbox and select
<New Connection> under configuration from options available under the dropdown.
Step 4: Set Usage type as “Create file” from dropdown.
Step 5: Browse and set the location on the local drive to save newly created text file and also give a suitable name, here name given is “Log.txt” to the file.
Step 6: Click on OK to close the File Connection Manager Editor window.
Next under the configuration file name “Log.txt” can be seen.
Step 7: Click on “Details” tab.
Step 8: From here select two events by marking on the check box of “PipelineExecutionPlan” and “PipelineExecutionTrees”.
Above mentioned checked events will help to let know whether the component is in Asynchronous or Synchronous.
After that finally click on the OK to close log package window screen.
Next to view the results, there is a need of log file to be created and for that package should be executed successfully.
After successfully execution of the package check the location on the local drive for created log file.
Once log file is identified on the location of the local hard drive where it has been created. Just do double click and open it, and will notice that lots of text being written on the log file notepad.
Very first line on the notepad represents Load component present on Control Flow pane which can be easily identified by comparing ID mentioned on notepad and properties of Control Flow. Please note to see properties of each component do right click on the component and click on “Show Advanced Editor” which will open window screen of the Editor and then click on the Component Properties.
Similarly following line on the notepad is of Data Flow Task which can be identified by same manner as it was done earlier.
But main goal is to identify from the text number of buffers created for that just go through the line Begin Path 0 and End Path 0, Begin Path 1 and End Path 1 and so on…..in which Begin Path 0 means that one buffer is created while End Path 0 means end of that buffer. Similarly Begin Path 1 and End Path 1 and followed by others thereafter.
From the “log.txt” notepad and as shown in the below image from the same notepad “ADO.NET source” component is creating first buffer as it lies between Begin Path 0 and End Path 0.
Next is “Sort” component lies between Begin Path 1 and End Path 1
And finally “Aggregate” component lies between Begin Path 2 and End Path 2.
In this case three buffers are created one by ADO.NET source component, other by Sort component and by Aggregate component. Rest other component which do not included between Begin Path and End Path are termed as Synchronous component.
ADO.NET source component is the source component which is but obvious to be Asynchronous as it going to take input data from other source.
In order to understand above log text file which is filled with class ID and input/output codes etc and other details. Following are the image snapshot of each asynchronous component along with their respective properties which shows their Class ID and component identity and so on. Comparing both log.txt file and component properties it would be very much easy to identify and understand text file each log.
To see properties of each component do right click on the component and click on “Show Advanced Editor” which will open window screen of the Editor and then click on the Component Properties.
First image snapshot shows the properties of ADO.NET source component.
Second image snapshot shows the properties of Sort component.
Third below image snapshot shows the properties of Aggregate component.
Identifying Semi-blocking, no blocking and Full blocking
No blocking or non-blocking: Component which does not stop/block full data transfer to next component while self-executing/processing itself is termed as No blocking or non-blocking.
Semi-blocking: Component which does partial data transfer to next component while self-executing/processing itself and then does complete data transfer completion after its complete self-execution that is termed as Semi-blocking.
Full Blocking: As the name says it will not at all transfer data to other component unless it has processed itself.
So well as our concept of blocking is clear, now before proceeding with its practical of identification first understand working of component without which identifying component would also be going to difficult.
Working of SSIS component: There are five important phase that runs when any of the SSIS component is executed/processed.
- Validate: It checks the component connection whether input and output is connected properly or not. It also check if any XML available or not depending upon the situation XML check is done etc.
- PreExecute: In this phase it will check for component initialization whether its input has been correctly filled with data or not.
- ProcessInput: Once the correct input is filled then its processing will be done depending on the component like sorting component which will start sorting of the data etc.
- PostExecute: After the ProcessInput phase takes place then component’s post execution will start.
- PrimeOutput Filling: Till above four phases which takes place is part of components behaviour when its execution takes place. But this fifth phase is very much important as this will help us to identify not only the component is asynchronous or synchronous but also they are fully blocking or semi-blocking. So whenever a component is taking much more time to fill in the output then it means it is asynchronous as it is creating buffers and full blocking or semi-blocking.
Now let’s see practical timing which identifies component is semi-blocking or full blocking with the help time needed.
How to view time needed in each event execution of a component?
Practical of Identification: Once viewing of time execution is available then it would be very much easy to recognize it. For that click and go to Control Flow pane, next is do a right click and click on “Logging” as shown in the down image below
Once the configure log is open on the left side click on “Load” to show events option available Load container of Control Flow.
Now do a mark on the checkbox of event option “PiplelineComponentTime” as shown in the image shown below this enabled event will help to show timing for each phase taken especially for the one which we are looking “PrimeOutput Filling”. Next is click on OK to close the “Configure SSIS Logs“ window.
Then click and go to Data Flow tab followed by executing the package for that click on the “Start” button available on the top which will run each component and show its execution some components are running fast some are slow. Once complete execution of the package is successfully done with all green colour marks on each component as shown in the image below.
Please note: in simple and easy project it is easy to identify by looking at the execution which component consuming less or more time. But believe in big projects where there are hundreds of components it is very much difficult to identify it.
Below image shows successful execution of components.
Once the component is executed successfully go and open log.txt file which will show complete details for all the phase time taken and other required information.
Frome the below image which is of “log.txt” file starting with validate phase it will show all phase details.
As decided earlier more interest is in knowing “PrimeOutput filling” buffers and looking at the below image output is available with two components one for ADO.NET Source and second is for Aggregate component. So from the below figure it is clear that ADO.NET Source and Aggregate component are Asynchronous & may be full blocking or semi-blocking component as there is more spent time seen.
Hope with above practical demonstration concept of Asynchronous, Synchronous, Full-blocking, Semi-blocking & No-blocking is clear which will help in improving the performance.
For reference below is the readymade chart which shows list of all the components depending on their working nature are categorized in Semi-blocking, no blocking and Full blocking.
Also see our below Learn MSBI Step By Step project series videos: -