Import XML File and transform into Relational Model (for STATIC Columns) using Powershell

Rajnilari2015
Posted by in PowerShell category on for Beginner level | Points: 250 | Views : 845 red flag
Rating: 5 out of 5  
 1 vote(s)

In this article, we will load XML data into the relational model using PowerShell. We will use SQL Server 2014 as our RDBMS.


 Download source code for Import XML File and transform into Relational Model (for STATIC Columns) using Powershell

Introduction

Importing XML data into the relational database is being supported by the introduction of OPENXML and OPENROWSET.

OPENROWSET imports a file on disk as a single blob.It comes into use if we need that file content as a single big chunk of information or if we want to store the entire file contents as a single XML field.

If we need the data represented in the XML as columns and rows and then we want to store that information in a relational table in the form of rows and columns, it is then OPENXML.

We can use this nice bolg post for more information on the subject.

But in this article, we will take a different approach to load XML data into the relational model.We will use PowerShell Script to read the XML and with the help of xp_cmdshell, we will execute that command, then by using a little of T-SQL, we will transform the data to be fit for a relational model.We are using SQL Server 2014 as our RDBMS. So then let's start our journey.

Environment Setup

Let us use the below XML say(Employees.xml)

<Employees>
    <Employee EmpId="Emp001" EmpName="Rajlakshmi" DOJ="02/03/2010"/>   
    <Employee EmpId="Emp002" EmpName="Niladri" DOJ="10/08/2010"/>
    <Employee EmpId="Emp003" EmpName="Arina" DOJ="01/01/2010"/>
    <Employee EmpId="Emp004" EmpName="RNA Team" DOJ="31/12/2010"/>
</Employees>

Using the code

Open any text editor (save the file as "Employees.ps1") and let us write the following command

 [xml]$xml = Get-Content D:\Employees.xml
 $xml.Employees.Employee

In the very first line, we wrote

[xml]$xml = Get-Content D:\Employees.xml

The Get-Content cmdlet reads the content from the "Employees.xml" one line at a time and returns a collection of objects that represents a line of content.By using [xml], we are typecasting the object returned by the Get-Contentcmdlet to the type [xml] .Generally, the Get-Content cmdlet returns an array.But by typecasting into [xml] , we are converting that to System.Xml.XmlDocument type.

We can even check that by typing the below command in the "Windows PowerShell Prompt".

$xml.GetType()

So, now the XML document is in memory and we can easily traverse the XML Document using the below way.

$xml.Employees.Employee

Now let us open the SSMS and write the below query.

DECLARE @command  VARCHAR(100)
SET @command = 'powershell.exe  D:\Employees.ps1'
exec master..xp_cmdshell @command 

Output:

We got the output.That's fine!!! But is that a readable format? Because xp_cmdshell returns as rows of text.Now our task is to convert that to a relational way.

For that, we can take help of the Format Commands that helps to change the view.For our purpose, we will take help of Format-List, which will present the records in a key:value pair as shown under - 

 $xml.Employees.Employee | Format-List

So we have modified our "Employees.ps1" content as under - 

 [xml]$xml = Get-Content D:\Employees.xml
 $xml.Employees.Employee | Format-List

So if we fire the same command from SSMS, this time the output will be - 

Output:

Though, it is still far beyond to be call as relational model, but now we can make it so by using T-SQL program. Let's see how...

DECLARE @command  VARCHAR(100)
SET @command = 'powershell.exe  D:\Employees.ps1'

--Create a Temporary table
CREATE TABLE #TblResultXML(EmpId INT IDENTITY(1,1), ResultData VARCHAR(1000) NULL)

--Using xp_cmdshell, invoke the PowerShell script and insert that into the Temporary table
INSERT INTO  #TblResultXML (ResultData) EXEC master..xp_cmdshell @command 

--Project the Records
SELECT 
	EmpId
	,ResultData
FROM #TblResultXML
WHERE ResultData IS NOT NULL AND EmpId > 2 

--Drop the Temporary table
DROP TABLE #TblResultXML

Result:

Since the T-SQL script is heavily documented, so there is needless to provided further information regarding that.However, the output is now quite handy to read and will be easy to parse and transform to relational model.Our first task is to separate the Key and the Values.Let's modify our query as under - 

DECLARE @command  VARCHAR(100)
	SET @command = 'powershell.exe  D:\Employees.ps1'

	--Create a Temporary table
	CREATE TABLE #TblResultXML(Id INT IDENTITY(1,1), ResultData VARCHAR(1000) NULL)

      --Using xp_cmdshell, invoke the PowerShell script and insert that into the Temporary table
	INSERT INTO  #TblResultXML (ResultData) EXEC master..xp_cmdshell @command 

	
    -- Seperate the Keys and Values
	;WITH keyValuesSeperateCTE AS(

	    SELECT 
		EmpId
		, [Key] = LEFT(ResultData, CHARINDEX(':', ResultData)-1)
		, [Value] = RIGHT(ResultData, LEN(ResultData) - CHARINDEX(':', ResultData))
	     FROM #TblResultXML
	     WHERE ResultData IS NOT NULL AND EmpId > 2
	)    
   

	--Drop the Temporary table
	DROP TABLE #TblResultXML

Result:

For a cleaner approach, we are using Common Table Expression(CTE).The very next step is to Group the "Keys" as under - 

DECLARE @command  VARCHAR(100)
	SET @command = 'powershell.exe  D:\Employees.ps1'

	--Create a Temporary table
	CREATE TABLE #TblResultXML(EmpId INT IDENTITY(1,1), ResultData VARCHAR(1000) NULL)

      --Using xp_cmdshell, invoke the PowerShell script and insert that into the Temporary table
	INSERT INTO  #TblResultXML (ResultData) EXEC master..xp_cmdshell @command 

	-- Seperate the Keys and Values
	
	;WITH keyValuesSeperateCTE AS(

	SELECT 
		EmpId
		, [Key] = LEFT(ResultData, CHARINDEX(':', ResultData)-1)
		, [Value] = RIGHT(ResultData, LEN(ResultData) - CHARINDEX(':', ResultData))
		FROM #TblResultXML
		WHERE ResultData IS NOT NULL AND EmpId > 2
	),	
	--Group the Keys 
	groupKeysCTE AS(

		SELECT 
			    Id = ROW_NUMBER() OVER(PARTITION BY X.[Key] ORDER BY X.EmpId)
				,[Key] 
				,[Value]
		FROM KeyValuesSeperateCTE X
	)

	SELECT *
	FROM groupKeysCTE	
	
	--Drop the Temporary table
	DROP TABLE #TblResultXML

Result:

Now we are very close.As a last step we only need to transform the rows to columns by using PIVOTING as shown under - 

DECLARE @command  VARCHAR(100)
	SET @command = 'powershell.exe  D:\Employees.ps1'

	--Create a Temporary table
	CREATE TABLE #TblResultXML(EmpId INT IDENTITY(1,1), ResultData VARCHAR(1000) NULL)

      --Using xp_cmdshell, invoke the PowerShell script and insert that into the Temporary table
	INSERT INTO  #TblResultXML (ResultData) EXEC master..xp_cmdshell @command 

	-- Seperate the Keys and Values
	;WITH keyValuesSeperateCTE AS(

	SELECT 
		EmpId
		, [Key] = LEFT(ResultData, CHARINDEX(':', ResultData)-1)
		, [Value] = RIGHT(ResultData, LEN(ResultData) - CHARINDEX(':', ResultData))
		FROM #TblResultXML
		WHERE ResultData IS NOT NULL AND EmpId > 2
	),
	--Group the Keys 
	groupKeysCTE AS(

		SELECT 
			Id = ROW_NUMBER() OVER(PARTITION BY X.[Key] ORDER BY X.EmpId)
			,[Key] 
			,[Value]
		FROM KeyValuesSeperateCTE X
	)
	
	--PIVOT the result
	SELECT [EmpId], [EmpName], [DOJ]	
	FROM groupKeysCTE
	PIVOT(
			MAX(Value)
			FOR [Key] IN ([EmpId], [EmpName], [DOJ] )
		) AS PIV	
	
	--Drop the Temporary table
	DROP TABLE #TblResultXML

Result:

References

a)PowerShell Data Basics: XML

b)Running PowerShell scripts from SSMS Query Editor

Conclusion

In this article we learnt to Import XML File and transform into Relational Model using Powershell. But have you noticed that the article assumes the columns are fixed.How about dynamic columns?.Yes, that's the next one. Hope this will be helpful.You can find the zipped files attached herewith.Thanks for reading the article.

Page copy protected against web site content infringement by Copyscape

About the Author

Rajnilari2015
Full Name: Niladri Biswas (RNA Team)
Member Level: Platinum
Member Status: Member,Microsoft_MVP,MVP
Member Since: 3/17/2015 2:41:06 AM
Country: India
-- Thanks & Regards, RNA Team


Login to vote for this post.

Comments or Responses

Login to post response

Comment using Facebook(Author doesn't get notification)