In this article we will figure out as how to extract values from XML Attributes and present in comma separated values using XQuery
Introduction
In a recent requirement of mine, there has been a problem being stated as under
There are many employees into the system and they can have multiple emails.The emails are being stored into xml attributes as under
<EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
What we need to generate is a comma separated list as given below.
Emails
------
FirstEmail@employee.com,SecondEmail@employee.com,ThirdEmail@employee.com
Solution 1: Using Node function
Let us first try to extract the node values by using the .nodes() function as shown under
DECLARE @InputData XML = <EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
SELECT EmailColumn.value('.','varchar(100)') AS EmailData
FROM @InputData.nodes('/EmployeeEmails/Email/@Value') AS EmailTable(EmailColumn)
/*
EmailData
-------------------------
FirstEmail@employee.com
SecondEmail@employee.com
ThirdEmail@employee.com
*/
The .nodes() function creates an "inline" table of XML fragments for each "row" in the table that contains one <Email> node with "Value" attribute(Kindly note that we retrieve the values for the attribute by specifying @ATTRIBUTE NAME). We can then query on and extract the contents of the XML attribute.
Once we get the row-wise records,next we can apply FOR XML PATH to get the rows transformed into columns with comma seperated list as shown under
DECLARE @InputData XML =<EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
;WITH EmailCTE AS
(
SELECT EmailColumn.value('.','varchar(100)') AS EmailData
FROM @InputData.nodes('/EmployeeEmails/Email/@Value') AS EmailTable(EmailColumn)
)
SELECT ',' + CAST(EmailData AS VARCHAR(MAX))
FROM EmailCTE
FOR XML PATH('')
/*
XML_F52E2B61-18A1-11d1-B105-00805F49916B
-------------------------------------------------------------------------
,FirstEmail@employee.com,SecondEmail@employee.com,ThirdEmail@employee.com
*/
We can make out that the data is not formatted properly as there is a comma in the before any data.Just to give a final touch to the solution, we can employ Stuff function inorder to extract the extra comma at the beginning.The full query with desired output is as under
DECLARE @InputData XML =<EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
;WITH EmailCTE AS
(
SELECT EmailColumn.value('.','varchar(100)') AS EmailData
FROM @InputData.nodes('/EmployeeEmails/Email/@Value') AS EmailTable(EmailColumn)
)
SELECT Emails = STUFF(EmailList,1,1,'') from
(
SELECT ',' + CAST(EmailData AS VARCHAR(MAX))
FROM EmailCTE
FOR XML PATH('')
)X(EmailList)
/*
Emails
-------------------------------------------------------------------------
FirstEmail@employee.com,SecondEmail@employee.com,ThirdEmail@employee.com
*/
Solution 2: Using Query function
Let us first execute the below code
DECLARE @InputData XML =<EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
SELECT @InputData.query('data(EmployeeEmails/Email/@Value)') As Emails
/*
Emails
------------------------------------------------------------------------
FirstEmail@employee.com SecondEmail@employee.com ThirdEmail@employee.com
*/
The above query returns a space separated list with the values stored in the XML fragment.To fulfill our objective we can use Replace function to replace the spaces with comma's. But the query() method returns the data of XML type which will hinder to do so.Henceforth, first we need to convert the XML data type to varchar and then we can use the Replace function.
The below query will,however, convert the XML result to string result
DECLARE @InputData XML = <EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
SELECT @InputData.query('data(EmployeeEmails/Email/@Value)').value('.','varchar(100)') As Emails
Finally, we can use the replace function to accomplish the goal
DECLARE @InputData XML = <EmployeeEmails>
<Email Value="FirstEmail@employee.com"></Email>
<Email Value="SecondEmail@employee.com"></Email>
<Email Value="ThirdEmail@employee.com"></Email>
</EmployeeEmails>
SELECT REPLACE
(
@InputData.query('data(EmployeeEmails/Email/@Value)').value('.','varchar(100)')
,' ' -- replace the white spaces
,',' -- replace the white spaces with commas
) AS Emails
/*
Emails
-------------------------------------------------------------------------
FirstEmail@employee.com,SecondEmail@employee.com,ThirdEmail@employee.com
*/
Conclusion
So in this short article, we have seen as how to parse the XML attribute in order to obtain the data by using Nodes() and Query() functions.Hope this will be helpful to others.Zipped file is attached.