Ways to store special character in XMl data type in SQL Server

Niladri.Biswas
Posted by in Sql Server category on for Beginner level | Points: 250 | Views : 52313 red flag
Rating: 4 out of 5  
 1 vote(s)

This article will demonstrate the way to store special character like less than(<), greater than (>), ampersand(&) in XML data type


 Download source code for Ways to store special character in XMl data type in SQL Server

Introduction

Recently, I came across a situation where we need to store some emails with special characters (e.g. ampersand (&)) into an xml column.In this article we will see some of the approaches of handling the situation

Environment setup

Run the below script

DECLARE @T TABLE (OldEmailAddress VARCHAR(50))

--Declare 5 input variables
DECLARE @SimpleEmail VARCHAR(50) = 'mail2dotnetfunda@dotnetfunda.com'
DECLARE @EmailWith_Ampersand_SpecialCharacter VARCHAR(50) = 'mail2&.netfunda@.netfunda.com'
DECLARE @EmailWith_LessThan_SpecialCharacter VARCHAR(50) = 'mail2<.netfunda@.netfunda.com'
DECLARE @EmailWith_GreaterThan_SpecialCharacter VARCHAR(50) = 'mail2>.netfunda@.netfunda.com'
DECLARE @EmailWith_AllThe3_SpecialCharacter VARCHAR(50) = 'mail2<.net>funda@&.netfunda.com'

--Insert the records into the table
INSERT INTO @T
VALUES(@SimpleEmail)
,(@EmailWith_Ampersand_SpecialCharacter)
,(@EmailWith_LessThan_SpecialCharacter)
,(@EmailWith_GreaterThan_SpecialCharacter)
,(@EmailWith_AllThe3_SpecialCharacter)

--Project the records
SELECT *
FROM @T

Output

What is expected?

We need to create another column say "NewEmailAddress" which will be of type XML and need to import the data from "OldEmailAddress" column in a special format

The format is as under

<EmailCollections>
<Email>[EMAIl ADDRESS CONTENT FROM OldEmailAddress column]</Email>
</EmailCollections>

What happens if we just create another column and try to import the data?

Let us try to do so

DECLARE @T TABLE (OldEmailAddress VARCHAR(50),NewEmailAddress XML)

--Declare 5 input variables
DECLARE @SimpleEmail VARCHAR(50) = 'mail2dotnetfunda@dotnetfunda.com'
DECLARE @EmailWith_Ampersand_SpecialCharacter VARCHAR(50) = 'mail2&.netfunda@.netfunda.com'
DECLARE @EmailWith_LessThan_SpecialCharacter VARCHAR(50) = 'mail2<.netfunda@.netfunda.com'
DECLARE @EmailWith_GreaterThan_SpecialCharacter VARCHAR(50) = 'mail2>.netfunda@.netfunda.com'
DECLARE @EmailWith_AllThe3_SpecialCharacter VARCHAR(50) = 'mail2<.net>funda@&.netfunda.com'

--Try to insert the records into the table
INSERT INTO @T
VALUES(@SimpleEmail,@SimpleEmail)
,(@EmailWith_Ampersand_SpecialCharacter,@EmailWith_Ampersand_SpecialCharacter)
,(@EmailWith_LessThan_SpecialCharacter,@EmailWith_LessThan_SpecialCharacter)
,(@EmailWith_GreaterThan_SpecialCharacter,@EmailWith_GreaterThan_SpecialCharacter)
,(@EmailWith_AllThe3_SpecialCharacter,@EmailWith_AllThe3_SpecialCharacter)

--Project the records
SELECT *
FROM @T

It will throw the below error

Msg 9421, Level 16, State 1, Line 8

XML parsing: line 1, character 7, illegal name character

Why is the error?

It is very obvious. XML has some "special" characters and they need to be escaped or encoded to prevent being misinterpreted.

So what is the solution?

Well, there are indeed solutions for this problem and we will look into them here

Solution 1:Use the Replace function to replace the special characters

DECLARE @T TABLE (OldEmailAddress VARCHAR(50),NewEmailAddress XML)

--Declare 5 input variables
DECLARE @SimpleEmail VARCHAR(50) = 'mail2dotnetfunda@dotnetfunda.com'
DECLARE @EmailWith_Ampersand_SpecialCharacter VARCHAR(50) = 'mail2&.netfunda@.netfunda.com'
DECLARE @EmailWith_LessThan_SpecialCharacter VARCHAR(50) = 'mail2<.netfunda@.netfunda.com'
DECLARE @EmailWith_GreaterThan_SpecialCharacter VARCHAR(50) = 'mail2>.netfunda@.netfunda.com'
DECLARE @EmailWith_AllThe3_SpecialCharacter VARCHAR(50) = 'mail2<.net>funda@&.netfunda.com'

--Try to insert the records into the table
INSERT INTO @T
VALUES(@SimpleEmail,CAST('<EmailCollections><Email>' + @SimpleEmail + '</Email></EmailCollections>' AS XML))
,(@EmailWith_Ampersand_SpecialCharacter,CAST('<EmailCollections><Email>' + REPLACE(@EmailWith_Ampersand_SpecialCharacter, '&', '&') + '</Email></EmailCollections>' AS XML))
,(@EmailWith_LessThan_SpecialCharacter,CAST('<EmailCollections><Email>' + REPLACE(@EmailWith_LessThan_SpecialCharacter, '<', '<') + '</Email></EmailCollections>' AS XML))
,(@EmailWith_GreaterThan_SpecialCharacter,CAST('<EmailCollections><Email>' + REPLACE(@EmailWith_GreaterThan_SpecialCharacter, '>', '>') + '</Email></EmailCollections>' AS XML))
,(@EmailWith_AllThe3_SpecialCharacter,CAST('<EmailCollections><Email>' + REPLACE(REPLACE(REPLACE(@EmailWith_AllThe3_SpecialCharacter, '<', '<'),'>','>'),'&', '&') + '</Email></EmailCollections>' AS XML))

--Project the records
SELECT *
FROM @T

Output

As we can make out that, the REPLACE function though does the job but is not so clean.Specially consider the last case where we used 3 Replace function!!!

.

A slightly cleaner approach would be to create a function that will remove the noises.The function will look as under

CREATE FUNCTION [dbo].[RemoveNoise] 
(
-- Add the parameters for the function here
@Input AS Varchar(100)
)
RETURNS VARCHAR(100)
AS
BEGIN
--Create a noise table
DECLARE @Noise TABLE(Noise VARCHAR(100),ReplaceChars VARCHAR(10))
INSERT INTO @Noise(Noise,ReplaceChars)
SELECT '<','&lt;' UNION ALL SELECT '>', '&gt;' UNION ALL
SELECT '&','"&"amp;'

-- Return the result of the function
SELECT @Input = REPLACE(@Input, Noise, ReplaceChars) FROM @Noise

RETURN @Input
END
GO

We can use this function as under

INSERT INTO @T  
VALUES(@SimpleEmail,CAST('<EmailCollections><Email>' + dbo.RemoveNoise(@SimpleEmail) + '</Email></EmailCollections>' AS XML))
,(@EmailWith_Ampersand_SpecialCharacter,CAST('<EmailCollections><Email>' + dbo.RemoveNoise(@EmailWith_Ampersand_SpecialCharacter) + '</Email></EmailCollections>' AS XML))
,(@EmailWith_LessThan_SpecialCharacter,CAST('<EmailCollections><Email>' + dbo.RemoveNoise(@EmailWith_LessThan_SpecialCharacter) + '</Email></EmailCollections>' AS XML))
,(@EmailWith_GreaterThan_SpecialCharacter,CAST('<EmailCollections><Email>' + dbo.RemoveNoise(@EmailWith_GreaterThan_SpecialCharacter) + '</Email></EmailCollections>' AS XML))
,(@EmailWith_AllThe3_SpecialCharacter,CAST('<EmailCollections><Email>' + dbo.RemoveNoise(@EmailWith_AllThe3_SpecialCharacter) + '</Email></EmailCollections>' AS XML))

Solution 2:Use CDATA

DECLARE @T TABLE (OldEmailAddress VARCHAR(50),NewEmailAddress XML)

--Declare 5 input variables
DECLARE @SimpleEmail VARCHAR(50) = 'mail2dotnetfunda@dotnetfunda.com'
DECLARE @EmailWith_Ampersand_SpecialCharacter VARCHAR(50) = 'mail2&.netfunda@.netfunda.com'
DECLARE @EmailWith_LessThan_SpecialCharacter VARCHAR(50) = 'mail2<.netfunda@.netfunda.com'
DECLARE @EmailWith_GreaterThan_SpecialCharacter VARCHAR(50) = 'mail2>.netfunda@.netfunda.com'
DECLARE @EmailWith_AllThe3_SpecialCharacter VARCHAR(50) = 'mail2>.net<funda@&.netfunda.com'

--Try to insert the records into the table
INSERT INTO @T
VALUES(@SimpleEmail,CAST('<Emails><Email><![CDATA[' + + @SimpleEmail + ']]></Email></Emails>' AS XML))
,(@EmailWith_Ampersand_SpecialCharacter,CAST('<Emails><Email><![CDATA[' + + @EmailWith_Ampersand_SpecialCharacter + ']]></Email></Emails>' AS XML))
,(@EmailWith_LessThan_SpecialCharacter,CAST('<Emails><Email><![CDATA[' + + @EmailWith_LessThan_SpecialCharacter + ']]></Email></Emails>' AS XML))
,(@EmailWith_GreaterThan_SpecialCharacter,CAST('<Emails><Email><![CDATA[' + + @EmailWith_GreaterThan_SpecialCharacter + ']]></Email></Emails>' AS XML))
,(@EmailWith_AllThe3_SpecialCharacter,CAST('<Emails><Email><![CDATA[' + + @EmailWith_AllThe3_SpecialCharacter + ']]></Email></Emails>' AS XML))

--Project the records
SELECT *
FROM @T

The output remains same.As can be seen that it is quite handy to use CData in this situation since no Replacement is needed.

Conslusion

So in this article, we have seen as how to handle special characters while storing them into XML data type.Hope this will be useful.Zip file is attached

Page copy protected against web site content infringement by Copyscape

About the Author

Niladri.Biswas
Full Name: Niladri Biswas
Member Level: Platinum
Member Status: Member
Member Since: 10/25/2010 11:04:24 AM
Country: India
Best Regards, Niladri Biswas
http://www.dotnetfunda.com
Technical Lead at HCL Technologies

Login to vote for this post.

Comments or Responses

Login to post response

Comment using Facebook(Author doesn't get notification)