XML is a document format for passing data from one program to another. An XML file can store any amount of data in an organized manner and can be transferred from one place to another over a network, which makes the data portable. The data is held as the content of user-defined tags and as the values of their attributes.
1.We can extend the language by defining our own tags.
2.XML is a cross-platform markup language, independent of software and hardware.
3.It is used to store data in a format that any other computer system can interpret.
4.It is used to transfer structured data between heterogeneous systems.
5.It is used to retrieve data.
6.It was developed by the W3C (World Wide Web Consortium).
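As a rough illustration of storing data with user-defined tags and reading it back in another program, here is a minimal Python sketch using the standard library module xml.etree.ElementTree. The tag names and the sample values ("Raj", "s001") are made up for the example.

```python
import xml.etree.ElementTree as ET

# Producer side: build a small document with user-defined tags.
# The tag names and values are hypothetical sample data.
student = ET.Element("student")
ET.SubElement(student, "studentid").text = "s001"
ET.SubElement(student, "name").text = "Raj"
xml_text = ET.tostring(student, encoding="unicode")

# Consumer side: parse the text and retrieve the data.
received = ET.fromstring(xml_text)
student_id = received.find("studentid").text
name = received.find("name").text
```

The string in xml_text is what would travel over the network or be saved to a file; the receiving program needs only an XML parser to read it.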
Possible contents of an XML document
1.PI (processing instructions)
PI:- This is optional. It specifies the XML version and the character encoding. It starts with <? and ends with ?>.
Ex:- <?xml version="1.0" encoding="UTF-8"?>
If present, this must be the first statement in the XML document.
UTF:- UCS (Universal Character Set) Transformation Format
UTF-8:- It encodes each character in one to four 8-bit units. English text needs one byte per character.
UTF-16:- It encodes each character in one or two 16-bit units. Most Japanese characters need one 16-bit unit.
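The difference between the two encodings can be seen by counting the bytes each one needs for a single character. This is a small Python sketch; encoded_length is a helper name invented for the example, and "utf-16-le" is used so that no byte order mark is counted.

```python
def encoded_length(char, encoding):
    """Return the number of bytes needed to encode one character."""
    return len(char.encode(encoding))

latin_utf8 = encoded_length("A", "utf-8")        # English letter in UTF-8
kanji_utf8 = encoded_length("日", "utf-8")        # Japanese character in UTF-8
latin_utf16 = encoded_length("A", "utf-16-le")   # English letter in UTF-16
kanji_utf16 = encoded_length("日", "utf-16-le")   # Japanese character in UTF-16
```

Here the English letter takes 1 byte in UTF-8 and 2 bytes in UTF-16, while the Japanese character takes 3 bytes in UTF-8 and 2 bytes in UTF-16.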
Tags:- They mark the start and the end of an element. A start tag is the element name in angle brackets, such as <name>, and the matching end tag adds a slash, such as </name>.
Elements:- They identify the type of data and contain the data. A start tag, its content and its end tag together form an element.
Ex:- <name>Raj</name> Here name is the element.
Content:- This is the information held by the element.
Ex:- <name>Raj</name> Here Raj is the content.
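The distinction between the element name and its content can be checked with a parser. A minimal Python sketch using xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

element = ET.fromstring("<name>Raj</name>")
tag_name = element.tag    # the element name taken from the tags
content = element.text    # the text between the start and end tags
```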
It is of three types they are
Attributes:- They provide additional information about an element. An attribute can be optional or mandatory. An optional attribute can have a default value.
Ex:- <prodname prodid="p001">Mobile phones</prodname> Here prodid is the attribute.
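A short Python sketch of reading an attribute with xml.etree.ElementTree. The color attribute and its "unknown" fallback are hypothetical; they only show what happens when an optional attribute is absent.

```python
import xml.etree.ElementTree as ET

product = ET.fromstring('<prodname prodid="p001">Mobile phones</prodname>')
prodid = product.get("prodid")            # attribute present in the document
color = product.get("color", "unknown")   # absent attribute, fallback supplied by the caller
```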
Entities:- An entity is a name that stands for a special symbol or a piece of text. It is written between & and ; and provides a shortcut to a set of information.
Ex:- <prodprice>The price of this product is &gt;3000</prodprice>
Here &gt; represents the greater-than sign (>).
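The parser replaces an entity reference with the symbol it stands for. The Python sketch below shows this for &gt; and also shows the reverse step with xml.sax.saxutils.escape, which turns special symbols into entity references before text is written into a document. The sample string passed to escape is made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing: the entity reference &gt; becomes the > symbol.
element = ET.fromstring(
    "<prodprice>The price of this product is &gt;3000</prodprice>"
)
text = element.text

# Writing: special symbols are turned back into entity references.
escaped = escape("a < b & c > d")
```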
Comments:- These are remarks used to explain the code. A comment starts with <!-- and ends with -->.
<!--This is the information about various products-->
If an XML document satisfies all the predefined rules, it is known as a well-formed XML document. The predefined rules are as follows:
1.All elements must be enclosed in a single root element.
2.The case of the letters in the start tag and the end tag must match.
3.Every start tag must have a matching end tag.
4.All attribute values must be enclosed in quotation marks (" or ').
<?xml version="1.0" ?>
<friend type="bestfriend" sex="male">
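The rules above can be checked by handing a document to a parser, which rejects anything that is not well-formed. This is a minimal Python sketch; is_well_formed is a helper name chosen for the example, and the sample documents are made up around the friend element.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = is_well_formed('<friend type="bestfriend" sex="male"></friend>')
two_roots = is_well_formed("<friend></friend><friend></friend>")   # breaks rule 1
case_mismatch = is_well_formed("<friend></Friend>")                # breaks rule 2
no_end_tag = is_well_formed('<friend type="bestfriend">')          # breaks rule 3
unquoted = is_well_formed("<friend type=bestfriend></friend>")     # breaks rule 4
```

Only the first document is accepted; each of the others violates one of the four rules.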
DTD (Document Type Definition)
1.It defines the structure of the content of an XML document.
2.In a DTD we declare the elements and attributes that may appear. This is similar to creating the columns of a table.
3.It is similar to designing a table in a database.
Elements:- An element declaration defines a field that can appear in the XML document.
Syntax:- <!ELEMENT elementname (content type or content model)>
Element name:- It specifies the name of the element.
Content type or content model:- It determines whether the element contains character data, other elements, or nothing.
Rules for naming elements
1.A name can contain letters, digits, underscores (_), hyphens (-) and periods (.).
2.It must start with a letter or an underscore.
3.It cannot start with a digit, a hyphen or a period.
4.Other symbols, such as spaces, are not allowed. The colon is reserved for namespaces.
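The naming rules above can be sketched as a regular expression in Python. This is a simplification that assumes ASCII names only; the XML specification also permits many non-ASCII letters. NAME_PATTERN and is_valid_name are names invented for the example.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, underscore, period or hyphen.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_valid_name(name):
    """Return True if the whole name follows the simplified rules."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_name("student_id") is accepted, while is_valid_name("1student") and is_valid_name("student id") are rejected.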
Types of Elements
Empty:- It contains no content.
Syntax:- <!ELEMENT elementname EMPTY>
Unrestricted:- It can contain any type of data.
Syntax:- <!ELEMENT elementname ANY>
Container:- It contains other elements, which in turn hold character data or further elements.
Syntax:- <!ELEMENT student (studentid,address)>
<!ELEMENT studentid (#PCDATA)>
Content type:- It specifies the type of content an element can hold when it has no sub-elements.
EMPTY:- The element has no content and does not need a separate end tag.
#PCDATA:- The element contains text between its start and end tags, and the parser processes that text (for example, it expands entities).
CDATA:- Character data that is not parsed. In a DTD, CDATA is used as an attribute type; it cannot be used as the content type of an element.
Occurrence type:- It specifies how often a sub-element may appear.
?:- The sub-element is optional and can appear at most once.
*:- The sub-element is optional and can appear any number of times.
+:- The sub-element must appear at least once and can appear more than once.
No symbol:- It must appear once and only once.
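The occurrence indicators of a DTD (?, *, + or no symbol) amount to a minimum and a maximum count for a sub-element. The Python sketch below is only an illustration of that idea, not a DTD validator; OCCURRENCE_BOUNDS and occurrence_ok are names invented for the example.

```python
# Map each indicator to (minimum, maximum); None means no upper limit.
OCCURRENCE_BOUNDS = {
    "": (1, 1),      # no symbol: exactly once
    "?": (0, 1),     # optional, at most once
    "*": (0, None),  # optional, any number of times
    "+": (1, None),  # at least once
}

def occurrence_ok(indicator, count):
    """Return True if a sub-element appearing `count` times satisfies the indicator."""
    minimum, maximum = OCCURRENCE_BOUNDS[indicator]
    return count >= minimum and (maximum is None or count <= maximum)
```

For a declaration such as (studentid, address+), a student with no address fails the check, because occurrence_ok("+", 0) is False.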
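An attribute declaration provides additional information about the element.
Syntax:- <!ATTLIST elementname attributename valuetype attributetype>
The attribute type is #REQUIRED, #IMPLIED, #FIXED followed by a quoted value, or simply a quoted default value.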
Value type is divided into three types
CDATA:- It specifies that the value is character data.
ID:- It specifies that the value must be unique within the document.
Enumerated:- It specifies a list of allowed values for the attribute.
<!ATTLIST Product Proid ID #REQUIRED Color (blue|green) "blue">
Color (blue|green):- Enumerated
#REQUIRED:- The attribute is compulsory; every element must supply a value.
#FIXED:- The attribute has a fixed value that cannot be changed.
#IMPLIED:- The attribute is optional.
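To make the attribute rules concrete, here is a hand-written Python sketch of what they mean for the Product element, assuming Proid is declared as a required ID and Color as an enumerated attribute (blue or green) with the default "blue". It is an illustration only, not how a DTD validator is implemented; resolve_product_attributes and the constants are hypothetical names.

```python
ALLOWED_COLORS = ("blue", "green")   # the enumerated values
DEFAULT_COLOR = "blue"               # the declared default

def resolve_product_attributes(attributes, seen_ids):
    """Return the effective attributes, or raise ValueError if a rule is broken."""
    proid = attributes.get("Proid")
    if proid is None:
        raise ValueError("Proid is required")
    if proid in seen_ids:
        raise ValueError("Proid must be unique in the document")
    seen_ids.add(proid)
    # A missing Color falls back to the default value.
    color = attributes.get("Color", DEFAULT_COLOR)
    if color not in ALLOWED_COLORS:
        raise ValueError("Color must be one of the enumerated values")
    return {"Proid": proid, "Color": color}
```

The seen_ids set is shared across all Product elements of one document, so a second element that repeats the same Proid is rejected.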
Types of DTD
There are two types: internal and external.
Internal DTD
1.It is part of the XML document.
2.It applies only to the XML document in which it is present.
External DTD
1.It is not part of the XML document; it is stored in a separate file.
2.It can be used across multiple documents.
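The two forms look different in the document itself: an internal DTD is written inside the DOCTYPE declaration, while an external DTD is only referenced by file name. The Python sketch below reads both forms with xml.dom.minidom. The file name student.dtd is hypothetical, and this parser does not load the file or validate the document against the DTD; it only reports what the DOCTYPE declaration says.

```python
from xml.dom import minidom

# Internal DTD: the declarations sit inside the document.
internal_doc = minidom.parseString(
    "<!DOCTYPE student [<!ELEMENT student (#PCDATA)>]>"
    "<student>Raj</student>"
)

# External DTD: the document only points to a separate file.
external_doc = minidom.parseString(
    '<!DOCTYPE student SYSTEM "student.dtd">'
    "<student>Raj</student>"
)

internal_subset = internal_doc.doctype.internalSubset   # text of the internal declarations
external_subset = external_doc.doctype.internalSubset   # None: nothing declared inline
external_system_id = external_doc.doctype.systemId      # the referenced file name
```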