Document Type Definition | |
---|---|
Filename extension | .dtd |
A Document Type Definition (DTD) provides a grammar for a class of documents.[1] A document may reference an external DTD, contain an internal DTD, or both. A DTD is one kind of XML schema language; XSD is another.
Syntax
Elements
An element is valid in a document only if the DTD declares it with the !ELEMENT syntax. A declaration starts with "<!ELEMENT", followed by the name of the element and a description of its allowed content, and ends with ">".
The content description is either a list of the element's possible content in parentheses or a content-type keyword. Example:
<!-- A book element is valid, and it must have
the elements author and content as children -->
<!ELEMENT book (author,content)>
An example of XML that satisfies this declaration:
<book>
  <author />
  <content />
</book>
- Parsed character data
Elements with only text as content are specified by placing the keyword #PCDATA in the parentheses. Example:
<!ELEMENT content (#PCDATA)>
An example of XML that satisfies this declaration:
<content>Lorem ipsum dolor sit amet &amp; consectetuer</content>
- Empty elements
Elements that cannot have any content are specified using the keyword EMPTY. Because EMPTY is a content-type keyword rather than a content list, it is not placed in parentheses. Example:
<!ELEMENT br EMPTY>
<!-- Means that a br element is valid, and must be empty,
e.g. <br /> or <br style="..." /> -->
- Any content
Elements that can contain any mix of text and declared elements are specified using the keyword ANY. Example:
<!ELEMENT p ANY>
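As a small sketch of what ANY permits, the document below mixes text with a child element. The b element is a hypothetical addition for illustration; in a valid document, any child that appears inside an ANY element still needs its own declaration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE p [
  <!-- p accepts text and any declared element, in any order -->
  <!ELEMENT p ANY>
  <!-- b is hypothetical; it is declared so that it may appear inside p -->
  <!ELEMENT b (#PCDATA)>
]>
<p>Lorem <b>ipsum</b> dolor sit amet</p>
```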
- Elements with children
Elements that have children must declare each child's name in parentheses, comma-separated; the children must then appear in the declared order. Refer to the first example.
- Children - Exactly one instance of a child
A child may be declared as occurring in the document exactly once by listing its name as-is, with no suffix.
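A minimal sketch of this rule, using hypothetical element names (letter, sender, body) that are not taken from any real DTD:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE letter [
  <!-- sender and body carry no suffix, so each must occur exactly once, in this order -->
  <!ELEMENT letter (sender, body)>
  <!ELEMENT sender (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<letter>
  <sender>Alice</sender>
  <body>Lorem ipsum</body>
</letter>
```

Leaving out body, or repeating sender, would make the document invalid against this DTD.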
- Children - Optional child
A child may be declared as occurring in the document zero or one time by appending a question mark ("?") to its name. Example:
<!-- Here, glossary and index are optional elements -->
<!ELEMENT book (author, content, glossary?, index?)>
<!-- ... declarations of other elements ... -->
<!ELEMENT index (#PCDATA)>
- Children - At least one is required
A child may be declared as occurring in the document one or more times by appending a plus sign ("+") to its name. Example:
<!ELEMENT tbody (tr+)>
- Children - Any number of occurrences
A child may be declared as occurring in the document zero or more times by appending an asterisk ("*") to its name. Example:
<!ELEMENT tr (td*)>
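The plus and asterisk suffixes can be seen together in the following sketch, which combines the tbody and tr declarations. The declaration of td as text-only is an assumption made for the example.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE tbody [
  <!-- at least one tr is required -->
  <!ELEMENT tbody (tr+)>
  <!-- a tr may hold any number of td elements, including none -->
  <!ELEMENT tr (td*)>
  <!-- assumed for this example: td holds only text -->
  <!ELEMENT td (#PCDATA)>
]>
<tbody>
  <tr><td>a</td><td>b</td></tr>
  <tr></tr>
</tbody>
```

A tbody with no tr at all would be invalid, while the second, empty tr is allowed.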
- Options and grouping
Children may be grouped using parentheses. Options may be declared using a vertical bar ("|"), which separates the options that the content of the element may take. Example:
<!-- means that the key element may either have one n1 element or one n2 element. -->
<!ELEMENT key (n1|n2)>
<!-- means that the gpoint element must have
a title element and either point elements or an xref element. -->
<!ELEMENT gpoint (title,(point*|xref))>
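To illustrate which instances the gpoint content model accepts, here is a sketch of a complete document. The guide wrapper element and the content declared for title, point and xref are assumptions added so that several gpoint elements can be shown side by side.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE guide [
  <!-- guide is a hypothetical wrapper element -->
  <!ELEMENT guide (gpoint+)>
  <!ELEMENT gpoint (title,(point*|xref))>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT point (#PCDATA)>
  <!ELEMENT xref EMPTY>
]>
<guide>
  <!-- first alternative: a title followed by point elements -->
  <gpoint>
    <title>Setup</title>
    <point>Install</point>
    <point>Configure</point>
  </gpoint>
  <!-- second alternative: a title followed by exactly one xref -->
  <gpoint>
    <title>See also</title>
    <xref/>
  </gpoint>
  <!-- point* also matches zero points, so a title alone is valid -->
  <gpoint>
    <title>Placeholder</title>
  </gpoint>
</guide>
```

A gpoint containing both point and xref children would be rejected, because the vertical bar allows only one of the two alternatives.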
Attributes
Attributes are defined using the !ATTLIST syntax. The syntax is:
<!ATTLIST <element name> <attribute name> <attribute type> <default value>>
The element name is the element to which the ATTLIST declaration applies. The attribute name is the specific attribute being declared. (Note: an undeclared attribute found in an element does not make the document ill-formed, but a validating parser reports it as a validity error.)
The attribute type may be one of the following:
- CDATA (character data) - the attribute is composed of just text.
- list of possible values for the attribute, vertical bar separated, in parentheses - specifies legal values for the attribute.
- ID - the value must be an XML name that is unique among all ID values in the document.
- IDREF - the attribute is a reference to the ID of another element.
- IDREFS - the attribute is a whitespace-separated list of references to other IDs.
- xml:someidentifier - strictly a reserved attribute name rather than a type; names such as xml:lang and xml:space have a meaning predefined by the XML specification.
- ENTITY - the attribute names an unparsed entity declared in the DTD.
- ENTITIES - the attribute is a whitespace-separated list of unparsed entity names.
- NMTOKEN - the attribute is a name token: like an XML name, but with no restriction on its first character.
- NMTOKENS - the attribute is a whitespace-separated list of name tokens.
- NOTATION - the attribute names a notation declared in the DTD.
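A sketch of several of these types in use follows. The library, book and loan elements and their attribute names are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE library [
  <!ELEMENT library (book+, loan*)>
  <!ELEMENT book (#PCDATA)>
  <!ELEMENT loan EMPTY>
  <!-- ID: must be an XML name and unique within the document -->
  <!ATTLIST book id ID #REQUIRED>
  <!-- enumerated list: only these three values are legal -->
  <!ATTLIST book format (hardcover|paperback|ebook) #IMPLIED>
  <!-- NMTOKEN: a name token, which may start with a digit -->
  <!ATTLIST book shelf NMTOKEN #IMPLIED>
  <!-- IDREF: must match the ID of some element in the document -->
  <!ATTLIST loan book IDREF #REQUIRED>
  <!-- CDATA: free text -->
  <!ATTLIST loan borrower CDATA #REQUIRED>
]>
<library>
  <book id="b1" format="paperback" shelf="3A">Lorem ipsum</book>
  <book id="b2">Dolor sit amet</book>
  <loan book="b1" borrower="Jane Doe"/>
</library>
```

Changing the loan's book attribute to a value that no element uses as an ID, such as "b9", would make the document invalid.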
The default value may be one of the following:
- value - specifies a default value to be used when the attribute is not set in the element.
- #REQUIRED - specifies that the attribute must be set.
- #IMPLIED - specifies that the attribute is optional; it may either be set or not.
- #FIXED value - specifies that the attribute always has the specified value; if it is set in the element, it must be set to exactly that value.
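The four kinds of default can be compared in one sketch. The message element and its attribute names are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!-- default value: a parser that reads the DTD supplies lang="en" when lang is omitted -->
  <!ATTLIST message lang CDATA "en">
  <!-- #REQUIRED: every message must set id -->
  <!ATTLIST message id ID #REQUIRED>
  <!-- #IMPLIED: priority may be omitted, and no default is supplied -->
  <!ATTLIST message priority (low|high) #IMPLIED>
  <!-- #FIXED: version is always "1.0"; any other value is invalid -->
  <!ATTLIST message version CDATA #FIXED "1.0">
]>
<message id="m1">Hello</message>
```

For this document, a parser that reads the DTD would report lang="en" and version="1.0" on the message element even though neither is written in the start tag.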
Entities
An entity is a shortcut to a string that is substituted for it when the document is parsed. An entity reference is composed of an ampersand, a character combination (the name of the entity) and a semicolon. For example, the entity reference &lt; is a shortcut to the less-than sign, "<".
Entities are declared by the !ENTITY syntax. The syntax is <!ENTITY <entity name> "<entity value>">. Example:
<!ENTITY excl "!">
For example, an XML document that has the text
Hello World&excl;
would be rendered as:
Hello World!
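Put together as a complete document, the excl entity might be declared and used as follows. The greeting root element is a hypothetical name chosen for this sketch.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
  <!-- declares the entity excl with the replacement text "!" -->
  <!ENTITY excl "!">
]>
<!-- the parser replaces the reference with the entity's value -->
<greeting>Hello World&excl;</greeting>
```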
An entity's content can also be loaded from an external resource by using the SYSTEM keyword: <!ENTITY <entity name> SYSTEM "<URI of the external resource>">
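A sketch of an entity declared with SYSTEM is shown below. The notice element and the URI are hypothetical, and the example assumes that the referenced resource contains plain text that is allowed inside notice.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE notice [
  <!ELEMENT notice (#PCDATA)>
  <!-- hypothetical URI; its content becomes the replacement text of legal -->
  <!ENTITY legal SYSTEM "http://example.com/legal.txt">
]>
<notice>&legal;</notice>
```

Non-validating parsers are not required to fetch external entities, and many disable this by default for security reasons, so the reference may be left unexpanded depending on the parser's configuration.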
Document type
A DTD can be declared internally in a document using the !DOCTYPE syntax. The syntax for internal DOCTYPE is: <!DOCTYPE <root element name> [ ...DTD syntax... ]>. Example:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book [
  <!ELEMENT book (isbn, chapter+, glossary?, index?)>
  <!ELEMENT isbn (#PCDATA)>
  <!ELEMENT chapter (title, content+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT content (#PCDATA)>
  <!ELEMENT glossary (item+)>
  <!ELEMENT index (item+)>
  <!ELEMENT item (#PCDATA)>
  <!-- ID values must be XML names, so they cannot start with a digit -->
  <!ATTLIST item ref ID #IMPLIED>
]>
<book>
  <isbn>1001234</isbn>
  <chapter>
    <title>
      1
    </title>
    <content>
      Lorem ipsum
    </content>
  </chapter>
  <index>
    <item ref="i65">abcd</item>
  </index>
</book>
A DTD in an external file (usually with a .dtd extension) is referenced through a system identifier, optionally preceded by a public identifier. Public identifiers take the form of a URI-like string, intended to be unique and to be used across many applications. System identifiers take the form of a URI, which gives the location of the document type definition.
An external DTD may be referenced using either of the following DOCTYPE forms: <!DOCTYPE <root element name> SYSTEM "<system identifier>"> or <!DOCTYPE <root element name> PUBLIC "<public identifier>" "<system identifier>">. In XML the system identifier is always required, and the SYSTEM keyword is omitted when PUBLIC is used.
For example, the XML above may also be written as:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book SYSTEM "http://example.com/chml.dtd">
<book>
  <isbn>1001234</isbn>
  <chapter>
    <title>
      1
    </title>
    <content>
      Lorem ipsum
    </content>
  </chapter>
  <index>
    <item ref="i65">abcd</item>
  </index>
</book>
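As a sketch of what the external file named by the system identifier might contain, assume it carries the declarations of the internal example plus a declaration for title, which the chapter content model requires. An external DTD file holds only the declarations, without the <!DOCTYPE ... [ ]> wrapper:

```xml
<!-- assumed contents of http://example.com/chml.dtd -->
<!ELEMENT book (isbn, chapter+, glossary?, index?)>
<!ELEMENT isbn (#PCDATA)>
<!ELEMENT chapter (title, content+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT content (#PCDATA)>
<!ELEMENT glossary (item+)>
<!ELEMENT index (item+)>
<!ELEMENT item (#PCDATA)>
<!ATTLIST item ref ID #IMPLIED>
```

The same DTD could also be named with a public identifier. In XML the public identifier is followed directly by the system identifier, with no SYSTEM keyword. The public identifier used here is hypothetical:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- "-//Example//DTD Book 1.0//EN" is a made-up public identifier -->
<!DOCTYPE book PUBLIC "-//Example//DTD Book 1.0//EN" "http://example.com/chml.dtd">
<book>
  <isbn>1001234</isbn>
  <chapter>
    <title>1</title>
    <content>Lorem ipsum</content>
  </chapter>
</book>
```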
Notes
- DTD examples are not necessarily copied from actual standard DTDs that may exist.
References
- [1] "Extensible Markup Language (XML) 1.0 (Fifth Edition)" (W3C) - definition of DTD