4. Defining Web Document Types


  1. Document Types
  2. Document Type Definitions (DTDs)
  3. Valid XML
  4. Valid XHTML
  5. DTD syntax
  6. Examples of DTD element declarations
  7. DTD syntax
  8. DTD for RSS
  9. Validation of XML Documents
  10. Referencing a DTD
  11. Declaring an Internal DTD
  12. Declaring an External DTD (1)
  13. Declaring an External DTD (2)
  14. Attributes
  15. Some Attribute Types
  16. Attribute Defaults
  17. Mixed Content Models
  18. Some exercises
  19. Family example DTD
  20. Family example: XML document
  21. A valid family
  22. An invalid family
  23. Entities
  24. Example using Entities
  25. General Entities
  26. Parameter Entities
  27. Limitations of DTDs
  28. Exercises
  29. Links to more information

4.1. Document Types

4.2. Document Type Definitions (DTDs)

4.3. Valid XML

Figure: an XML parser checking that a document is valid.

4.4. Valid XHTML

Figure: an XHTML parser checking that a document is valid.

4.5. DTD syntax

4.6. Examples of DTD element declarations

4.7. DTD syntax

DTD syntax   Meaning
b            element b must occur exactly once
b,c          both b and c must occur, in the order specified
b|c          one (and only one) of b or c must occur
b*           b may occur zero or more times
b+           b must occur one or more times
b?           b is optional (zero or one occurrence)
EMPTY        no content is allowed (neither child elements nor text)
ANY          any content (of declared elements and text) is allowed
#PCDATA      content is text rather than an element

#PCDATA stands for "parsed character data", meaning an XML parser should parse the characters to resolve character and entity references.
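
As an illustration of the operators in the table, the following sketch declares a small document type. All element names here (recipe, ingredient, step and so on) are hypothetical and chosen only to exercise each piece of syntax.

```xml
<!-- Hypothetical declarations; none of these names come from the course DTDs. -->

<!-- sequence (,), one-or-more (+), zero-or-more (*), optional (?), choice (|) -->
<!ELEMENT recipe      (title, ingredient+, step*, note?, (photo | sketch))>

<!ELEMENT title       (#PCDATA)>   <!-- text only -->
<!ELEMENT ingredient  (#PCDATA)>
<!ELEMENT step        (#PCDATA)>
<!ELEMENT note        ANY>         <!-- any mix of declared elements and text -->
<!ELEMENT photo       EMPTY>       <!-- must have no content -->
<!ELEMENT sketch      EMPTY>
```

Under these declarations, a recipe containing a title, one ingredient and an empty photo element would be valid, while a recipe with no ingredient, or with both a photo and a sketch, would not.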

4.8. DTD for RSS

4.9. Validation of XML Documents

4.10. Referencing a DTD

4.11. Declaring an Internal DTD

<?xml version="1.0"?>
<!DOCTYPE rss [
    <!-- all declarations for rss DTD go here -->
    ...
    <!ELEMENT rss ... >
    ...
]>
<rss>
   <!-- This is an instance of a document of type rss -->
   ...
</rss>
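
For comparison, here is a minimal complete document with an internal DTD. It uses a hypothetical document type called note rather than rss, since the rss declarations are elided above.

```xml
<?xml version="1.0"?>
<!-- Hypothetical "note" document type, used only to show the overall shape. -->
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Students</to>
  <body>The lecture starts at ten.</body>
</note>
```

The name after DOCTYPE must match the root element, which is why both are note here.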

4.12. Declaring an External DTD (1)

<?xml version="1.0"?>
<!DOCTYPE rss SYSTEM "rss.dtd">
<rss>
   <!-- This is an instance of a document of type rss -->
   ...
</rss>

4.13. Declaring an External DTD (2)

<?xml version="1.0"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
     "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math>
   <!-- This is an instance of a mathML document type -->
   ...
</math>

Formal public identifiers are meant for widely used entities and should be unique worldwide. Processing software might either come with such entities already installed or know the most efficient sites from which to download them. If not, the system identifier (the URI that follows the public identifier) is used to retrieve the DTD.
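
One common way for software to map a public identifier to an installed copy is an OASIS XML catalog. The sketch below assumes a processor that has been configured to consult such a catalog; the local path in the uri attribute is hypothetical.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- The uri value is a hypothetical local path; point it at wherever the DTD is stored. -->
  <public publicId="-//W3C//DTD MathML 2.0//EN"
          uri="dtds/mathml2.dtd"/>
</catalog>
```

Whether a catalog is used at all depends on the parser and how it is set up; without one, the DTD is fetched via the system identifier.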

4.14. Attributes

4.15. Some Attribute Types

4.16. Attribute Defaults

4.17. Mixed Content Models

4.18. Some exercises

4.19. Family example DTD

<!ELEMENT family (parent, (parent)?, (child)*)>
<!ELEMENT parent (name)>
<!ELEMENT child  (name)>
<!ELEMENT name   (#PCDATA)>

<!ATTLIST parent
  pno     ID               #IMPLIED
  role    (mother|father)  #IMPLIED
  spouse  IDREF            #IMPLIED>

<!ATTLIST  child
  cno           ID      #IMPLIED
  date-of-birth CDATA   #IMPLIED
  siblings      IDREFS  #IMPLIED>
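
Every attribute above is declared #IMPLIED, that is, optional with no default. As a sketch of the other default kinds, here is a hypothetical alternative to the child attribute list; the status and schema attributes and their values are invented for illustration.

```xml
<!-- Hypothetical replacement for the child ATTLIST above, not part of the family DTD. -->
<!ATTLIST child
  cno           ID             #REQUIRED
  date-of-birth CDATA          #IMPLIED
  status        (minor|adult)  "minor"
  schema        CDATA          #FIXED "family-1">
<!-- #REQUIRED:     the attribute must be present on every child            -->
<!-- #IMPLIED:      the attribute may be omitted; no default is supplied     -->
<!-- "minor":       default value used when status is omitted                -->
<!-- #FIXED "...":  if the attribute is present, it must have this value     -->
```

Note that an attribute of type ID may only be declared #IMPLIED or #REQUIRED.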

4.20. Family example: XML document

<?xml version="1.0"?>
<!-- <!DOCTYPE family [ ... DTD goes here ... ]> -->
<family>
  <parent pno="p1" role="mother" spouse="p2">
    <name>Janet</name>
  </parent>
  <parent pno="p2" role="father" spouse="p1">
    <name>John</name>
  </parent>
  <child cno="c1" siblings="c2 c3">
    <name>Tom</name>
  </child>
  <child cno="c2" siblings="c1 c3">
    <name>Dick</name>
  </child>
  <child cno="c3" siblings="c1 c2">
    <name>Harry</name>
  </child>
</family>

4.21. A valid family

<?xml version="1.0"?>
<!-- <!DOCTYPE family [ ... DTD goes here ... ]> -->
<family>
  <parent pno="janet">
    <name>Janet</name>
  </parent>
  <child date-of-birth="yesterday">
    <name>Tom</name>
  </child>
</family>

4.22. An invalid family

<family>
  <parent role="stepmother" spouse="john jim">
    <name>Janet</name>
  </parent>
  <parent pno="john" spouse="janet"></parent>
  <parent pno="jim" spouse="janet"></parent>
</family>
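
Checked against the family DTD of section 4.19, the document above is invalid for several reasons: it has three parent elements where at most two are allowed; role="stepmother" is not one of the enumerated values; spouse is an IDREF, so it cannot hold the two names "john jim"; the second and third parents lack the required name child; and spouse="janet" refers to an ID that no element declares. One possible repair is sketched below. It assumes that Janet's role is mother and that the third parent can simply be dropped; the name text for the second parent is also an assumption.

```xml
<family>
  <!-- role must be mother or father; spouse (IDREF) holds exactly one ID -->
  <parent pno="janet" role="mother" spouse="john">
    <name>Janet</name>
  </parent>
  <!-- every parent needs a name child; spouse now matches an existing ID -->
  <parent pno="john" spouse="janet">
    <name>John</name>
  </parent>
  <!-- the third parent (jim) is omitted: family allows at most two parents -->
</family>
```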

4.23. Entities

4.24. Example using Entities

<!DOCTYPE xmas [
<!ENTITY  on        "On the">
<!ENTITY  day       "day of Christmas my true love sent to me">
<!ENTITY  partridge "<line>a partridge in a pear tree.</line>">
<!ENTITY  doves     "<line>two turtle doves and</line> &partridge;">
<!ENTITY  hens      "<line>three French hens,</line> &doves;">
<!ELEMENT xmas      (verse+)>
<!ELEMENT verse     (line+)>
<!ELEMENT line      (#PCDATA)>
]>
<xmas>
  <verse><line>&on; first  &day;</line> &partridge;</verse>
  <verse><line>&on; second &day;</line> &doves;</verse>
  <verse><line>&on; third  &day;</line> &hens;</verse>
</xmas>
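
To see how the nested entity references resolve, here is the third verse as the parser sees it after expansion (line breaks and indentation added for readability): &hens; expands to a line followed by &doves;, which in turn expands to a line followed by &partridge;.

```xml
<verse>
  <line>On the third day of Christmas my true love sent to me</line>
  <line>three French hens,</line>
  <line>two turtle doves and</line>
  <line>a partridge in a pear tree.</line>
</verse>
```

The expanded verse contains four line elements, so it satisfies the content model (line+).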

4.25. General Entities

4.26. Parameter Entities

4.27. Limitations of DTDs

4.28. Exercises

  1. Write an XML DTD which will define the following structure for documents of type exam. An exam has a course code, a title and a date, which comprises only the month and year. These are followed by a list of questions. Exams consist of either 5 or 6 questions. Each question has one or more parts. Parts of questions can themselves comprise parts along with text.

    Give an instance of an exam document which is valid with respect to your DTD and two instances which are invalid, explaining why they are invalid. Check your answers using an on-line XML validator.


  2. Write an XML DTD for representing information about students on an MSc programme. All information should be represented using elements rather than attributes. The root element of the document is programme. A programme has a degree and a year. These elements are followed by the results for the programme. The results are partitioned into distinction, merit, pass and fail. Within each is a sequence of name elements, each containing the name of a person having achieved the corresponding result for the programme.


  3. Consider a relational database containing a relation teaches with attributes course and lecturer, representing the relationship between courses taught on an MSc programme and the lecturers who teach them. Give an XML DTD for representing this information.

4.29. Links to more information

DTDs are covered in Chapter 4 of [Moller and Schwartzbach] and briefly in Chapter 2 of [Jacobs].