This is the master version of an original document.
Today, documents are usually prepared electronically using a word processor such as Word or OpenOffice. Such programs allow their users to make good-looking documents easily and quickly. However, there are problems associated with the multitude of different formats and programs used to produce documents. For instance:
OUCS has adopted an open, vendor-independent approach to maintain our documentation in an accessible and interchangeable format. Our system uses XML (eXtensible Markup Language) to store documents. XML allows users to develop their own rules for coding up their documents. However, many sets of XML rules are already available, so we do not need to develop anything new for OUCS. Our system uses a modified version of the Text Encoding Initiative (TEI) XML for writing documentation.
The Text Encoding Initiative (TEI) Guidelines are an international and interdisciplinary standard that enables libraries, museums, publishers, and individual scholars to represent a variety of literary and linguistic texts for online research, teaching, and preservation.
The TEI standard is maintained by a consortium of leading institutions and projects worldwide; Oxford is one of these institutions. Two of the major players in the TEI are members of OUCS: Lou Burnard and Sebastian Rahtz. Lou joined the Text Encoding Initiative project as its European Editor back in 1989 (a post he still holds), while Sebastian is one of the consortium's directors and actively develops the TEI itself.
Since 2002 it has been a legal requirement to provide documents (including web pages) in formats accessible to users of alternative technologies such as screen readers. The relevant legislation is the Special Educational Needs and Disability Act (SENDA) 2001, which is Part 4 of the Disability Discrimination Act (DDA). This act brought educational establishments into line with commercial providers in the way that they provide information and services to the disabled community.
The W3C organisation have created various standards for web accessibility. These are:
The following document includes details on how to make your XML documents accessible to as wide an audience as possible. Please make sure that you follow these accessibility guidelines - it's the LAW!
The rules of the TEI XML format are stored in a schema file (we use the RELAX NG schema language). This file defines how the XML is to be structured and is the key to transforming the text from one format to another. To write a valid TEI XML document, the schema has to be followed. Luckily, many XML editors look after the schema for you and show any errors when the document is tested against it.
An XSLT stylesheet (Extensible Stylesheet Language Transformations) is basically a set of rules for processing an XML document. It turns an XML rendition of a file into the final version of the file. OUCS uses two sets of XSLT files: one turns an XML file into a web page (HTML format); the other turns XML into PDF format for printing.
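To give a feel for what such a set of rules looks like, here is a minimal XSLT 1.0 sketch. It is not the OUCS stylesheet; it only assumes a TEI-style input in which the title sits in a titleStmt element and the text is a series of p elements inside body.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Rule 1: wrap the whole document in a basic HTML page -->
  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="//titleStmt/title"/></title>
      </head>
      <body>
        <xsl:apply-templates select="//body"/>
      </body>
    </html>
  </xsl:template>

  <!-- Rule 2: each TEI paragraph becomes an HTML paragraph -->
  <xsl:template match="p">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
```

Each template is one rule: it names the elements it matches and states what to output in their place.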
CSS or Cascading Style Sheets are files containing information on how a document is to be presented e.g. bold, red headings or grey backgrounds. There are two versions used by OUCS: one displays the XML file directly and is fairly simple; the other displays the final web page and is fairly complex.
Before submitting your file to Subversion, you should check your document's syntax. Most XML editors have facilities to check the validity of your document against your schema. Make any corrections necessary before submitting the file to the main Subversion repository. Also bear in mind that your document should be fully accessible and SENDA compliant.
For example, a title is opened by the start-tag <title> and is closed by the end-tag </title>. Any text between the start and end tags is therefore defined as the title of the document. Most XML tags work in this way: a start tag, some text, followed by an end tag. There are some elements that are self-closing (i.e. they have no end tag); where appropriate these will be highlighted later in this document.
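As a small illustration (the title text is only a placeholder), the first line below shows an ordinary element with a start tag, content and an end tag, and the second shows the self-closing <lb/> element inside a paragraph:

```xml
<title>Markup for OUCS documents</title>

<p>First line of text<lb/>second line of text</p>
```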
All elements can have additional properties besides the element name and content. These properties are the attributes of an element and they consist of name-value pairs. For example, a <div> element can have the attribute id="xxx", where xxx represents a name or number; a section about email might carry id="email".
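A minimal sketch of such an element follows; the heading and paragraph text are placeholders, and only the id attribute is the point of the example.

```xml
<!-- The attribute id="email" is a name-value pair on the start tag -->
<div id="email">
  <head>Email</head>
  <p>...</p>
</div>
```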
XML is very strict about its element structure, especially compared to HTML. In XML, tags usually have to be started and ended. They must be nested properly and used in the correct place within the document hierarchy. This generally means that you cannot open a new tag, e.g. <p>, without closing the previous one (N.B. there are exceptions to this rule, e.g. self-closing tags).
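The following sketch contrasts properly nested markup with a broken version. The element choice (<hi> for a highlighted phrase) and the text are illustrative only.

```xml
<!-- Well-formed: each paragraph is closed before the next one opens,
     and the hi element is closed inside the paragraph that contains it -->
<div>
  <p>First paragraph with a <hi>highlighted</hi> phrase.</p>
  <p>Second paragraph.<lb/>A new line within the same paragraph.</p>
</div>

<!-- Not well-formed (kept inside a comment so that this sample still parses):
     <p>First paragraph with a <hi>highlighted phrase.</p></hi>
     <p>Second paragraph.
-->
```

In the broken version the <hi> element is closed after its parent paragraph, and the second paragraph is never closed at all.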
Markup for OUCS documents. Sebastian Rahtz, May 2009. email@example.com. $LastChangedDate: 2013-12-06 13:15:07 +0000 (Fri, 06 Dec 2013) $ $LastChangedBy: rahtz $ $LastChangedRevision: 155137 $
First comes the declaration that the file is a TEI document: <TEI.2>. This is effectively the start tag for the document; all other elements must be correctly arranged or nested inside the <TEI.2> tags for the document to be valid TEI XML.
The first element inside <TEI.2> is the <teiHeader> element. Everything within this element is part of the document's metadata (metadata is data about the document, e.g. its title, author, creation date etc.). OUCS documents have a number of fields in the <teiHeader>; some have to be completed manually, such as the title of the document, while others are added automatically on document submission, such as the Last changed by information. Usually, when writing your own documents, you should complete the following metadata elements:
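As a rough sketch of the overall shape, the skeleton below follows the standard TEI header layout (fileDesc containing titleStmt, publicationStmt and sourceDesc). This is an assumption: the modified OUCS schema may require different or additional fields, and the publication statement text is a placeholder.

```xml
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- Completed by hand -->
        <title>Markup for OUCS documents</title>
        <author>Sebastian Rahtz</author>
      </titleStmt>
      <editionStmt>
        <edition><date>May 2009</date></edition>
      </editionStmt>
      <publicationStmt>
        <!-- Placeholder text, not taken from the OUCS template -->
        <p>Placeholder publication statement</p>
      </publicationStmt>
      <sourceDesc>
        <p>This is the master version of an original document.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>...</p>
    </body>
  </text>
</TEI.2>
```

Everything inside <teiHeader> is metadata; the document itself starts at <text>.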
Like HTML, XML relies on elements to code up the document. If you are familiar with coding HTML files the transition to XML should be fairly painless. OUCS XML has many elements available for use, although in any one document only a subset of these will ever be applied. In this section we discuss the elements making up the body of a text.
Your text may be just a series of paragraphs, or these paragraphs may be grouped together into chapters, sections, subsections, etc. In the former case, each paragraph is embedded inside a <p> element. In the latter case, the <body> may be divided into a series of <div> elements, which may be further subdivided. An example of div structure is shown below:
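A minimal sketch of nested divisions follows. The identifiers and headings are invented for illustration; the comments show the numbering each division would be expected to receive under the scheme described in this section.

```xml
<body>
  <div id="intro">
    <head>Introduction</head>        <!-- major division, numbered 1 -->
    <p>...</p>
  </div>
  <div id="markup">
    <head>Writing markup</head>      <!-- major division, numbered 2 -->
    <p>...</p>
    <div id="elements">
      <head>Elements</head>          <!-- subsection, numbered 2.1 -->
      <p>...</p>
    </div>
    <div id="attributes">
      <head>Attributes</head>        <!-- subsection, numbered 2.2 -->
      <p>...</p>
    </div>
  </div>
</body>
```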
Sectioning your document has important effects on the OUCS web site. Each div used is processed when the document is converted into HTML. Major divisions are treated as separate web pages and help to form the basis of the internal page navigation system. Each division is also sequentially numbered: 1, 2, 3 ... Where a div section is within another div, it is treated as a subsection and numbered accordingly, e.g. 2.1, 2.2, 2.3 ...
Sectioning documents also influences the HTML output to browsers. The title of a document is always given the <h1> tag, major divisions are thus given the <h2> tag and minor section divisions are given <h5> etc. depending on how deep they are nested within the document.
Correct structural markup for documentation is important for accessibility. When documents are marked up in a structured way, they allow users of alternative technologies to discover the main sections and subsections more quickly and more easily. The structure allows users to jump from one section to another, without the need to read all of the information on the page. Documents that do not use structured markup pose a problem (to screen reader users in particular), as it is very difficult to find out what is on a page without reading all of the text. Where structural markup has not been used, the author has often employed styles (bold, italic, etc.) to indicate different sections and headings. While obvious to sighted readers, the structure is lost to screen reader users who must read the page to find out if it is of interest to them.
A <div> may have a title or heading, and (less commonly) a closing such as ‘End of Chapter 1’. The following elements may be used to mark them up:
Highlighted words or phrases are those made visibly different from the rest of the text, typically by a change of type font, handwriting style, or ink color, intended to draw the reader's attention to them.
The <lb/> element marks the start of a new (typographic) line.
Explicit cross references or links from one point to another in the same XML document may be encoded using the elements described in the section Simple Cross References. References or links to elements of some other XML document, or to parts of non-XML documents, may be encoded using the TEI extended pointers described in the section Extended Pointers.
Accessibility of your links is important. The text you use can either enhance a user's understanding of where the link will lead, or leave them clueless. The worst phrase you can use for a link is Click Here or simply Here: in both instances the user is left with no clear idea of where the link could lead. This problem is compounded for a screen reader user: they can get lists of all links from any given page, but if the author of the page has just said Click Here or Here, they will get a list consisting of just that. The user will be left stranded on the page with no clear way to move forwards in their search for information.
An accessible link is one that conveys both where the link will go and the information the user is likely to find. By default our system will add a title attribute to any link you make on your page when it is transformed into HTML. However, while this is good practice and a nice failsafe measure, it will only add the same text as the link text. This might be adequate in some circumstances, but to make your links more accessible you should add your own additional text using the n attribute. People browsing with modern visual browsers will see your additional link information when they mouse over your link, and screen reader users will have more information about where the link will take them as the title attribute is read out to them.
The difference between these two elements is that <ptr> is an empty element, simply marking a point from which a link is to be made, whereas <ref> can contain some text as well, typically the text of the cross-reference itself. The <ptr> element would be used for a cross reference which is indicated by a symbol or icon, or in an electronic text by a button.
The value of the target attribute must be an identifier that is present in the current XML document. This implies that the passage or phrase being pointed at must bear an identifier, and must therefore be tagged as an element of some kind. For instance, a cross reference may point to a <div> element that carries an id.
The id attribute is global (i.e. can be used on any element), which means all elements in a document can be pointed to in this way. A paragraph, for instance, may be given an identifier so that it can be pointed at from elsewhere in the document.
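A sketch of both cases follows. The identifiers links and linkpara, and the wording of the n attribute, are invented for illustration; the n attribute is used here as described earlier, to supply additional link text.

```xml
<!-- A cross reference to a whole division, with extra link text in n -->
<p>...see especially <ref target="links" n="Section explaining how to make links">the section on links</ref>...</p>

<!-- A cross reference to a single paragraph, first with text, then as a bare pointer -->
<p>...this is discussed in <ref target="linkpara">the paragraph on links</ref>
  (<ptr target="linkpara"/>)...</p>

<!-- The targets: each carries the id that the target attributes refer to -->
<div id="links">
  <head>Links</head>
  <p id="linkpara">Links may be made to any kind of element ...</p>
</div>
```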
Sometimes the target of a cross reference does not correspond with any particular feature of a text, and so may not be tagged as an element of some kind. If the desired target is simply a point in the current document, the easiest way to mark it is by placing an <anchor> element at the appropriate spot.
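For instance (a sketch with an invented identifier, midpoint), an anchor can mark a spot in the middle of a paragraph so that a reference elsewhere can point at it:

```xml
<p>The first point ends here.<anchor id="midpoint"/> A new thought starts
  without any element boundary.</p>

<p>...as noted <ref target="midpoint">earlier</ref>...</p>
```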
The elements <ptr> and <ref> can only be used for cross-references whose targets occur within the same XML document as their source. They can also refer only to XML elements. The elements discussed in this section are not restricted in these ways.
In addition to the attributes already discussed in the section Simple Cross References above, these elements share the following additional attribute, which is used to specify the target of the cross reference or link:
See local information about email clients or go to faults, problems, or special requests
The <address> element is used to mark a postal address of any kind. It contains one or more <addrLine> elements, one for each line of the address.
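A minimal sketch, using a made-up address:

```xml
<address>
  <addrLine>Example Department</addrLine>
  <addrLine>1 Example Street</addrLine>
  <addrLine>Exampletown</addrLine>
  <addrLine>EX1 1EX</addrLine>
</address>
```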
The <list> element is used to mark any kind of list. A list is a sequence of text items, which may be ordered, unordered, or a glossary list. Each item may be preceded by an item label (in a glossary list, this label is the term being defined):
Individual list items are tagged with <item>. The first <item> may optionally be preceded by a <head>, which gives a heading for the list. The numbering of a list may be omitted (if reconstructible), indicated using the n attribute on each item, or (rarely) tagged as content using the <label> element. In order to achieve the same result with different browsers, the value of n should be greater than 0.
The list examples cover the following kinds, each with a heading naming the kind: an unordered list; an ordered list; an ordered list with controlled numbering; an ordered list with letters for labels; an ordered list with controlled lettering. Each contains the same three items: First item in list, Second item in list, Third item in list.
A glossary list:
|One||First item in list|
|Two||Second item in list|
|Three||Third item in list|
A glossary list with the heading Vocabulary:
|nu||now|
|lhude||loudly|
|bloweth||blooms|
|med||meadow|
|wude||wood|
|awe||ewe|
|lhouth||lows|
|sterteth||bounds, frisks|
|verteth||pedit|
|murie||merrily|
|swik||cease|
|naver||never|
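The markup behind lists of this kind can be sketched as follows. The type values (ordered, gloss) are assumptions based on common TEI usage and may differ in the modified OUCS schema; the n values 4 to 6 are invented to show controlled numbering.

```xml
<!-- An ordered list: numbering is left to the stylesheet -->
<list type="ordered">
  <head>An ordered list</head>
  <item>First item in list</item>
  <item>Second item in list</item>
  <item>Third item in list</item>
</list>

<!-- Controlled numbering: n gives the number for each item (greater than 0) -->
<list type="ordered">
  <head>An ordered list with controlled numbering</head>
  <item n="4">First item in list</item>
  <item n="5">Second item in list</item>
  <item n="6">Third item in list</item>
</list>

<!-- A glossary list: each label is the term, each item its definition -->
<list type="gloss">
  <head>A glossary list</head>
  <label>One</label>   <item>First item in list</item>
  <label>Two</label>   <item>Second item in list</item>
  <label>Three</label> <item>Third item in list</item>
</list>
```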
Caution is advised when using tables as it is very easy to make them inaccessible to users of alternative technologies e.g. screen readers. It is your responsibility to make sure that any table used is comprehensible when it is linearised and that it contains suitable accessibility attributes.
Screen readers linearise tables when they are reading the content out to the user. This means that if you have failed to take this into account when designing your table, the screen reader user will not understand the content of your table. To check to see how your table will be read out, go to http://wave.webaim.org/. Run your page containing the table through this online checker. It will show you how the table will be read to a screen reader user.
All tables should be given the summary attribute regardless of whether they are for data or page layout. For data tables a short summary of the table content must be added for accessibility. Where a table is used for layout, the summary attribute is included, but left empty.
Example table, with the summary "table shows the rise and fall of mortality figures during the plague years":
| ||1||2||3|
|St. Leonard's, Shoreditch||64||84||119|
|St. Botolph's, Bishopsgate||65||105||116|
|St. Giles's, Cripplegate||213||421||554|
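The plague-years table could be marked up along these lines. This is a sketch only: the use of role="label" to mark the heading row and the row-label cells is an assumption about how the original example was coded.

```xml
<!-- Data table: the summary describes the content for screen reader users -->
<table summary="table shows the rise and fall of mortality figures during the plague years">
  <row role="label">
    <cell/>
    <cell>1</cell>
    <cell>2</cell>
    <cell>3</cell>
  </row>
  <row>
    <cell role="label">St. Leonard's, Shoreditch</cell>
    <cell>64</cell>
    <cell>84</cell>
    <cell>119</cell>
  </row>
  <row>
    <cell role="label">St. Botolph's, Bishopsgate</cell>
    <cell>65</cell>
    <cell>105</cell>
    <cell>116</cell>
  </row>
  <row>
    <cell role="label">St. Giles's, Cripplegate</cell>
    <cell>213</cell>
    <cell>421</cell>
    <cell>554</cell>
  </row>
</table>

<!-- Layout table: the summary attribute is present but empty -->
<table summary="">
  <row>
    <cell>...</cell>
    <cell>...</cell>
  </row>
</table>
```

Read row by row, left to right, the data table still makes sense when linearised: each parish name is followed directly by its three figures.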
If a <table> element has a rend attribute with the appropriate value, the table will be rendered with the cells of the first column sorted and with buttons on each column that enable the person viewing the page to sort the table on another column.
|First Name||Last Name||Age||Total||Discount||Difference||Date and time||ISO||UK 1||UK 2|
|Peter||Parker||28||£9.99||20.9%||+12.1||Sep 9, 2002 8:14 AM||2002-09-09||09-09-2002||09/09/2002|
|John||Good||33||£19.99||125%||+12||Jan 12, 2003 5:14 AM||2003-01-12||12-01-2003||12/01/2003|
|Clark||Kent||18||£15.89||44%||-26||Jan 18, 2001 11:14 AM||2001-01-18||18-01-2001||18/01/2001|
|Bruce||Almighty||45||£153.19||44.7%||+77||Sep 10, 2002 9:12 AM||2002-09-10||10-09-2002||10/09/2002|
|Bruce||Evans||22||£13.19||11%||-100.9||Sep 1, 2002 9:12 AM||2002-09-01||01-09-2002||01/09/2002|
There are two ways in which the use of tablesorter can be customised. You will also find the documentation for tablesorter useful.
has the following definition for the template
In the XSL for the micro site, you define a template that overrides this.
Not all the components of a document are necessarily textual. The most straightforward text will often contain diagrams or illustrations, to say nothing of documents in which image and text are inextricably intertwined, or electronic resources in which the two are complementary. This poses accessibility issues for users who cannot see the images. What are they? Are they important to the text, or just page decoration? Is the image a graph or simple picture? Has the author provided extra information about the graphic for those that cannot see it? If you do not provide alternative text for graphics or other accessibility features in the page coding, the page will be inaccessible to some visitors.
Usually, a graphic will have at the least an identifying title, which should be encoded using the <head> element. Images which are given a head tag have this text automatically converted to a figure caption and are numbered sequentially throughout the document. It is also essential to include a brief description of the image using <figDesc>. If the image is difficult to describe in just a few words, you should provide an alternative page where a full account of the image can be given to the user: this extra information should be provided via a [d] link. These are normal URL links to normal web pages. By convention the [d] link should be provided next to the image in question; users needing greater detail about a given image will click on the [d] link for more information.
If the image is for decoration only (very rare on OUCS pages), it is still necessary to include a <figDesc> element in your document, but in this case it should be left blank. By convention the image is then considered just page decoration and unimportant to the reader.
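The two cases can be sketched as below. The <figure> wrapper and its url attribute are assumptions (the attribute naming the image file may differ in the OUCS schema), and the file names and texts are invented.

```xml
<!-- An informative image: caption in head, short description in figDesc -->
<figure url="images/network-diagram.png">
  <head>Layout of the example network</head>
  <figDesc>Diagram showing three computers connected to one central switch</figDesc>
</figure>

<!-- A purely decorative image: figDesc is present but left blank -->
<figure url="images/divider.png">
  <figDesc></figDesc>
</figure>
```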
If you want to control the way text flows around an image, use a rend value, as described in the Rends section.
A newsfeed can be displayed by putting a
The url attribute has the URL of the newsfeed.
Our XSL can cope with newsfeeds written in RSS 2.0, RSS 1.0 and Atom 1.0.
Gotcha: the web page will not change when new items get added to the feed unless you arrange for your page not to be cached by AxKit. Please contact firstname.lastname@example.org to get this done.
Here the rend attribute has a component that starts with
This is followed by some notation (e.g.,
that indicates how you want the date formatted.
It uses the same notation that is used by
PHP for its date function
with the addition of one character:
_ means generate a space.
You can then use an HTML element (e.g., the
by prefixing its name with the namespace
Here is another example:
var GCS_due_date = "";
If you want some HTML elements to appear in the
element of the HTML that gets generated,
you should put these elements between the
(that appear in the
Suppose you do wish to add an HTML
The rules of HTML say the
must finish up in the
<head> element of the resulting HTML.
So to achieve this, use something like:
It is possible to provide a form (in a TEI file) that collects some data from a user and sends that data to someone in an e-mail message. There are details about this in a document on FormMail.
Accessibility is paramount, both so that our documentation can be used by all readers and so that OUCS stays on the correct side of the law. All OUCS authors need to familiarise themselves with the ways and means of making their documents as accessible as possible.
Describe each image using the <figDesc> element. If necessary, go the extra step and make a [d] link for longer explanations of figures.
When an index or table of contents is to be encoded (rather than one being generated) for some reason, the <list> element discussed in the section Lists should be used.
Rend values can be used to define how an element is rendered on the webpage, for example aligning items to the left or right of a page, allowing text to flow around images or stating that a bit of text should be in italics or red. Some of the more common rends available for use with the OUCS webpages are listed below.
If there is a particular style you need on your pages that is not currently available, please contact email@example.com for help.
<row rend="label"> makes the background light blue, whereas <row role="label"> makes the background grey and the text white and centred.
|&quot;||"||double quotation mark|
Any other characters which are not on your keyboard can be entered either as numeric entities (see, e.g., http://www.tedmontgomery.com/tutorial/HTMLchrc.html) or directly in UTF-8. How you enter UTF-8 on your keyboard depends on your application or operating system. oXygen, for example, has a facility Edit/Enter from Character Map to let you enter characters which are not on the keyboard.
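The two approaches are equivalent, as this small sketch shows (the sentence is invented; &#233; is the numeric entity for é and &#163; for £):

```xml
<!-- Using numeric entities -->
<p>The caf&#233; charges &#163;2 for tea.</p>

<!-- Using the characters directly, in a file saved as UTF-8 -->
<p>The café charges £2 for tea.</p>
```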