Article | REF: H7158 V1

TEI (Text Encoding Initiative)

Author: François ROLE

Publication date: August 10, 1999, Review date: July 1, 2018


AUTHOR

  • François ROLE: Library curator - Research Fellow, University of Paris 8

 INTRODUCTION

Since ancient times, it has been common practice to mark and annotate texts in order to facilitate their study or criticism (think of medieval annotation systems, or the apparatus of symbols devised as early as the 3rd century BC by Alexandrian philologists).

In the digital world, electronic markup (defined here as the insertion into an electronic file of markings that are linked to, but not directly part of, the text) has long been used almost exclusively to drive printing or display devices (photocopiers, printers, screens). It is this markup that most researchers in the humanities and social sciences use implicitly (*) through commercial DTP tools.

Note:

(*) "Implicitly" in the sense that actions carried out with the keyboard or a pointing device generate, behind the scenes, the physical markup on which the DTP software relies to perform the operations requested of it.

Despite its merits, this markup is oriented towards text production or display, and is therefore not designed to facilitate the intellectual exploration of documents. The idea therefore gradually emerged of a level of markup that is less dependent on production constraints and better suited to higher-level processing of texts, because it describes their logical structure.

SGML (Standard Generalized Markup Language) is currently the most widely used standard for logically tagging texts. It allows any user to define a logical markup language adapted to their needs, by writing a DTD (Document Type Definition).
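As a rough illustration of what a document type constrains, the following Python sketch models a toy document type as a table of permitted child elements and checks two small documents against it. The element names (`poem`, `title`, `stanza`, `line`) and the helper `check` are hypothetical and are not taken from any real DTD. A real DTD is written in SGML declaration syntax and also constrains the order and repetition of elements, which this sketch ignores. The sample documents use XML syntax only because Python's standard library parses XML rather than SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "document type": each element name maps to the set of
# child elements it may contain. A real DTD also constrains order and
# repetition; this sketch only checks which children are allowed.
ALLOWED_CHILDREN = {
    "poem": {"title", "stanza"},
    "title": set(),
    "stanza": {"line"},
    "line": set(),
}

def check(element):
    """Return messages describing violations of the toy document type."""
    errors = []
    if element.tag not in ALLOWED_CHILDREN:
        errors.append(f"unknown element <{element.tag}>")
        return errors
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            # The child is permitted here, so check its own content in turn.
            errors.extend(check(child))
    return errors

valid = ET.fromstring(
    "<poem><title>Example</title><stanza><line>First line</line></stanza></poem>"
)
invalid = ET.fromstring("<poem><line>Stray line</line></poem>")

valid_errors = check(valid)
invalid_errors = check(invalid)
print(valid_errors)
print(invalid_errors)
```

The first document conforms to the toy type, while the second is reported as invalid because a `line` appears directly inside `poem` instead of inside a `stanza`.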

The Text Encoding Initiative (TEI) is an SGML DTD accompanied by a volume of recommendations, the TEI "Guidelines", which explain how the DTD should be used. This DTD is tailored primarily to the needs of the humanities and social sciences research community (or, more generally, to any researcher wishing to explore vast textual corpora in electronic form). It enables linguists to tag corpora syntactically, historians to mark dates, place names or characters in a text, literary researchers to study the stylistics or genesis of a text, and so on.
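To suggest how such logical tagging supports exploration of a text, here is a minimal Python sketch that extracts dates, place names and personal names from a tagged sentence. It rests on several assumptions: the sample sentence is invented for illustration, the element and attribute names (`date` with a `when` attribute, `placeName`, `persName`) follow later XML-based TEI conventions rather than the SGML DTD discussed here, and the fragment is written as XML because Python's standard library does not parse SGML.

```python
import xml.etree.ElementTree as ET

# Invented sample sentence with TEI-style logical markup (assumed element
# names: date/@when, placeName, persName).
SAMPLE = """
<p>On <date when="1789-07-14">14 July 1789</date>, crowds in
<placeName>Paris</placeName> stormed the Bastille; <persName>Louis XVI</persName>
was at <placeName>Versailles</placeName>.</p>
""".strip()

root = ET.fromstring(SAMPLE)

# Because the markup names what each span *is*, each query is a simple
# traversal by element name, in document order.
dates = [d.get("when") for d in root.iter("date")]
places = [p.text for p in root.iter("placeName")]
persons = [p.text for p in root.iter("persName")]

print("Dates:", dates)
print("Places:", places)
print("Persons:", persons)
```

With purely presentational markup (italics, font changes), none of these queries could be expressed directly, since nothing in the file would distinguish a place name from any other emphasized word.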

After some historical background and an informal presentation of the structure of a TEI text, we describe the mechanisms implemented in writing the TEI DTD (modularity, inheritance, extensibility).

The part on the DTD mechanisms is more technical than the others, and requires a good knowledge of SGML.

At the end of this article, we present a few examples of TEI tagging.

SGML concepts and techniques are described in the "SGML" article.

This article is included in

Digital documents and content management
