4. Resources for automatic text processing
NLP systems rely on several types of resources:
textual resources: textual data, known as corpora, are used to build test benches, to train learning systems, to extract lexical data, etc.;
lexical resources: lexicons form the core of the linguistic information used by a system. They vary in nature depending on the application, and incorporate more or less complex information, from simple word lists to structured semantic resources. Given the cost involved in building a lexicon for a given application, the trend is towards reusability and automatic acquisition of lexical data;
software resources: lemmatizers, segmenters and labelers are the basic building blocks of text processing. The complexity of an NLP application calls for the reusability of existing components; this...
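The three resource types listed above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-sentence corpus, the hand-written lexicon entries, the tag names and the helper functions `segment`, `analyze` and `lemma_frequencies` are hypothetical and do not come from any tool cited in the article. A real system would rely on an existing segmenter, lemmatizer and labeler rather than on a dictionary lookup.

```python
import re
from collections import Counter

# Textual resource: a toy corpus (a real one would be read from files).
CORPUS = [
    "The cats sleep.",
    "A cat sleeps on the mat.",
]

# Lexical resource: surface form -> (lemma, part-of-speech tag).
# Entries and tag names are hypothetical, chosen for this example only.
LEXICON = {
    "the": ("the", "DET"),
    "a": ("a", "DET"),
    "cat": ("cat", "NOUN"),
    "cats": ("cat", "NOUN"),
    "sleep": ("sleep", "VERB"),
    "sleeps": ("sleep", "VERB"),
    "on": ("on", "PREP"),
    "mat": ("mat", "NOUN"),
}


def segment(text):
    """Software resource 1: a crude segmenter returning lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def analyze(tokens, lexicon):
    """Software resources 2 and 3: lemmatize and label by lexicon lookup.

    Unknown forms keep their surface form as lemma and receive the tag "UNK".
    """
    return [(tok, *lexicon.get(tok, (tok, "UNK"))) for tok in tokens]


def lemma_frequencies(corpus, lexicon):
    """Extract lexical data from the corpus: count occurrences of each lemma."""
    counts = Counter()
    for sentence in corpus:
        for _form, lemma, _tag in analyze(segment(sentence), lexicon):
            counts[lemma] += 1
    return counts


if __name__ == "__main__":
    for sentence in CORPUS:
        print(analyze(segment(sentence), LEXICON))
    print(lemma_frequencies(CORPUS, LEXICON).most_common())
```

Keeping the lexicon as plain data, separate from the functions that consult it, reflects the reusability concern raised above: the same components can be run with a different lexicon or corpus without modification.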
This article is included in
Digital documents and content management
Resources for automatic text processing
Bibliography
Software tools
References for tools and resources cited in the article:
TreeTagger: morpho-syntactic labeling and lemmatization. University of Stuttgart http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/
Websites
Portals on language technologies and language resources
CNRTL Centre National de Ressources Textuelles et Lexicales http://www.cnrtl.fr/
ELRA – European Language Resources Association http://www.elra.info/
Language Technology World, DFKI...
Standards and norms
TEI (Text Encoding Initiative): consortium created in 1987 to produce recommendations for standardizing the encoding of digital documents http://www.tei-c.org
Events
TALN (Traitement Automatique des Langues Naturelles): international French-language conference organized annually since 1994 by ATALA (Association pour le Traitement Automatique des Langues).
Directory
Associations (non-exhaustive list)
ACL Association for Computational Linguistics http://www.aclweb.org/
ATALA Association pour le Traitement Automatique des Langues http://www.atala.org/
APIL Association des Professionnels des Industries de la...