Article | REF: H7248 V1

Documentary databases

Authors: Claude CHRISMENT, Jacques LE MAITRE, Florence SÈDES

Publication date: May 10, 2000

AUTHORS

  • Claude CHRISMENT: Doctor of Science - Professor of Computer Science at Toulouse III Paul-Sabatier University

  • Jacques LE MAITRE: Qualified to direct research - Professor of Computer Science at the University of Toulon and Var

  • Florence SÈDES: Qualified to direct research - Senior lecturer in computer science at Toulouse II University

INTRODUCTION

Document applications are built on a storage function, which must be integrated with other functions that support exploration, partial reuse of stored document content, and sometimes restructuring. Examples include all IT applications concerned with the testing, integration and maintenance of structured objects assembled from components, whether in software engineering (software components), the space industry (satellite components), aeronautics (aircraft components), and so on. Typically, components are described in specification manuals, which have to be reused and adapted during integration, testing and maintenance activities. The problems raised by the multiplicity of heterogeneous data sources have become even more acute with the rise of the Web. Integration tools and models are needed to provide an abstract, synthetic view and to make these large volumes of data accessible and easy to manipulate.

The implementation of such electronic document management systems generally requires database management systems to perform the interdependent functions of storing and accessing information. Electronic documents are generally accessed and searched in three different ways. The first, used mainly for textual data, consists in searching for a string (more generally, a pattern) in a text: this is the approach of information retrieval systems, which implement "full-text" indexing and textual matching mechanisms. The second relies on a priori knowledge of a complete structure defined on the data being manipulated: this is the approach of database management systems, where it is implemented through the database schema and a query language based on a finite set of operators. The third implements browsing and navigation mechanisms over weakly structured information: this is the approach of hypertext systems and, in particular, of the Web. Any electronic document management system must support all three approaches.
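The three access modes described above can be illustrated with a minimal sketch. It is not taken from the article: the toy collection `DOCS`, its field names, and the functions `search_pattern`, `query_by_schema` and `navigate` are all hypothetical, and a real system would rely on a proper full-text index rather than a linear scan.

```python
import re
import sqlite3

# Hypothetical toy collection (assumption): text, structured fields, and links.
DOCS = {
    "spec-1": {"title": "Antenna module", "kind": "specification",
               "text": "The antenna module is tested before satellite integration.",
               "links": ["spec-2"]},
    "spec-2": {"title": "Power unit", "kind": "specification",
               "text": "The power unit feeds the antenna and the onboard computer.",
               "links": ["note-1"]},
    "note-1": {"title": "Maintenance note", "kind": "note",
               "text": "Replace the power unit fuse during maintenance.",
               "links": []},
}


def search_pattern(pattern):
    """Mode 1: search for a pattern in the text of each document."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sorted(doc_id for doc_id, d in DOCS.items() if rx.search(d["text"]))


def query_by_schema(kind):
    """Mode 2: query structured fields through a fixed schema, here with SQL."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE doc (id TEXT PRIMARY KEY, title TEXT, kind TEXT)")
    con.executemany(
        "INSERT INTO doc VALUES (?, ?, ?)",
        [(doc_id, d["title"], d["kind"]) for doc_id, d in DOCS.items()],
    )
    rows = con.execute(
        "SELECT id FROM doc WHERE kind = ? ORDER BY id", (kind,)
    ).fetchall()
    con.close()
    return [row[0] for row in rows]


def navigate(start):
    """Mode 3: follow links depth-first from a starting document."""
    seen, stack = [], [start]
    while stack:
        doc_id = stack.pop()
        if doc_id in seen:
            continue
        seen.append(doc_id)
        # Reverse so that links are visited in their listed order.
        stack.extend(reversed(DOCS[doc_id]["links"]))
    return seen
```

In this sketch, `search_pattern(r"power\s+unit")` needs no knowledge of structure, `query_by_schema("specification")` depends entirely on the declared schema, and `navigate("spec-1")` uses only the links between documents.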

The concept of document is associated with that of semi-structured information, which is characterized by a total or partial absence of structure (ranging from completely unstructured to partially structured information) and by its heterogeneity: multiplicity of formats, formalisms, structures, types, media, etc. Documents are stored in a warehouse, or document base, which can be queried and manipulated using indexing, filtering and retrieval operators. The model of any document base must be generic, scalable, and independent of both the granularity of document units and representation standards.
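As a rough illustration of a document base that does not depend on the granularity of its units, the following sketch stores units of any level in a tree and exposes indexing, retrieval and filtering operators. The names `Unit` and `DocumentBase`, the metadata keys, and the sample manual are assumptions made for the example, not the model presented in the article.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A document unit of any granularity (manual, chapter, paragraph, ...)."""
    uid: str
    text: str = ""
    meta: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def walk(self):
        """Yield this unit, then all of its descendants."""
        yield self
        for child in self.children:
            yield from child.walk()


class DocumentBase:
    """Hypothetical document base with indexing, retrieval and filtering."""

    def __init__(self):
        self.units = {}
        self.index = defaultdict(set)  # word -> ids of units containing it

    def add(self, root):
        """Indexing operator: register a unit and all its sub-units."""
        for unit in root.walk():
            self.units[unit.uid] = unit
            for word in unit.text.lower().split():
                self.index[word.strip(".,;:")].add(unit.uid)

    def retrieve(self, word):
        """Retrieval operator: ids of the units whose text contains the word."""
        return sorted(self.index.get(word.lower(), set()))

    def filter(self, **criteria):
        """Filtering operator: ids of the units whose metadata match."""
        return sorted(
            uid for uid, unit in self.units.items()
            if all(unit.meta.get(k) == v for k, v in criteria.items())
        )


# Sample content (assumption): one manual containing two chapters.
manual = Unit("m1", "Satellite manual", {"level": "manual"}, [
    Unit("m1.c1", "Antenna integration tests.", {"level": "chapter"}),
    Unit("m1.c2", "Power unit maintenance.",
         {"level": "chapter", "format": "SGML"}),
])
base = DocumentBase()
base.add(manual)
```

Because metadata is an open dictionary and every unit is handled the same way whatever its level, the same operators apply to a whole manual or to a single chapter, and units with different formats can coexist in the base.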

The first part of this article presents...

This article is included in

Digital documents and content management
