Overview
ABSTRACT
This article provides an overview of text-to-speech (TTS) synthesis, whose goal is to automatically compute the speech signal corresponding to a given text. It describes the stages needed to build such a system, including recent techniques such as those based on hidden Markov models. It also discusses the applications of speech synthesis and the main commercial offerings in this domain.
AUTHORS
- Christophe D'ALESSANDRO: Research Director, LIMSI-CNRS, Orsay, France
- Gaël RICHARD: Professor, Institut Mines-Télécom, Télécom ParisTech, CNRS-LTCI, Paris, France
This article is an updated version of the 2003 article of the same title by Gaël Richard and Olivier Cappé.
INTRODUCTION
The aim of text-to-speech (TTS) synthesis is to automatically compute the speech signal corresponding to a given text. The text itself can come from a variety of sources: newspapers, books, voice response systems, dialogue or automatic translation systems (interactive terminals, personal assistants), information system databases, video games, e-mails, SMS, documents browsed on the web, or simply text typed on a computer keyboard.
Voice response in its simplest form can be a set of pre-recorded messages (or "prompts"). Text-to-speech synthesis is more ambitious: it automatically computes the sound samples corresponding to any written utterance, which is not known in advance and may be long.
The two sides of speech synthesis are, on the one hand, text analysis and interpretation, and on the other, prediction of the acoustic-phonetic parameters of the sound and signal synthesis itself:
Text analysis: the first stage in transforming text into speech involves the ability to analyze and understand the written text, its nuances and connotations, the speech situation and the speech act to be performed. In addition to the text, the context can be specified (speaking style, emotion, attitude, character type, specific voice...);
Signal synthesis: once the text has been analyzed, the aim is to calculate the acoustic signal that best interprets the linguistic content, with a voice that sounds as natural as possible, resembling a particular speaker, and with the nuances of attitude and even emotion that the text calls for. In addition to the audio signal, the synthesizer can provide instructions for synchronizing the lip movements of an avatar or video character, or the movements of a robot.
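The two stages described above can be illustrated with a deliberately simplified sketch. It is not the method of any real synthesizer: the function names (analyze_text, synthesize), the digit table, the sample rate and the letter-to-tone mapping are all assumptions chosen for illustration. A real front end performs far deeper linguistic analysis, and a real back end predicts acoustic parameters rather than emitting fixed tones.

```python
import math
import re

SAMPLE_RATE = 16_000  # samples per second (assumed value for this sketch)

# Hypothetical digit table; real text analysis also handles dates,
# abbreviations, homographs, prosody, and much more.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}
PUNCTUATION = set(".,;:!?")


def analyze_text(text: str) -> list[str]:
    """Stage 1 (toy text analysis): normalize the text and split it into tokens."""
    # Spell out each digit so that the back end only sees letters.
    spelled = re.sub(r"\d", lambda m: f" {DIGIT_WORDS[m.group()]} ", text.lower())
    # Keep runs of letters and single punctuation marks; drop everything else.
    return re.findall(r"[a-z]+|[.,;:!?]", spelled)


def synthesize(tokens: list[str], unit_ms: int = 50) -> list[float]:
    """Stage 2 (toy signal synthesis): one short tone per letter, silence per pause."""
    n = SAMPLE_RATE * unit_ms // 1000  # samples per letter
    samples: list[float] = []
    for token in tokens:
        if token in PUNCTUATION:
            samples.extend([0.0] * (2 * n))  # punctuation becomes a short pause
            continue
        for ch in token:
            # Arbitrary placeholder mapping from letter to frequency in Hz.
            freq = 200.0 + 20.0 * (ord(ch) - ord("a"))
            samples.extend(
                0.5 * math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE)
                for i in range(n)
            )
    return samples


if __name__ == "__main__":
    tokens = analyze_text("Room 42, please.")
    signal = synthesize(tokens)
    print(tokens)
    print(f"{len(signal)} samples, {len(signal) / SAMPLE_RATE:.2f} s")
```

The point of the split is that the text stage knows nothing about audio and the signal stage knows nothing about orthography; the token list is the only interface between them, which is what lets each side be improved independently.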
KEYWORDS
signal processing | linguistics
This article is included in
Digital documents and content management
Text-based speech synthesis
Bibliography
- (1) - SPROAT (R.), MOEBIUS (B.), MAEDA (K.), TZOUKERMANN (E.) - Multilingual text analysis. - In Multilingual Text-To-Speech Synthesis – The Bell Labs Approach, SPROAT (R.) et al., eds., Kluwer Academic Publishers (1998). This book describes in detail the synthesis procedures for English and other languages, and gives an introduction to...
Commercial data
- Acapela http://www.acapela-group.com/
Acapela is the new name of the group formed by BaBel Technologies SA and Babel-Infovox AB, which also absorbed ELAN speech. Acapela offers a wide range of multilingual synthesis solutions, initially developed through research at the Royal Institute of Technology...
Magazines, conferences
Recent developments in the techniques presented here are regularly the subject of articles.
In journals devoted to speech processing:
• IEEE Transactions on Speech and Audio Processing;
• Speech Communication;
• Computer Speech and Language;
• EURASIP Journal...