Article | REF: H7288 V2

Text-To-Speech synthesis

Authors: Christophe D'ALESSANDRO, Gaël RICHARD

Publication date: November 10, 2013

ABSTRACT

This article gives an overview of text-to-speech (TTS) synthesis, which automatically computes the speech signal corresponding to a given text. It describes the stages needed to build such a system, including recent techniques such as those based on hidden Markov models. The applications of speech synthesis and the main offerings available in this field are also discussed.

AUTHORS

  • Christophe D'ALESSANDRO: Research Director, LIMSI-CNRS, Orsay, France

  • Gaël RICHARD: Professor, Institut Mines-Télécom, Télécom ParisTech, CNRS-LTCI, Paris, France

This article is an updated version of the 2003 article of the same title by Gaël Richard and Olivier Cappé.

INTRODUCTION

The aim of text-to-speech (TTS) synthesis is to automatically compute the speech signal corresponding to a given text. The text can come from a variety of sources: newspapers, books, voice response systems, dialogue or automatic translation systems (interactive terminals, personal assistants), information system databases, video games, e-mails, SMS, documents browsed on the web, or simply text typed on a computer keyboard.

In its simplest form, voice response can be a set of pre-recorded messages (or "prompts"). Text-to-speech synthesis is more ambitious: it automatically computes the sound samples for any written statement, which is not known in advance and may be long.

Speech synthesis has two sides: text analysis and interpretation on the one hand, and prediction of the acoustic-phonetic parameters of the sound followed by signal synthesis on the other:

  • Text analysis: the first stage in transforming text into speech is to analyze and understand the written text, its nuances and connotations, the speech situation and the speech act to be performed. In addition to the text, the context can be specified (speaking style, emotion, attitude, character type, specific voice, etc.);

  • Signal synthesis: once the text has been analyzed, the aim is to calculate the acoustic signal that best interprets the linguistic content, with a voice that sounds as natural as possible, resembling a particular speaker, and with the nuances of attitude and even emotion that the text calls for. In addition to the audio signal, the synthesizer can provide instructions for synchronizing the lip movements of an avatar or video character, or the movements of a robot.
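
The two-stage organization described above can be sketched as a toy pipeline. This is only an illustrative sketch, not the authors' method: the names `analyze_text`, `synthesize_signal`, `text_to_speech` and the `ABBREVIATIONS` table are hypothetical, the "analysis" is reduced to lowercasing and abbreviation expansion, and the "synthesis" emits one sine tone per word as a placeholder for a real acoustic model.

```python
import math
import re
from dataclasses import dataclass
from typing import List

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"dr.": "doctor", "no.": "number"}


@dataclass
class Utterance:
    """Result of the text-analysis stage: normalized words plus context."""
    words: List[str]
    style: str  # e.g. speaking style requested alongside the text


def analyze_text(text: str, style: str = "neutral") -> Utterance:
    """Stage 1 (toy): normalize the text into a list of plain words."""
    words = []
    for raw in text.lower().split():
        expanded = ABBREVIATIONS.get(raw, raw)      # expand known abbreviations
        cleaned = re.sub(r"[^a-z']", "", expanded)  # drop punctuation and digits
        if cleaned:
            words.append(cleaned)
    return Utterance(words=words, style=style)


def synthesize_signal(utt: Utterance, sample_rate: int = 16000) -> List[float]:
    """Stage 2 (placeholder): one 100 ms sine tone per word, then 50 ms of silence."""
    tone_len = sample_rate // 10
    pause_len = sample_rate // 20
    samples: List[float] = []
    for word in utt.words:
        freq = 100.0 + 10.0 * len(word)  # arbitrary mapping, not a real acoustic model
        samples.extend(
            math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(tone_len)
        )
        samples.extend([0.0] * pause_len)
    return samples


def text_to_speech(text: str, style: str = "neutral") -> List[float]:
    """Chain the two stages: text analysis, then signal synthesis."""
    return synthesize_signal(analyze_text(text, style))
```

Calling `text_to_speech("Dr. Smith reads the text!")` returns a list of floating-point samples in the range [-1, 1]. Keeping the two stages as separate functions mirrors the split between linguistic processing and signal generation, so either stage could be replaced independently in a real system.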

KEYWORDS

signal processing   |   linguistics


This article is included in

Digital documents and content management
