Speech synthesis is the generation of human speech without directly using a human voice. Speech synthesis systems are often called text-to-speech (TTS) systems, in reference to their ability to convert text into speech. Some systems, however, can render only symbolic linguistic representations, such as phonetic transcriptions, into speech. A text-to-speech system is composed of two parts: a front end and a back end. Broadly, the front end takes text as input and outputs a symbolic linguistic representation. The back end takes that symbolic linguistic representation as input and outputs the synthesized speech waveform. The naturalness of a speech synthesizer usually refers to how closely its output resembles the speech of a real person.
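The front end / back end division can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-word `LEXICON`, the ARPAbet-style symbols, the 8 kHz sample rate, and the function names `front_end`, `back_end`, and `synthesize` are hypothetical, and the back end emits one sine tone per symbol as a stand-in for real waveform generation, so the output is not intelligible speech.

```python
import math

# Hypothetical toy lexicon mapping words to phoneme symbols (illustrative only).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

SAMPLE_RATE = 8000  # samples per second (assumed value)


def front_end(text):
    """Convert text into a symbolic linguistic representation (a phoneme list)."""
    symbols = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Words missing from the lexicon fall back to one symbol per letter.
        symbols.extend(LEXICON.get(word, list(word.upper())))
    return symbols


def back_end(symbols, duration_ms=100):
    """Render symbols to a waveform (a list of floats in [-1, 1]).

    Placeholder only: each symbol becomes a short sine tone whose pitch is
    derived arbitrarily from the symbol's characters.
    """
    samples_per_symbol = SAMPLE_RATE * duration_ms // 1000
    samples = []
    for sym in symbols:
        freq = 200 + 20 * (sum(map(ord, sym)) % 20)  # arbitrary pitch in Hz
        for i in range(samples_per_symbol):
            samples.append(math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
    return samples


def synthesize(text):
    """Full text-to-speech path: front end followed by back end."""
    return back_end(front_end(text))


if __name__ == "__main__":
    print(front_end("Hello, world!"))
    print(len(synthesize("Hello, world!")), "samples")
```

Calling `back_end` directly on a list of symbols corresponds to a system that accepts only a phonetic transcription, while `synthesize` corresponds to a full text-to-speech system.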