Speech Prosody in Speech Synthesis: Modeling and generation of prosody for high quality and flexible speech synthesis, 2015 Prosody, Phonology and Phonetics Series

Language: English

The volume addresses issues concerning prosody generation in speech synthesis, including prosody modeling, how we can convey para- and non-linguistic information in speech synthesis, and prosody control in speech synthesis (including prosody conversions). A high level of quality has already been achieved in speech synthesis by using selection-based methods with segments of human speech. Although the method enables synthetic speech with various voice qualities and speaking styles, it requires large speech corpora with targeted quality and style.

Accordingly, speech conversion techniques are now of growing interest among researchers. HMM/GMM-based methods are widely used, but entail several major problems when viewed from the prosody perspective; prosodic features cover a wider time span than segmental features and their frame-by-frame processing is not always appropriate. The book offers a good overview of state-of-the-art studies on prosody in speech synthesis.

Modeling of Prosody
ProZed: A speech prosody editor for linguists, using analysis-by-synthesis
On degree of freedom in prosody modeling
Extraction, analysis and synthesis of Fujisaki model parameters
Probabilistic modeling of pitch contours towards prosody synthesis and conversion
Para- and non-linguistic issues of prosody
Communicative speech synthesis as pan-linguistic prosody control
Mandarin stress analysis and prediction for speech synthesis
Expressivity in interactive speech synthesis; some para-linguistic and non-linguistic issues of speech prosody for conversational dialogue systems
Temporally variable multi-attribute morphing of arbitrarily many voices for exploratory research of speech prosody
Control of prosody in speech synthesis
Statistical models for dealing with discontinuity of fundamental frequency
Use of generation process model for improved control of fundamental frequency contours in HMM-based speech synthesis
Tone Nucleus Model for Emotional Mandarin Speech Synthesis
Emphasis, word prominence, and continuous wavelet transform in the control of HMM based synthesis
Exploiting alternatives for text-to-speech synthesis: from machine to human
Prosody control and variation enhancement techniques for HMM-based expressive speech synthesis

Professor Keikichi Hirose received the B.E. degree in electrical engineering in 1972, and the M.E. and Ph.D. degrees in electronic engineering in 1974 and 1977, respectively, from the University of Tokyo. He has been a faculty member at the University of Tokyo since 1977, and became a Professor in the Department of Electronic Engineering in 1994. He is currently a Professor in the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, University of Tokyo. From March 1987 to January 1988, he was a Visiting Scientist at the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, U.S.A. He has been engaged in a wide range of research on spoken language processing, including analysis, synthesis, recognition, dialogue systems, and computer-assisted language learning. From 2000 to 2004, he was Principal Investigator of the national project “Realization of advanced spoken language information processing utilizing prosodic features,” supported by the Japanese Government. He served as Chair of the Speech Committee, Institute of Electronics, Information and Communication Engineers (IEICE)/Acoustical Society of Japan (ASJ), from 2003 to 2005. He has been Chair of the Speech Prosody Special Interest Group (SPro-SIG), ISCA, since October 2010. He has been on the editorial board of the Speech Communication journal since 2004 and on the editorial board of the ETRI Journal since 2009. He is a Fellow of the Institute of Information and Communication Engineering and a member of a number of academic societies, including IEEE, the International Speech Communication Association (Board member), the Acoustical Society of America, the Acoustical Society of Japan, the Information Processing Society of Japan, the Japanese Society for Artificial Intelligence, and the Research Institute of Signal Processing Japan (Board member).

Jianhua Tao received the M.S. degree from Nanjing University in 1996 and the Ph.D. in Computer Science from Tsinghua University in

Selects recent works on speech prosody written by internationally eminent scholars; a “must read” book for the speech “prosodist”

Gives a clear and comprehensive view of how prosody conveys linguistic and para-/non-linguistic information

Offers guidelines toward an ultimate goal of speech synthesis: realizing human-like quality and flexibility

Includes supplementary material: sn.pub/extras

Print on demand
Hardcover

Publication date: 10-2016

213 pages

15.5x23.5 cm

Available from the publisher (delivery time: 15 days).

116,04 €


Publication date: 03-2015

213 pages

15.5x23.5 cm

Available from the publisher (delivery time: 15 days).

116,04 €



Keywords:

Modeling of Prosody; Prosody Conversions; Prosody Generation; Selection-based Methods with Segments of Human Speech; Speech Synthesis
