Speech Synthesis Literature
Books
-
Allen J., Hunnicutt S., Klatt D. (1987). From Text to Speech: The MITalk
System. Cambridge University Press, Inc.
-
Cawley G. (1996). The
Application of Neural Networks to Phonetic Modelling. PhD. Thesis,
University of Essex, England.
-
Donovan R. (1996). Trainable
Speech Synthesis. PhD. Thesis. Cambridge University Engineering Department,
England.
-
Flanagan J. (1972). Speech Analysis, Synthesis, and Perception. Springer-Verlag,
Berlin-Heidelberg-New York.
-
Flanagan J., Rabiner L. (Editors) (1973). Speech Synthesis. Dowden, Hutchinson
& Ross, Inc., Pennsylvania.
-
Kleijn W. B., Paliwal K. (Editors) (1998). Speech Coding and Synthesis. Elsevier
Science B.V., The Netherlands.
-
Karjalainen M. (1978). An Approach to Hierarchical Information Process
With an Application to Speech Synthesis by Rule. Doctoral Thesis. Tampere
University of Technology, Finland.
-
Laine U. K. (1989). Studies on Modeling of Vocal Tract Acoustics with Applications
to Speech Synthesis. Doctoral Thesis. Acoustics Laboratory Report Series
No 32, Helsinki University of Technology, Finland.
-
Lemmetty S. (1999). Review
of Speech Synthesis Technology. MSc. Thesis. Laboratory of Acoustics
and Audio Signal Processing, Helsinki University of Technology, Finland.
-
Rabiner L., Schafer R. (1978). Digital Processing of Speech Signals, Prentice-Hall.
-
Santen J., Sproat R., Olive J., Hirschberg J. (editors) (1997). Progress
in Speech Synthesis, Springer-Verlag New York Inc.
-
Witten I. (1982). Principles of Computer Speech, Academic Press Inc.
Some Selected Articles
-
Breen A. (1992). Speech Synthesis Models: A Review. Electronics & Communication
Engineering Journal, vol. 4: 19-31.
-
Dutoit T., Pagel V., Pierret N., Bataille F., Vrecken O. (1996). The MBROLA
Project: Towards a Set of High Quality Speech Synthesizers Free of Use
for Non Commercial Purposes. Proceedings of ICSLP 96 (3).
-
Goldstein M. (1995). Classification of Methods Used for Assessment of Text-to-Speech
Systems According to the Demands Placed on the Listener. Speech Communication
vol. 16: 225-244.
-
Klatt D. (1980). Software for a Cascade/Parallel Formant Synthesizer. Journal
of the Acoustical Society of America, JASA, Vol. 67: 971-995.
-
Klatt D. (1987). Review of Text-to-Speech Conversion for English. Journal
of the Acoustical Society of America, JASA vol. 82 (3): 737-793.
-
Klatt D., Klatt L. (1990). Analysis, Synthesis, and Perception of Voice
Quality Variations Among Female and Male Talkers. Journal of the Acoustical
Society of America, JASA vol. 87 (2): 820-857.
-
Kraft V., Portele T. (1995). Quality Evaluation of Five German Speech Synthesis
Systems. Acta Acustica 3 (1995): 351-365.
-
Logan J., Greene B., Pisoni D. (1989). Segmental Intelligibility of Synthetic
Speech Produced by Rule. Journal of the Acoustical Society of America,
JASA vol. 86 (2): 566-581.
-
Moulines E., Laroche J. (1995). Non-Parametric Techniques for Pitch-Scale
Modification of Speech. Speech Communication 16 (1995): 175-205.
-
Murray I., Arnott J. (1993). Toward the Simulation of Emotion in Synthetic
Speech: A Review of the Literature on Human Vocal Emotion. Journal of the
Acoustical Society of America, JASA vol. 93 (2): 1097-1108.
-
Rahim M., Goodyear C., Kleijn B., Schroeter J., Sondhi M. (1993). On the
Use of Neural Networks in Articulatory Speech Synthesis. Journal of the
Acoustical Society of America, JASA vol. 93 (2): 1109-1121.
-
Sagisaka Y. (1990). Speech Synthesis from Text. IEEE Communications Magazine,
vol. 28 (1): 35-41, 55.
-
Schroeder M. (1993). A Brief History of Synthetic Speech. Speech Communication
vol. 13: 231-237.
-
Taylor P., Isard A. (1997). SSML: A Speech Synthesis Markup Language. Speech
Communication vol. 21: 123-133.
Some Selected Conference Papers (under construction) 
-
Beskow J. (1996). Talking Heads - Communication,
Articulation and Animation. Proceedings of Fonetik-96: 53-56.
-
Beskow J., Dahlquist M., Granström B., Lundeberg
M., Spens K-E., Öhman T. (1997). The Teleface Project - Disability,
Feasibility, and Intelligibility. Proceedings of Fonetik97, Swedish Phonetics
Conf., Umeå, Sweden. <http://www.speech.kth.se/~magnusl/teleface_f97.html>
-
Beskow J., Elenius K., McGlashan S. (1997). The OLGA
Project: An Animated Talking Agent in a Dialogue System. Proceedings of
Eurospeech 97. <http://www.speech.kth.se/multimodal/papers/>
-
Gaved M. (1993). Pronunciation and Text Normalisation
in Applied Text-to-Speech Systems. Proceedings of Eurospeech 93 (2): 897-900.
-
Hess W. (1992). Speech Synthesis - A Solved Problem?
Proceedings of EUSIPCO 92 (1): 37-46.
-
Heuft B., Portele T., Rauth M. (1996). Emotions in
Time Domain Synthesis. Proceedings of ICSLP 96 (3).
-
Jekosch U. (1992). The Cluster-Identification Test.
Proceedings of ICSLP 92 (1): 205-208.
-
Jekosch U. (1993). Speech Quality Assessment and
Evaluation. Proceedings of Eurospeech 93 (2): 1387-1394.
-
Karjalainen M., Altosaar T. (1991). Phoneme Duration
Rules for Speech Synthesis by Neural Networks. Proceedings of Eurospeech
91 (2): 633-636.
-
Karjalainen M., Altosaar T., Vainio M. (1998). Speech
Synthesis Using Warped Linear Prediction and Neural Networks. Proceedings
of ICASSP 98.
-
Klatt D. (1982). The Klattalk Text-to-Speech Conversion
System. Proceedings of ICASSP 82 (3): 1589-1592.
-
Laine U. (1982). PARCAS, a New Terminal Analog Model
for Speech Synthesis. Proceedings of ICASSP 82 (2).
-
Laine U., Karjalainen M., Altosaar T. (1994). Warped
Linear Prediction (WLP) in Speech Synthesis and Audio Processing. Proceedings
of ICASSP94 (3): 349-352.
-
Le Goff B., Benoit C. (1996). A Text-to-Audiovisual-Speech
Synthesizer for French. Proceedings of ICSLP96.
-
Lehtinen L., Karjalainen M. (1989). Individual Sounding
Speech Synthesis by Rule Using the Microphonemic Method. Proceedings of
Eurospeech 89 (2): 180-183.
-
Macon M., Clements M. (1996). Speech Concatenation
and Synthesis Using an Overlap-Add Sinusoidal Model. Proceedings of ICASSP
96: 361-364.
-
Macon M., Jensen-Link L., Oliverio J., Clements M.,
George E. (1997). A Singing Voice Synthesis System Based on Sinusoidal
Modeling. Proceedings of ICASSP97.
-
Möbius B., Sproat R., Santen J., Olive J. (1997).
The Bell Labs German Text-to-Speech System: An Overview. Proceedings of
the European Conference on Speech Communication and Technology vol. 5:
2443-2446.
-
Pols L. (1994). Voice Quality of Synthetic Speech:
Representation and Evaluation. Proceedings of ICSLP 94 (3): 1443-1446.
-
Sproat R., Taylor P., Tanenblatt M., Isard A. (1997).
A Markup Language for Text-to-Speech Synthesis. Proceedings of Eurospeech
97.
-
Tatham M., Lewis E. (1996). Improving Text-to-Speech
Synthesis. Proceedings of ICSLP 96 (3).
-
Valbret H., Moulines E., Tubach J. (1991). Voice
Transformation Using PSOLA Technique. Proceedings of Eurospeech 91 (1):
345-348.
If you would like your publication listed here, or if you have any comments,
feel free to send e-mail.
Last update: 29.5.1999 <sami.lemmetty@hut.fi>