CU4074 Speech Processing Syllabus – Anna University PG Syllabus Regulation 2021
COURSE OBJECTIVES:
To introduce speech production and related parameters of speech.
To illustrate the concepts of speech signal representations and coding.
To understand different speech modeling procedures, such as Markov models, and their implementation issues.
To gain knowledge about text analysis and speech synthesis.
UNIT I FUNDAMENTALS OF SPEECH PROCESSING
Introduction – Spoken Language Structure – Phonetics and Phonology – Syllables and Words – Syntax and Semantics – Probability, Statistics and Information Theory – Probability Theory – Estimation Theory – Significance Testing – Information Theory.
UNIT II SPEECH SIGNAL REPRESENTATIONS AND CODING
Overview of Digital Signal Processing – Speech Signal Representations – Short-Time Fourier Analysis – Acoustic Model of Speech Production – Linear Predictive Coding – Cepstral Processing – Formant Frequencies – The Role of Pitch – Speech Coding – LPC Coder, CELP, Vocoders.
UNIT III SPEECH RECOGNITION
Hidden Markov Models – Definition – Continuous and Discontinuous HMMs – Practical Issues – Limitations. Acoustic Modeling – Variability in the Speech Signal – Extracting Features – Phonetic Modeling – Adaptive Techniques – Confidence Measures – Other Techniques.
UNIT IV TEXT ANALYSIS
Lexicon – Document Structure Detection – Text Normalization – Linguistic Analysis – Homograph Disambiguation – Morphological Analysis – Letter-to-Sound Conversion – Prosody – Generation Schematic – Speaking Style – Symbolic Prosody – Duration Assignment – Pitch Generation.
UNIT V SPEECH SYNTHESIS
Attributes – Formant Speech Synthesis – Concatenative Speech Synthesis – Prosodic Modification of Speech – Source-filter Models for Prosody Modification – Evaluation of TTS Systems.
COURSE OUTCOMES:
Upon completion of this course, the students will be able to:
CO1: Model the speech production system and describe the fundamentals of speech.
CO2: Extract and compare different speech parameters.
CO3: Choose an appropriate statistical speech model for a given application.
CO4: Design a speech recognition system.
CO5: Use different text analysis and speech synthesis techniques.
TOTAL: 45 PERIODS
REFERENCES
1. Ben Gold and Nelson Morgan, “Speech and Audio Signal Processing: Processing and Perception of Speech and Music”, Wiley-India Edition, 2006.
2. Claudio Becchetti and Lucio Prina Ricotti, “Speech Recognition”, John Wiley and Sons, 1999.
3. Daniel Jurafsky and James H Martin, “Speech and Language Processing – An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition”, Pearson Education, 2002.
4. Frederick Jelinek, “Statistical Methods of Speech Recognition”, MIT Press, 1997.
5. Lawrence Rabiner and Biing-Hwang Juang, “Fundamentals of Speech Recognition”, Pearson Education, 2003.
6. Steven W. Smith, “The Scientist and Engineer’s Guide to Digital Signal Processing”, California Technical Publishing, 1997.
7. Thomas F Quatieri, “Discrete-Time Speech Signal Processing – Principles and Practice”, Pearson Education, 2004.