CU4074 Speech Processing Syllabus – Anna University PG Syllabus Regulation 2021

COURSE OBJECTIVES:

 To introduce speech production and related parameters of speech.
 To illustrate the concepts of speech signal representations and coding.
 To understand different speech modeling procedures, such as Markov models, and their implementation issues.
 To gain knowledge about text analysis and speech synthesis.

UNIT I FUNDAMENTALS OF SPEECH PROCESSING

Introduction – Spoken Language Structure – Phonetics and Phonology – Syllables and Words – Syntax and Semantics – Probability, Statistics and Information Theory – Probability Theory – Estimation Theory – Significance Testing – Information Theory.

UNIT II SPEECH SIGNAL REPRESENTATIONS AND CODING

Overview of Digital Signal Processing – Speech Signal Representations – Short-Time Fourier Analysis – Acoustic Model of Speech Production – Linear Predictive Coding – Cepstral Processing – Formant Frequencies – The Role of Pitch – Speech Coding – LPC Coder, CELP, Vocoders.

UNIT III SPEECH RECOGNITION

Hidden Markov Models – Definition – Continuous and Semicontinuous HMMs – Practical Issues – Limitations. Acoustic Modeling – Variability in the Speech Signal – Extracting Features – Phonetic Modeling – Adaptive Techniques – Confidence Measures – Other Techniques.

UNIT IV TEXT ANALYSIS

Lexicon – Document Structure Detection – Text Normalization – Linguistic Analysis – Homograph Disambiguation – Morphological Analysis – Letter-to-Sound Conversion – Prosody – Generation Schematic – Speaking Style – Symbolic Prosody – Duration Assignment – Pitch Generation.

UNIT V SPEECH SYNTHESIS

Attributes – Formant Speech Synthesis – Concatenative Speech Synthesis – Prosodic Modification of Speech – Source-filter Models for Prosody Modification – Evaluation of TTS Systems.

COURSE OUTCOMES:

Upon completion of this course, the students will be able to:
CO1: Model the speech production system and describe the fundamentals of speech.
CO2: Extract and compare different speech parameters.
CO3: Choose an appropriate statistical speech model for a given application.
CO4: Design a speech recognition system.
CO5: Use different text analysis and speech synthesis techniques.

TOTAL: 45 PERIODS

REFERENCES

1. Ben Gold and Nelson Morgan, “Speech and Audio Signal Processing: Processing and Perception of Speech and Music”, Wiley-India Edition, 2006.
2. Claudio Becchetti and Lucio Prina Ricotti, “Speech Recognition”, John Wiley and Sons, 1999.
3. Daniel Jurafsky and James H Martin, “Speech and Language Processing – An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition”, Pearson Education, 2002.
4. Frederick Jelinek, “Statistical Methods of Speech Recognition”, MIT Press, 1997.
5. Lawrence Rabiner and Biing-Hwang Juang, “Fundamentals of Speech Recognition”, Pearson Education, 2003.
6. Steven W. Smith, “The Scientist and Engineer’s Guide to Digital Signal Processing”, California Technical Publishing, 1997.
7. Thomas F Quatieri, “Discrete-Time Speech Signal Processing – Principles and Practice”, Pearson Education, 2004.