DS4102 Speech and Audio Signal Processing Syllabus:

DS4102 Speech and Audio Signal Processing Syllabus – Anna University PG Syllabus Regulation 2021

COURSE OBJECTIVES:

• To analyze the speech signal in the time and frequency domains
• To understand the characteristics of speech and audio
• To carry out LPC-based characterization
• To understand the applications of filter banks in speech analysis
• To understand different applications of speech and audio signals

UNIT I MECHANICS OF SPEECH AND AUDIO

Speech production mechanism – Nature of Speech signal – Digital Model of speech signals – Classification of Speech sounds – Phones – Phonemes – Phonetic and Phonemic alphabets – Articulatory features – Anatomical pathways from the ear to perception of sound – The peripheral auditory system – Absolute Threshold of Hearing – Critical Bands – Simultaneous Masking, Masking Asymmetry, Perceptual Entropy – Basic measuring philosophy – Subjective versus objective perceptual testing – The perceptual audio quality measure (PAQM).

UNIT II TIME AND FREQUENCY DOMAIN METHODS FOR SPEECH PROCESSING

Time domain parameters of Speech signal – Methods for extracting the parameters: Energy, Average Magnitude – Zero Crossing Rate (ZCR) – Silence Discrimination using ZCR and energy – Short Time Fourier analysis – Formant extraction and Pitch Extraction.
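As an illustration of the time-domain parameters listed in this unit, the following is a minimal sketch of short-time energy, average magnitude and ZCR computed frame by frame. The function names, the frame and hop sizes, and the threshold rule in `is_silence` are assumptions made for this example; they are not prescribed by the syllabus.

```python
def split_frames(signal, frame_len, hop):
    """Split a sample sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def average_magnitude(frame):
    """Mean absolute sample value in the frame."""
    return sum(abs(x) for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_silence(frame, energy_thresh, zcr_thresh):
    """Hypothetical rule: call a frame silence when both energy and ZCR are low.
    Both thresholds are assumptions and would have to be tuned on real recordings."""
    return (short_time_energy(frame) < energy_thresh
            and zero_crossing_rate(frame) < zcr_thresh)
```

In practice the thresholds are often estimated from a stretch of the recording assumed to contain only background noise; that step is not shown here.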

UNIT III LINEAR PREDICTIVE ANALYSIS OF SPEECH

Formulation of Linear Prediction problem in Time Domain – Basic Principle – Autocorrelation method – Covariance method – Solution of LPC equations – Cholesky method – Durbin’s Recursive algorithm – Lattice formulations and solutions – Comparison of different methods – Application of LPC parameters – Pitch detection using LPC parameters – Formant analysis – VELP – CELP.
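The autocorrelation method and Durbin's recursive algorithm named in this unit can be sketched as below. This is a minimal illustration under assumptions: the function names are hypothetical, no windowing or pre-emphasis is applied, and the predictor is taken in the form x_hat[n] = a[1]·x[n-1] + ... + a[p]·x[n-p].

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations recursively.

    Returns (a, ks, err): predictor coefficients a[1..order] as a list,
    the reflection (PARCOR) coefficients, and the final prediction error energy.
    Assumes r[0] > 0 and that the error stays positive (no all-zero frame).
    """
    a = [0.0] * (order + 1)      # a[0] is unused; a[j] multiplies x[n - j]
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        ks.append(k)
        err *= (1.0 - k * k)
    return a[1:], ks, err
```

The reflection coefficients returned alongside the predictor are the quantities a lattice formulation works with directly, which is one reason the recursion is usually preferred over a general linear solver for this problem.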

UNIT IV TIME-FREQUENCY ANALYSIS FOR AUDIO: FILTER BANKS AND TRANSFORMS

Analysis-Synthesis Framework for M-band Filter Banks – Filter Banks for Audio Coding: Design Considerations – Quadrature Mirror and Conjugate Quadrature Filters – Tree-Structured QMF – Cosine Modulated “Pseudo QMF” M-band Banks – Cosine Modulated Perfect Reconstruction (PR) M-band Banks and Modified Discrete Cosine Transform (MDCT).
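The perfect-reconstruction property of the MDCT mentioned in this unit can be checked numerically with a small sketch. It assumes a sine window, 50% overlap (hop of N samples, block length 2N) and a 2/N scaling on the inverse transform; the function names are hypothetical, and the direct O(N²) sums are used for clarity rather than speed.

```python
import math

def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block):
    """Forward MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples (2/N scaling assumed)."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [(2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]

def mdct_roundtrip(x, N):
    """Windowed MDCT analysis, then IMDCT, synthesis window and overlap-add."""
    w = sine_window(N)
    pad_tail = (-len(x)) % N
    # Zero-pad so that every input sample is covered by two overlapping blocks.
    padded = [0.0] * N + list(x) + [0.0] * (pad_tail + N)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        y = imdct(mdct(block))
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]
    return out[N:N + len(x)]
```

Each block alone is time-aliased after the inverse transform; the aliasing cancels only when adjacent blocks are overlap-added, which is why the round trip needs the extra padding at both ends.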

UNIT V SPEECH AND AUDIO SIGNAL PROCESSING ALGORITHMS

Algorithms: Dynamic Time Warping, Hidden Markov Model – Gaussian Mixture Model – Automatic Speech Recognition – Feature Extraction for ASR – Speaker identification and verification – Voice response system – Speech Synthesis – Digital Audio Watermarking – MPEG-4 Audio.
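Of the algorithms listed in this unit, Dynamic Time Warping is compact enough to sketch in a few lines. The version below is a minimal illustration under assumptions: it aligns two sequences of scalar values with an absolute-difference local cost and the basic three-way step pattern, whereas a recognizer would normally align sequences of feature vectors. The function name is hypothetical.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = minimum accumulated cost of aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a advances, b repeats
                                 D[i][j - 1],      # b advances, a repeats
                                 D[i - 1][j - 1])  # both advance
    return D[n][m]
```

A sequence and a time-stretched copy of it align at zero cost, for example `dtw_distance([1, 2, 3], [1, 2, 2, 3])`, which is the property that makes DTW useful for comparing utterances spoken at different rates.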

SUGGESTED ACTIVITIES:

1. Design a digital model for speech signals
2. Perform time-frequency analysis of speech signals
3. Simulate LPC algorithms
4. Design and develop filter banks for audio signals
5. Create a speech recognition program that suits real-world applications

COURSE OUTCOMES:

On Successful completion, students will be able to
CO1: Characterize Speech and audio signal production and perception mechanisms.
CO2: Analyze speech and audio signals in the time and frequency domains.
CO3: Design an LPC coder.
CO4: Develop speech processing solutions based on filter banks.
CO5: Design speech recognition, speaker identification and speech synthesis schemes.

TOTAL: 45 PERIODS

REFERENCES:

1. L. R. Rabiner and R. W. Schafer, “Digital Processing of Speech Signals”, Pearson Education Singapore Pvt. Ltd, First Edition, 2008.
2. Ben Gold and Nelson Morgan, “Speech and Audio Signal Processing”, John Wiley and Sons Inc., Singapore, Second Edition, 2011.
3. Thomas F. Quatieri, “Discrete-Time Speech Signal Processing”, Pearson Education, First Edition, 2002.
4. Udo Zölzer, “Digital Audio Signal Processing”, John Wiley & Sons Ltd, Second Edition, 2008.
5. Mark Kahrs and Karlheinz Brandenburg, “Applications of Digital Signal Processing to Audio and Acoustics”, Springer Publishing Company, Incorporated, 2013.
6. Ken C. Pohlmann, “Principles of Digital Audio”, McGraw Hill, New Delhi, Sixth Edition, 2010.
7. John Watkinson, “An Introduction to Digital Audio”, Focal Press, Second Edition, 2002.
8. Andreas Spanias, Ted Painter and Venkatraman Atti, “Audio Signal Processing and Coding”, John Wiley & Sons, New Delhi, 2013.