MU4006 Voice Technologies Syllabus – Anna University PG Syllabus Regulation 2021

COURSE OBJECTIVES:

• To explain the speech recognition system and Automatic Speech Recognition
• To extract features from the voice signal
• To apply different voice classification techniques in applications
• To build a virtual personal assistant using speech recognition techniques
• To apply voice synthesis techniques

UNIT I SPEECH PROCESSING AND RECOGNITION SYSTEM

Fundamentals of Speech Recognition – Speech production process – Representation of voice in time and frequency domains – Speech sounds and features – Phonetics and Phonology – Phonetics – Phonology and Linguistics – Suprasegmental Features of Speech – Automatic speech recognition system (ASR) – Structure of ASR – Neural network approach – Pronunciation Model – Language Model – Central Decoder.

UNIT II FEATURE EXTRACTION

Basic Audio Features – Pitch – Timbral Features – Rhythmic Features – Inharmonicity – Autocorrelation – MPEG-7 Features – Feature Extraction Techniques – Linear Predictive Coding (LPC) – Mel-Frequency Cepstral Coefficients (MFCC) – Perceptual Linear Prediction (PLP) – Discrete Wavelet Transform (DWT)
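
As an illustration of two topics in this unit (pitch and autocorrelation), the following is a minimal sketch of autocorrelation-based pitch estimation. It is not part of the official syllabus. The function names, the 50 to 500 Hz search range, and the synthetic 200 Hz test tone are assumptions chosen for the example.

```python
import math

def autocorrelation(signal, lag):
    """Unnormalised autocorrelation of `signal` at the given lag."""
    return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))

def estimate_pitch(signal, sample_rate, fmin=50.0, fmax=500.0):
    """Return the frequency (Hz) whose lag maximises the autocorrelation.

    fmin and fmax bound the search; they are illustrative defaults.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(signal, lag))
    return sample_rate / best_lag

# Synthetic test signal: a pure 200 Hz sine sampled at 8 kHz (assumed values).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
pitch = estimate_pitch(tone, sample_rate)
print(pitch)
```

Real speech is noisier than a pure tone, so practical systems usually window the signal and normalise the autocorrelation before picking a peak.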

UNIT III VOICE CLASSIFICATION

Introduction – Classification Strategies – k-Nearest Neighbors (k-NN) – Naïve Bayes (NB) Classifier – Decision Tree and Speech Classification – Support Vector Machine (SVM) and Speech Classification – Neural Network in Speech Classification – Deep Neural Network in Speech Recognition and Classification
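
The k-NN strategy listed in this unit can be sketched in a few lines. This is an illustrative example only, not part of the official syllabus: the two-dimensional feature vectors and the labels `class_a` and `class_b` are made up, standing in for features that would normally come from a real extraction step.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training items by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors (made-up values, not real measurements).
train = [
    ((1.0, 1.2), "class_a"),
    ((0.8, 1.0), "class_a"),
    ((1.1, 0.9), "class_a"),
    ((3.0, 3.2), "class_b"),
    ((3.1, 2.9), "class_b"),
    ((2.9, 3.0), "class_b"),
]
prediction = knn_predict(train, (1.0, 1.0))
print(prediction)
```

The choice of k and of the distance measure both affect the result; k = 3 and Euclidean distance are used here only for simplicity.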

UNIT IV BUILDING VIRTUAL PERSONAL ASSISTANT (VPA)

Voice Recognition Module in Python – Building blocks of VPA – Build and set a timer – Build an alarm clock – Create Music module – News module – Live radio module – Text to Speech Module – Use cases – Google and Google Assistant – Apple and Siri – Microsoft and Cortana

UNIT V VOICE SYNTHESIS

Analogue signals – digital signals – filtering – source-filter model of speech production – Text analysis – dynamic time warping – hidden Markov models – statistical TTS paradigms
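
Dynamic time warping, one of the topics above, aligns two sequences that differ in speed. The following is a minimal sketch of the classic dynamic-programming formulation, offered as an illustration and not as part of the official syllabus. The function name `dtw_distance`, the absolute-difference local cost, and the sample sequences are assumptions.

```python
def dtw_distance(a, b):
    """Return the dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Made-up one-dimensional contours for illustration.
template = [1.0, 2.0, 3.0, 2.0, 1.0]
stretched = [1.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0]  # same shape, slower
different = [3.0, 3.0, 3.0, 3.0, 3.0]

print(dtw_distance(template, stretched))
print(dtw_distance(template, different))
```

A time-stretched copy of the template aligns at zero cost, while a sequence with a different shape does not. In speech work the scalar samples would typically be replaced by per-frame feature vectors with a vector distance.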

COURSE OUTCOMES:

CO1 Explain the speech recognition elements and apply Automatic Speech Recognition
CO2 Apply feature extraction techniques to extract the features of the voice signal
CO3 Apply voice classification techniques using different classification algorithms
CO4 Build a virtual personal assistant and analyze the use cases
CO5 Perform voice synthesis using hidden Markov models

TOTAL PERIODS: 45

REFERENCES

1. Lawrence R. Rabiner, Biing-Hwang Juang, “Fundamentals of Speech Recognition”, Prentice Hall, 1993
2. Jinyu Li, Li Deng, Reinhold Haeb-Umbach, and Yifan Gong, “Robust Automatic Speech Recognition: A Bridge to Practical Applications”, Academic Press, 2016