Audio-Visual Speech Recognition based on Machine Learning approach

Pinki Roy; Saswati Debnath

doi:10.1504/ijaip.2018.10039010

JOURNAL ARTICLE

Year: 2018; Journal: International Journal of Advanced Intelligence Paradigms; Vol: 1 (1); Pages: 1-1; Publisher: Inderscience Publishers

Abstract

Audio-visual speech recognition by machine plays an important role as research in automatic speech recognition approaches its highest performance. Audio alone gives good performance, but adding visual information potentially gives a more reliable recognition system when the audio signal degrades in a noisy environment or varies with the environmental channel. This paper proposes an audio-visual automatic speech recognition (AV-ASR) system based on machine learning approaches. Visual information is captured from the lip contour. Pseudo-Zernike moments (PZMs) and 19th-order Mel-frequency cepstral coefficients (MFCCs) are extracted as the visual and audio features, respectively. Two machine learning approaches, artificial neural networks (ANN) and support vector machines (SVM), are used to recognise speech in the audio and visual modalities. After the two systems perform recognition individually, a combined decision is taken. The paper also evaluates the individual performance of audio and visual speech recognition under these machine learning approaches.
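The abstract says only that "a combined decision is taken" after the audio and visual recognisers run separately; it does not state the combination rule. As a minimal sketch of one common option, the Python below performs late (decision-level) fusion by a weighted sum of per-class scores. The function name `fuse_decisions`, the `audio_weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
def fuse_decisions(audio_scores, visual_scores, audio_weight=0.5):
    """Late (decision-level) fusion by a weighted sum of per-class scores.

    audio_scores, visual_scores: dicts mapping a class label to a score in
    [0, 1] (hypothetical outputs of the audio and visual classifiers).
    audio_weight: assumed reliability of the audio stream, in [0, 1];
    the visual stream receives the remaining weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_scores.keys() != visual_scores.keys():
        raise ValueError("both classifiers must score the same set of labels")

    # Weighted sum of the two modality scores for every class label.
    fused = {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * visual_scores[label]
        for label in audio_scores
    }
    # The combined decision is the label with the highest fused score.
    best = max(fused, key=fused.get)
    return best, fused


# Illustrative numbers only, not results from the paper.
audio = {"yes": 0.40, "no": 0.60}   # audio classifier under noise
visual = {"yes": 0.80, "no": 0.20}  # lip-based classifier
label, scores = fuse_decisions(audio, visual, audio_weight=0.3)
```

With these made-up scores the audio classifier alone would choose "no", while down-weighting the noisy audio stream lets the visual evidence decide. In practice the weight would need to be tuned, for example against an estimate of the acoustic noise level.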

Keywords:

Computer science; Speech recognition; Artificial intelligence; Support vector machine; Audio visual; Audio mining; Modality (human–computer interaction); Feature (linguistics); Visualization; Audio signal; Mel-frequency cepstrum; Pattern recognition (psychology); Feature extraction; Machine learning; Voice activity detection; Speech processing; Speech coding; Multimedia

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.29

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing
