Articulatory feature extraction from ultrasound images using pretrained convolutional neural networks

Kele Xu; Jian Zhu

doi:10.1121/1.5068358

JOURNAL ARTICLE

Year: 2018
Journal: The Journal of the Acoustical Society of America
Vol: 144 (3_Supplement)
Pages: 1907
Publisher: Acoustical Society of America

Abstract

Feature extraction is of great importance to ultrasound tongue image analysis. Inspired by the recent success of deep learning, we explore a novel approach to feature extraction from ultrasound tongue images using pre-trained convolutional neural networks (CNNs). The bottleneck features from different pre-trained CNNs, including VGGNet and ResNet, are used as representations of the ultrasound tongue images. An image classification task is then conducted to assess the effectiveness of the CNN-based features. Our dataset consists of 20,000 ultrasound tongue images collected from a female speaker of Mandarin Chinese, each manually labeled as containing one of the following consonants: /p, t, k, l/. Experimental results show that Gradient Boosting Machine (GBM) classifiers trained on the CNN-based features achieve the best performance, with a classification accuracy of 92.4% for ResNet and 91.6% for VGGNet. Both outperform the benchmark GBM classifier trained on features extracted using Principal Component Analysis (PCA), which achieves an accuracy of only 87.5%. On this preliminary dataset, our method of feature extraction is found to be superior to the PCA-based method. This work demonstrates the potential of applying pre-trained convolutional neural networks to ultrasound tongue image analysis tasks.
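The pipeline described in the abstract (extract features, train a GBM classifier, compare accuracy against a PCA baseline) can be sketched roughly as follows. This is an illustrative sketch and not the authors' code. It assumes scikit-learn for PCA and gradient boosting, and torchvision's ResNet-18 with ImageNet weights as a stand-in for the unspecified ResNet variant. The function names, the number of principal components, the number of trees, and the synthetic blob images are all hypothetical. Only the PCA baseline is executed here, on random stand-in data, because the CNN branch needs torch, torchvision, and a weight download.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def cnn_bottleneck_features(images):
    """Return pooled ResNet-18 activations for grayscale images of shape (N, H, W).

    Assumption: ResNet-18 with ImageNet weights; the abstract does not say
    which ResNet variant or layer was used. Needs torch and torchvision
    (>= 0.13) and downloads the weights on first use. Not called below.
    """
    import torch
    from torchvision.models import ResNet18_Weights, resnet18

    model = resnet18(weights=ResNet18_Weights.DEFAULT)
    model.fc = torch.nn.Identity()  # drop the ImageNet classifier head
    model.eval()

    x = torch.as_tensor(images, dtype=torch.float32).unsqueeze(1)  # (N, 1, H, W)
    x = x.repeat(1, 3, 1, 1)  # replicate the single channel to three channels
    x = torch.nn.functional.interpolate(
        x, size=(224, 224), mode="bilinear", align_corners=False
    )
    mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
    x = (x - mean) / std  # ImageNet normalisation, pixels assumed in [0, 1]
    with torch.no_grad():
        return model(x).numpy()  # (N, 512); batch the input for large datasets


def pca_features(train_images, test_images, n_components=20):
    """Baseline: project flattened pixels onto components fit on the training set."""
    pca = PCA(n_components=n_components, random_state=0)
    train = pca.fit_transform(train_images.reshape(len(train_images), -1))
    test = pca.transform(test_images.reshape(len(test_images), -1))
    return train, test


def gbm_accuracy(train_x, train_y, test_x, test_y):
    """Train a gradient boosting classifier and return held-out accuracy."""
    clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
    clf.fit(train_x, train_y)
    return accuracy_score(test_y, clf.predict(test_x))


def make_synthetic_images(n_per_class=40, size=16, seed=0):
    """Stand-in data: four classes, each a noisy bright blob at a different position."""
    rng = np.random.default_rng(seed)
    centers = [(4, 4), (4, 12), (12, 4), (12, 12)]
    yy, xx = np.mgrid[0:size, 0:size]
    images, labels = [], []
    for label, (cy, cx) in enumerate(centers):
        blob = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 8.0)
        for _ in range(n_per_class):
            noisy = blob + 0.2 * rng.standard_normal((size, size))
            images.append(np.clip(noisy, 0.0, 1.0))
            labels.append(label)
    return np.stack(images), np.array(labels)


images, labels = make_synthetic_images()
train_img, test_img, train_y, test_y = train_test_split(
    images, labels, test_size=0.25, stratify=labels, random_state=0
)
train_x, test_x = pca_features(train_img, test_img)
pca_acc = gbm_accuracy(train_x, train_y, test_x, test_y)
print(f"PCA + GBM accuracy on synthetic data: {pca_acc:.3f}")
```

To compare the two feature sets on real images, the arrays returned by `cnn_bottleneck_features` for the training and test splits could be passed to the same `gbm_accuracy` function, so that only the feature extractor differs between the two runs.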

Keywords:

Computer science, Convolutional neural network, Artificial intelligence, Pattern recognition (psychology), Feature extraction, Classifier (UML), Principal component analysis, Feature (linguistics), Deep learning, Bottleneck, Speech recognition

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Traditional Chinese Medicine Studies

Health Sciences → Medicine → Complementary and alternative medicine

Cancer-related molecular mechanisms research

Life Sciences → Biochemistry, Genetics and Molecular Biology → Cancer Research
