This paper investigates the application of deep models, including deep maxout networks (DMNs), to Mandarin tone recognition. We focus on the capacity of deep models to extract high-level robust features and to fuse different kinds of serially concatenated features. Furthermore, maxout networks were proposed to integrate naturally with dropout and have achieved state-of-the-art results. We therefore investigate the advantage of DMNs when the training data are limited and imbalanced. Our experiments on the ASCCD corpus show that, compared with shallow models such as a one-hidden-layer multilayer perceptron (MLP) and a support vector machine (SVM), deep models improve Mandarin tone recognition significantly. Among the deep models, DMNs perform better than deep neural networks based on sigmoid units or rectified linear units (ReLU).
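To make the maxout unit concrete, the following is a minimal NumPy sketch of one maxout layer followed by inverted dropout. It is an illustration under stated assumptions, not the configuration used in the experiments: the function names `maxout_layer` and `dropout`, the weight layout `(d_in, units, pieces)`, and the inverted-dropout scaling are all choices made here for the example.

```python
import numpy as np


def maxout_layer(x, W, b):
    """Maxout activation: each unit outputs the max over its linear pieces.

    Assumed shapes (hypothetical layout chosen for this sketch):
      x: (batch, d_in), W: (d_in, units, pieces), b: (units, pieces)
    """
    # Compute every linear piece for every unit: shape (batch, units, pieces).
    z = np.einsum("bd,dup->bup", x, W) + b
    # Keep the largest piece per unit: shape (batch, units).
    return z.max(axis=-1)


def dropout(x, rate, rng):
    """Inverted dropout: zero each entry with probability `rate`, then rescale."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 8))             # 4 frames, 8 input features
    W = rng.standard_normal((8, 5, 3)) * 0.1    # 5 maxout units, 3 pieces each
    b = np.zeros((5, 3))
    h = dropout(maxout_layer(x, W, b), rate=0.5, rng=rng)
    print(h.shape)
```

With two pieces whose weights are the identity and its negation, the layer reduces to the absolute value, which shows how a maxout unit can represent a piecewise-linear activation that is learned instead of fixed in advance.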
Li Xu, Wenle Zhang, Ning Zhou, Chao‐Yang Lee, Yongxin Li, Xiuwu Chen, Xiaoyan Zhao
Jianwei Niu, Lei Xie, Lei Jia, Na Hu
Yiming Su, Chung‐Bow Lee, Kuo-Ching Chang, Chui-Liang Chiang