Shweta Sinha, Shyam Sunder Agrawal, Aruna Jain
Abstract State-of-the-art automatic speech recognition systems use Mel frequency cepstral coefficients (MFCC) for feature extraction and Gaussian mixture models (GMM) for acoustic modeling, but there is no standard value for the number of mixture components. The current choice of this number is arbitrary, with little justification. Moreover, the values established for European languages cannot be carried over to Hindi speech recognition because the database sizes of the languages differ. Estimating parameters with too many or too few components may yield a poorly fitted mixture model, so the number of mixtures matters for the initial estimate used by the expectation-maximization process. In this work, the authors estimate the number of Gaussian mixture components for a Hindi database based on the size of its vocabulary. Mel frequency cepstral features and perceptual linear predictive (PLP) features, along with their extended variants with delta-delta-delta features, are used to evaluate this number based on the optimal recognition score of the system. A comparative analysis of recognition performance for both feature extraction methods on a medium-sized Hindi database is also presented. Heteroscedastic linear discriminant analysis (HLDA) is used as a feature reduction technique, and its impact on the recognition score is highlighted.
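The idea of choosing the number of mixture components empirically, rather than fixing it arbitrarily, can be illustrated with a minimal sketch. It is not the authors' procedure: the paper selects the number by recognition score on a Hindi database, whereas this sketch substitutes held-out log-likelihood as the selection criterion and uses synthetic one-dimensional data in place of MFCC or PLP feature vectors. All function names, the candidate list, and the data are assumptions made for illustration.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def fit_gmm(data, k, iters=50, var_floor=1e-3):
    """Fit a k-component 1-D GMM with EM; returns (weights, means, variances)."""
    ordered = sorted(data)
    n = len(ordered)
    # Deterministic initial estimate: means at evenly spaced quantiles.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n
    variances = [max(spread, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means, and (floored) variances.
        counts = [sum(r[j] for r in resp) + 1e-10 for j in range(k)]
        total = sum(counts)
        for j in range(k):
            weights[j] = counts[j] / total
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / counts[j]
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / counts[j]
            variances[j] = max(var, var_floor)
    return weights, means, variances


def avg_log_likelihood(data, model):
    """Average per-sample log-likelihood of data under a fitted GMM."""
    weights, means, variances = model
    total = 0.0
    for x in data:
        total += log_sum_exp([math.log(w) + log_gauss(x, m, v)
                              for w, m, v in zip(weights, means, variances)])
    return total / len(data)


def select_num_components(train, heldout, candidates):
    """Score each candidate count on held-out data; return (best, scores)."""
    scores = {k: avg_log_likelihood(heldout, fit_gmm(train, k))
              for k in candidates}
    return max(scores, key=scores.get), scores


# Synthetic stand-in data (an assumption): three well-separated clusters.
rng = random.Random(0)


def sample(n_per_cluster):
    return [rng.gauss(c, 0.5) for c in (-5.0, 0.0, 5.0)
            for _ in range(n_per_cluster)]


train, heldout = sample(100), sample(50)
best_k, scores = select_num_components(train, heldout, [1, 2, 3, 4, 5])
print("selected components:", best_k)
for k in sorted(scores):
    print(k, round(scores[k], 3))
```

In a real recognizer the scoring function would be replaced by the recognition score of the full system, and the candidate counts would be swept separately for each feature type.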