Multilingual training of acoustic models in automatic speech recognition

Nieuwoudt, C; Botha, EC

dc.contributor.author	Nieuwoudt, C
dc.contributor.author	Botha, EC
dc.date.accessioned	2018-06-15T07:41:18Z
dc.date.available	2018-06-15T07:41:18Z
dc.date.created	2000
dc.date.issued	2000
dc.identifier.citation	Nieuwoudt C & Botha EC (2000) Multilingual training of acoustic models in automatic speech recognition. South African Computer Journal, Number 25, 2000	en
dc.identifier.issn	2313-7835
dc.identifier.uri	http://hdl.handle.net/10500/24394
dc.description.abstract	This paper evaluates the performance of a speech recognition system using acoustic models trained on multilingual data. The reason in our case for using data from more than one language is that there may not be enough data available for a new language to train a robust recogniser. Two general strategies are employed: firstly, the pooling of data from the different languages for training and, secondly, the training of models on the data from one language and subsequent adaptation of the models using data from the new target language. For the first approach, English data and Afrikaans training data are pooled in order to train hidden Markov models (HMMs) for the target language, Afrikaans. For the second approach, the parameters of HMMs trained on English data are adapted using maximum a posteriori probability (MAP) and maximum likelihood linear regression (MLLR) methods on Afrikaans data. Continuous density HMMs are used to model context-independent phones found in Afrikaans. Cross-language adaptation performance is evaluated in terms of phone recognition performance as well as for a continuous speech recognition task in Afrikaans. The interesting result is that, for continuous recognition, the best performance is obtained by simple pooling of the data, and this performance far exceeds the performance achievable using only data from the target language. The improvement is due to the fact that in our database there exists no mismatch between the English and Afrikaans data (other than the language difference) and both languages were labelled with a consistent set of labels. Adaptation results indicate that both MAP adaptation and MLLR transformation of English models using Afrikaans adaptation data significantly improve model performance and also achieve better performance than achievable by direct training on the adaptation data.	en
dc.language.iso	en	en
dc.publisher	South African Computer Society (SAICSIT)	en
dc.subject	Adaptation	en
dc.subject	Multilingual	en
dc.subject	Speech recognition	en
dc.title	Multilingual training of acoustic models in automatic speech recognition	en
dc.type	Article	en
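
The abstract names MAP adaptation of HMM parameters but does not give the update formula the authors used. As an illustration only, the following is a minimal sketch of the commonly cited MAP re-estimate of a single Gaussian mean, in which a mean trained on source-language (English) data is pulled toward target-language (Afrikaans) frames in proportion to how much adaptation data is observed. The function name `map_adapt_mean`, the prior weight `tau` and its default value are assumptions made for this example and do not come from the paper.

```python
import numpy as np


def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP re-estimate of one Gaussian mean (illustrative sketch).

    prior_mean : (D,) mean from the source-language model (the prior).
    frames     : (T, D) feature vectors from the target language.
    posteriors : (T,) occupation probability of this Gaussian per frame.
    tau        : prior weight (hypothetical value); larger values keep
                 the result closer to prior_mean.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    frames = np.asarray(frames, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)

    occupancy = posteriors.sum()        # soft count of adaptation frames
    weighted_sum = posteriors @ frames  # (D,) posterior-weighted frame sum

    # Interpolate between the prior mean and the adaptation data.
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)
```

With no adaptation frames the function returns the prior mean unchanged, and as the soft count grows the estimate approaches the posterior-weighted average of the target-language frames.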