Institutional Repository

Experiments with syllable-based English-Zulu alignment

dc.contributor.author Kotzé, Gideon
dc.contributor.author Wolff, Friedel
dc.date.accessioned 2014-08-27T08:53:56Z
dc.date.available 2014-08-27T08:53:56Z
dc.date.issued 2014-05
dc.identifier.citation Kotzé and Wolff, 2014 en
dc.description.abstract As a morphologically complex language, Zulu poses notable challenges for alignment with English. One of the biggest concerns for statistical machine translation is that this morphological complexity produces a large number of word forms for which very few examples exist in a corpus. To address the problem, we set out to establish an experimental baseline for lexical alignment by naively dividing the Zulu text into syllables, which resemble its morphemes. A small quantitative evaluation and a more thorough qualitative evaluation suggest that our approach has merit, although certain issues remain. Although we have not yet determined the effect of this approach on machine translation, our first experiments suggest that an aligned parallel corpus with reasonable alignment accuracy can be created in as little as a few days for a language pair in which one language is under-resourced. Furthermore, since very little language-specific knowledge was required for this task, our approach can almost certainly be applied to other language pairs and perhaps to other tasks as well. en
dc.language.iso en en
dc.publisher European Language Resources Association (ELRA) en
dc.subject machine translation en
dc.subject morphology en
dc.subject bitext alignment en
dc.subject African languages en
dc.subject linguistics en
dc.subject computational linguistics en
dc.title Experiments with syllable-based English-Zulu alignment en
dc.type Article en
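
The abstract describes naively dividing Zulu text into syllables before alignment. The record does not give the exact splitting rule, so the following is only a minimal sketch under an assumed rule: every syllable is taken to end in a vowel, so a word is cut after each vowel. The names `VOWELS`, `syllabify`, and `syllabify_sentence` are hypothetical and are not taken from the paper.

```python
# Assumption: syllables are open (they end in a vowel), so we cut after
# every vowel. This is an illustrative rule, not the authors' documented one.
VOWELS = set("aeiou")


def syllabify(word: str) -> list[str]:
    """Split a word into naive syllables by cutting after each vowel."""
    syllables: list[str] = []
    current = ""
    for ch in word:
        current += ch
        if ch.lower() in VOWELS:
            syllables.append(current)
            current = ""
    if current:
        # Leftover consonants with no following vowel: attach them to the
        # previous syllable, or keep them alone if the word has no vowel.
        if syllables:
            syllables[-1] += current
        else:
            syllables.append(current)
    return syllables


def syllabify_sentence(sentence: str) -> list[list[str]]:
    """Apply the naive split to each whitespace-separated token."""
    return [syllabify(token) for token in sentence.split()]
```

For example, `syllabify("ngiyabonga")` yields `['ngi', 'ya', 'bo', 'nga']`. The resulting syllable sequences could then be passed to a word aligner in place of whole Zulu words. Note that this rule ignores details such as syllabic nasals, so it is cruder than a linguistically informed syllabifier.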
