Train part-of-speech taggers for Dutch and Afrikaans
completed by: AureiAnimus
mentors: Francis Tyers
The aim of this task is to train part-of-speech taggers for Dutch and Afrikaans as part of the apertium-af-nl MT system. This involves writing a TSX file for each language, then running the unsupervised training process as described on the Wiki. As a corpus for Afrikaans, use the Afrikaans Wikipedia; for Dutch, use the EuroParl corpus. You should also write 5 to 10 forbid/enforce rules for each tagger, based on a brief survey of disambiguation errors.
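
For orientation, a TSX file might look roughly like the sketch below. It only illustrates the general shape of the format: a tagset of coarse labels, one forbid rule, and one enforce rule. The tagger name, the label names, the tag patterns (n, det, adj, vblex) and both rules are assumptions made for illustration. They are not taken from the apertium-af-nl dictionaries or from an actual survey of disambiguation errors, so the real files will differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: every label, tag pattern and rule here is hypothetical. -->
<tagger name="afrikaans">
  <tagset>
    <!-- Each coarse label groups fine-grained tags from the morphological dictionary. -->
    <def-label name="NOUN">
      <tags-item tags="n"/>
      <tags-item tags="n.*"/>
    </def-label>
    <def-label name="ADJ">
      <tags-item tags="adj"/>
      <tags-item tags="adj.*"/>
    </def-label>
    <def-label name="VERB">
      <tags-item tags="vblex.*"/>
    </def-label>
    <!-- closed="true" marks a closed word class. -->
    <def-label name="DET" closed="true">
      <tags-item tags="det.*"/>
    </def-label>
  </tagset>

  <!-- Forbid rule (hypothetical): a determiner is never directly followed by a verb. -->
  <forbid>
    <label-sequence>
      <label-item label="DET"/>
      <label-item label="VERB"/>
    </label-sequence>
  </forbid>

  <!-- Enforce rule (hypothetical): after a determiner, only a noun or an adjective may follow. -->
  <enforce-rules>
    <enforce-after label="DET">
      <label-set>
        <label-item label="NOUN"/>
        <label-item label="ADJ"/>
      </label-set>
    </enforce-after>
  </enforce-rules>
</tagger>
```

Rules this strict would probably be too aggressive for real Dutch or Afrikaans text; the actual 5 to 10 rules per tagger should come from the errors observed in the survey.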