Google Code-in 2014 Apertium

Figure out why begiak's URL catcher still has some encoding issues and fix them

completed by: Mikhail Ivchenko

mentors: Jonathan

Begiak (our IRC bot) gives the title of a page when a message in the channel contains a URL. However, it sometimes has problems with certain encodings. Figure out why pages like Êàçàõñêèé ÿçûê. Ãðàììàòèêà. Äàâíî ïðîøåäøåå âðåìÿ (î÷åâèäíîå); Êàçàõñêèé ÿçûê. Ãðàììàòèêà. Ïîñëîâèöû - Ó÷åíèå, çíàíèå; Ãîñóäàðñòâåííîå ó÷ðåæäåíèå "Ðåäàêöèÿ ãàçåòû "ÊÀÐÀ×ÀÉ"; and Türk Dili Konuşan Ülkeler İşbirliği Konseyi aren't being displayed properly, and fix begiak so that they are. Make sure other page titles keep being displayed properly. For this task, you should fork the bot on GitHub and send a pull request when you're done.
For further information and guidance on this task, you are encouraged to come to our IRC channel.
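
As a starting point for investigating, the garbled titles quoted above look like Windows-1251 (Cyrillic) bytes that were decoded as Latin-1. The sketch below is not begiak's actual code; the function names (repair_mojibake, sniff_charset, extract_title) are hypothetical and only illustrate one possible approach: take the charset from the Content-Type header or from a meta tag in the page before decoding, and fall back to UTF-8 otherwise.

```python
import re

def repair_mojibake(text, wrong="latin-1", right="cp1251"):
    """Undo a wrong decode: recover the raw bytes, then decode them correctly.

    The default codec pair is an assumption based on the sample titles.
    """
    return text.encode(wrong).decode(right)

# Matches charset=... in a Content-Type header or in a <meta> tag.
_CHARSET_RE = re.compile(rb'charset=["\']?([\w-]+)', re.IGNORECASE)

def sniff_charset(content_type, body):
    """Return the declared charset, checking the header first, then the page."""
    header = (content_type or "").encode("ascii", "ignore")
    for haystack in (header, body[:2048]):
        match = _CHARSET_RE.search(haystack)
        if match:
            return match.group(1).decode("ascii")
    return None

def extract_title(content_type, body):
    """Decode the page with its declared charset and return the <title> text."""
    charset = sniff_charset(content_type, body) or "utf-8"
    try:
        text = body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: degrade gracefully instead of crashing.
        text = body.decode("utf-8", errors="replace")
    match = re.search(r"<title[^>]*>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None
```

For example, repair_mojibake("Êàçàõñêèé ÿçûê") returns "Казахский язык", which suggests the affected pages are served as Windows-1251 without the bot honouring that declaration. Whether the charset is announced in the header, in a meta tag, or not at all would need to be checked for each of the pages listed in the task.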