GSoC/GCI Archive
Google Code-in 2012 Apertium

Extract Armenian adjective translations from Wiktionary

completed by: conor-f

mentors: Francis Tyers, Jonathan

Wiktionary has many translations for Armenian adjectives. For example, consider the page:

http://en.wiktionary.org/wiki/%D5%A1%D5%A6%D5%B6%D5%AB%D5%BE

 

Adjective

ազնիվ (azniv)

  1. honest, honest-minded
  2. fair

...

 

The idea of this task is to extract these translations into lttoolbox XML format as follows:

 

<e c=""><p><l>ազնիվ<s n="adj"/></l><r>honest<s n="adj"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>honest-minded<s n="adj"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>fair<s n="adj"/><s n="sint"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>straightforward<s n="adj"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>upright<s n="adj"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>straight<s n="adj"/><s n="sint"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>square<s n="adj"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>decent<s n="adj"/></r></p></e>
<e c=""><p><l>ազնիվ<s n="adj"/></l><r>noble<s n="adj"/><s n="sint"/></r></p></e>
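A minimal sketch of how such entries could be generated, assuming the numbered definition lines have already been pulled out of the Wiktionary page as plain strings. The function names (make_entry, entries_for) and the sint_words set are hypothetical and only illustrate the output format shown above; they are not part of lttoolbox or of any existing Apertium script.

```python
from xml.sax.saxutils import escape, quoteattr


def make_entry(left, right, comment="", sint=False):
    """Format one bilingual-dictionary entry for an adjective pair."""
    # The right side always carries "adj"; "sint" is appended only when asked for.
    r_tags = '<s n="adj"/>' + ('<s n="sint"/>' if sint else "")
    return '<e c=%s><p><l>%s<s n="adj"/></l><r>%s%s</r></p></e>' % (
        quoteattr(comment), escape(left), escape(right), r_tags)


def entries_for(headword, definition_lines, sint_words=frozenset()):
    """Turn definition lines such as "honest, honest-minded" into entries.

    Assumption: glosses within one definition line are separated by commas.
    sint_words is a hypothetical set of English adjectives known to take "sint".
    """
    out = []
    for line in definition_lines:
        for gloss in line.split(","):
            gloss = gloss.strip()
            if gloss:
                out.append(make_entry(headword, gloss, sint=gloss in sint_words))
    return out


if __name__ == "__main__":
    for entry in entries_for("ազնիվ", ["honest, honest-minded", "fair"], {"fair"}):
        print(entry)
```

Run on the two definition lines quoted from the page, this prints the first three entries of the sample above. Escaping the text with xml.sax.saxutils keeps the output well-formed if a gloss happens to contain characters such as & or <.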


You will need to look out for:

 

 

* making sure that comments are put in the comment field (the c="" attribute of each <e> element)

* ensuring that the 'sint' tag (<s n="sint"/>) is added where appropriate; you can retrieve this information from the English morphological analyser.
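The two points above could be handled roughly as follows. This is only a sketch under stated assumptions: that a comment appears as a leading parenthesised note in a Wiktionary gloss, and that the English analyser's output is available as a string in the Apertium stream format (^surface/lemma<tag>...$), for instance from lt-proc run on a compiled English analyser. The function names and the sample analysis strings are hypothetical, not real analyser output.

```python
import re


def split_comment(gloss):
    """Separate a leading parenthesised note from a gloss.

    Assumption: "(of a person) honest" should become the translation
    "honest" with the comment "of a person".
    """
    m = re.match(r"\s*\(([^)]*)\)\s*(.*)", gloss)
    if m:
        return m.group(2).strip(), m.group(1).strip()
    return gloss.strip(), ""


def has_sint(analysis):
    """Return True if any reading in the analysis is an adjective with sint."""
    body = analysis.strip().lstrip("^").rstrip("$")
    readings = body.split("/")[1:]  # the first field is the surface form
    return any("<adj>" in r and "<sint>" in r for r in readings)


if __name__ == "__main__":
    # Illustrative strings only; real output depends on the analyser used.
    print(split_comment("(of a person) honest"))
    print(has_sint("^fair/fair<adj><sint>/fair<n><sg>$"))
    print(has_sint("^honest/honest<adj>$"))
```

The comment returned by split_comment would go into the c attribute of the entry, and the result of has_sint decides whether <s n="sint"/> follows <s n="adj"/> on the English side.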

 

 

 

For further information about this task, join us on IRC: irc.freenode.net #apertium