GSoC/GCI Archive
Google Summer of Code 2011

LanguageTool

Web Page: http://languagetool.wikidot.com/missing-features

Mailing List: https://lists.sourceforge.net/lists/listinfo/languagetool-devel

LanguageTool is an open-source proofreading tool that supports traditional checks associated with grammar checkers but goes far beyond that to check style and frequent language misuse. It supports over 20 languages, and more are in development. LanguageTool is integrated into multiple environments, including OpenOffice.org, vim, OmegaT, and LyX.

The code repo for GSoC will be at http://code.google.com/p/google-summer-of-code-2011-languagetool/. Every piece of code is also available on SourceForge.

Projects

  • Adding Rule Conversion and Language Detection Functionality to Language Tool: This project will incorporate an open-source language detection library into LanguageTool, so that the user does not have to set the language manually. It will also develop a rule conversion module that supports rules in both the Constraint Grammar and After the Deadline styles.
  • Lucene Based Fast Rule Evaluation for LanguageTool with Chinese Language Support: I will develop a fast rule evaluation tool for LanguageTool. Lucene is used to index large corpora such as Wikipedia together with their POS tags, and to run fast queries derived from the rules. This will greatly speed up the checking of new rules and therefore the creation of new rules. I will also contribute Chinese language support. Lessons on Chinese pattern rule creation learned from this project will benefit the development of other Eastern languages in the future.
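
The idea behind the second project, indexing a POS-tagged corpus so that a new rule only has to be run on sentences that could possibly match, can be illustrated with a small sketch. This is not Lucene and not LanguageTool code: it is a toy in-memory inverted index in plain Python, and the corpus, the rule format, and all function names (`build_index`, `candidates`, `matches`, `find_matches`) are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy corpus: each sentence is a list of (token, POS tag) pairs.
CORPUS = [
    [("I", "PRP"), ("has", "VBZ"), ("a", "DT"), ("dog", "NN")],
    [("She", "PRP"), ("has", "VBZ"), ("a", "DT"), ("cat", "NN")],
    [("Dogs", "NNS"), ("bark", "VBP")],
]


def build_index(corpus):
    """Map every token and every POS tag to the ids of sentences containing it."""
    index = defaultdict(set)
    for sent_id, sentence in enumerate(corpus):
        for token, pos in sentence:
            index[("token", token.lower())].add(sent_id)
            index[("pos", pos)].add(sent_id)
    return index


def candidates(index, pattern):
    """Coarse filter: ids of sentences that contain every pattern element.

    Word order is ignored here, so the full pattern still has to be
    checked on each returned sentence.
    """
    sets = [index.get(element, set()) for element in pattern]
    return set.intersection(*sets) if sets else set()


def matches(sentence, pattern):
    """Exact check: do the pattern elements occur as consecutive tokens?"""
    def fits(pair, element):
        kind, value = element
        token, pos = pair
        return token.lower() == value if kind == "token" else pos == value

    n = len(pattern)
    return any(
        all(fits(sentence[i + j], pattern[j]) for j in range(n))
        for i in range(len(sentence) - n + 1)
    )


def find_matches(corpus, index, pattern):
    """Run the exact check only on sentences that pass the index filter."""
    return sorted(i for i in candidates(index, pattern) if matches(corpus[i], pattern))


INDEX = build_index(CORPUS)

# Hypothetical rule: the token "i" directly followed by a verb tagged VBZ.
RULE = [("token", "i"), ("pos", "VBZ")]
```

Calling `find_matches(CORPUS, INDEX, RULE)` first narrows the corpus to the sentences that contain both elements and only then checks their order. With a real index over a corpus the size of Wikipedia, that first step is what would keep the evaluation of a new rule fast.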