SHOGUN is a machine learning toolbox designed for unified large-scale learning across a broad range of feature types and learning settings. It offers a considerable number of machine learning models, such as support vector machines for classification and regression, hidden Markov models, multiple kernel learning, linear discriminant analysis, linear programming machines, and perceptrons. Most of the algorithms can handle several different data classes, including dense and sparse vectors and sequences using floating point or discrete data types. We have used this toolbox in several applications from computational biology, some with no fewer than 10 million training examples and others with 7 billion test examples. With more than a thousand installations worldwide, SHOGUN is already widely adopted in the machine learning community and beyond.
SHOGUN is implemented in C++ and provides interfaces to MATLAB, R, Octave, Python, Lua, Java, C#, and Ruby, as well as a stand-alone command line interface. The source code is freely available under the GNU General Public License, Version 3, at http://www.shogun-toolbox.org.
During Summer of Code 2012 we are looking to extend the library in three different ways:
- Improving the accessibility of SHOGUN through better I/O support (more file formats) and mloss.org/mldata.org integration.
- Framework improvements (frameworks for regression, multiclass, and structured output problems, and quadratic programming solvers).
- Integration of existing and new machine learning algorithms.
Below is a list of suggested projects.
Please use the scheme shown below for your student application. If you have any questions, ask on the mailing list (firstname.lastname@example.org; note that you must be subscribed in order to post).