Shogun Machine Learning Toolbox
Mailing List: mailto:email@example.com
- Improving accessibility to Shogun by improving I/O support (more file formats) and mloss.org/mldata.org integration.
- Framework improvements (frameworks for regression, multiclass, and structured output problems, and quadratic programming solvers).
- Integration of existing and new machine learning algorithms.
- Build Generic Structured Output Learning Framework: The aim is to implement tools for structured output (SO) problems. The data in these problems have complex structure (e.g. graphs, sequences), and traditional learning algorithms fail to find solutions efficiently. Structured output support vector machines and conditional random fields are methods for SO learning. They will be implemented to form Shogun's first module for SO learning. Finally, these methods will be applied to hidden Markov model-type problems such as gene prediction.
- Build generic multiclass learning framework: I'm a student with strong machine learning and open source programming experience. I'm applying for a project that will implement a generic multiclass learning framework in Shogun. While Shogun is the state-of-the-art toolbox for binary classifiers, more multiclass methods need to be added to make it competitive in this area. Many real-world problems are naturally multiclass, so adding strong multiclass support to Shogun would benefit a large community.
- Bundle method solver for structured output learning: Learning structured output classifiers leads to a convex minimization problem that is not tractable by standard algorithms. The ML community has put significant effort into developing specialized solvers, among which the Bundle Method for Risk Minimization (BMRM), implemented e.g. in the popular StructSVM, is the current state of the art. The BMRM is a simplified variant of bundle methods, which are standard tools for non-smooth optimization. The simplicity of the BMRM comes at the cost of reduced efficiency. Experiments show that a careful implementation of the classical bundle method performs significantly faster (a speedup of roughly 5-10x) than the variants (like BMRM) adopted by the ML community. The goal is an open-source library implementing the classical bundle method for SO learning and its integration into Shogun.
- Implement multitask and domain adaptation algorithms: Multitask learning is a modern approach to machine learning that learns a problem together with other related problems at the same time, using a shared representation. This approach often leads to a better model for the main task, because it allows the learner to exploit the commonality among the tasks. The proposed project will implement various multitask learning algorithms for the Shogun toolbox.
- Implementation of latent SVM: Implementation of a general-purpose latent SVM.
- Implementing Gaussian Process Regression in Shogun: This project focuses on implementing Gaussian Process Regression with hyperparameter learning in Shogun. The goal is to make the implementation easily extendable and able to handle large datasets through sparse approximation.
- Kernel-based two-sample and independence tests: Statistical tests for dependence or difference are an important tool in data analysis. However, when data is high-dimensional or in non-numerical form (strings, graphs), classical methods fail. This project implements recently developed kernel-based generalizations of statistical tests, which overcome this issue. The kernel two-sample test based on the Maximum Mean Discrepancy (MMD) tests whether two sets of samples are from the same or from different distributions. Related to the kernel two-sample test is the Hilbert-Schmidt Independence Criterion (HSIC), which tests for statistical dependence between two sets of samples. Multiple tests based on the MMD and the HSIC are implemented along with a general framework for statistical tests in Shogun.
- Various usability improvements: Shogun is a fairly large project, and it needs more than machine learning algorithms. Maintenance, improvement of individual parts, and integration of those parts into the interfaces are also important tasks. The proposed project covers various usability improvements for the Shogun toolbox.
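
To make the kernel two-sample test from the list above more concrete, here is a minimal sketch in plain Python. It is not Shogun's implementation or API: the function names (`gaussian_kernel`, `mmd2_biased`, `permutation_p_value`), the Gaussian kernel with a fixed width `sigma`, and the use of a permutation test to obtain a p-value are all assumptions chosen for illustration. The sketch computes the biased estimate of the squared MMD and compares it against values obtained after randomly reassigning the pooled samples to the two groups.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2_biased(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared MMD between two samples.

    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y),
    where the means run over all pairs, including identical ones.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


def permutation_p_value(xs, ys, num_permutations=200, seed=0):
    """Approximate p-value for the null hypothesis 'same distribution'.

    The pooled samples are shuffled and split again; the p-value is the
    fraction of shuffles whose statistic is at least the observed one.
    """
    rng = random.Random(seed)
    observed = mmd2_biased(xs, ys)
    pooled = list(xs) + list(ys)
    m = len(xs)
    at_least_as_large = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        if mmd2_biased(pooled[:m], pooled[m:]) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_large + 1) / (num_permutations + 1)
```

Each sample is a list of numeric vectors, for example `xs = [[0.1], [0.4]]`. A small p-value suggests that the two samples come from different distributions. This version recomputes every kernel value on each permutation for clarity; precomputing the pooled kernel matrix once would be the usual way to speed it up.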