Gaussian Processes for Classification
Short description: Gaussian Processes provide a probabilistic approach to supervised machine learning. The SHOGUN Toolbox already provides a flexible Gaussian Process (GP) framework for regression. This project is about extending the existing GP framework to classification.
Personal information
Name: Roman Votyakov
Short bio / background:
At university I have successfully completed courses in linear algebra, probability theory and statistics, numerical analysis, and single/multivariable calculus, as well as Andrew Ng's machine learning course on Coursera.
I have a solid background in optimization, algorithms, and data structures, and have participated in many algorithm competitions.
I have experience in computational neuroscience and neural networks: I have completed projects on handwritten digit recognition with a multilayer perceptron and on the flow shop scheduling problem with a Hopfield neural network.
I also have a good understanding of Gaussian Process basics and of Bayesian inference approaches to machine learning.
Email: firstname.lastname@example.org (I have subscribed to the mailing list)
Hardware specifications and operating system:
Primary: PC with Intel(R) Core(TM) i7 CPU 860 and 4GB RAM / Ubuntu 13.04
Optional: Laptop Asus K52DE with AMD Turion(tm) II P520 Dual-Core Processor and 3GB RAM / Ubuntu 12.04
The development environment (IDE, git, required libraries, etc.) has been successfully set up on both machines.
- c++ - 4 (Various projects of varying difficulty)
- python - 2 (Some personal scripts)
- octave/matlab - 3 (Various numerical and machine learning algorithms)
- Shogun, Eigen3, GPML Toolbox - 1-2 (Some experience, gained while writing patches for SHOGUN and running experiments)
Machine Learning experience:
I have worked with classifiers such as logistic regression, SVMs, naive Bayes, and the multilayer perceptron, and also on solving the flow shop scheduling problem with a Hopfield neural network. At university I'm working on Hierarchical Temporal Memory (HTM), which combines approaches from machine learning (Bayesian networks, neural networks, deep learning) and neuroscience.
Me and SHOGUN
I have been involved in the development of the SHOGUN Toolbox and have made several contributions.
I fixed some bugs:
- Recomputing of Cholesky factor in ExactInferenceMethod class
- Segfaults in GP examples
- Wrong posterior results in the FITC inference method
- Negative values for the variance in GP regression
I also wrote some unit tests and a test case:
- Unit test for the Laplace inference method
- Unit test for the FITC inference method
- Test case for the GP regression unit test
I also made some improvements to the GP framework architecture:
The degrees-of-freedom field (m_df) was moved from CLikelihoodModel to CStudentsTLikelihood, so the Student's-t likelihood can now be used with different values of the degrees of freedom.
My contributions can be seen on GitHub (username: votjakovr).
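To illustrate the idea behind that refactor, here is a minimal Python sketch (hypothetical, not SHOGUN code; class and method names are mine) of a Student's-t likelihood that owns its degrees-of-freedom parameter instead of inheriting it from a common likelihood base class:

```python
from math import lgamma, log, log1p, pi

# Hypothetical sketch, not SHOGUN code: the degrees-of-freedom parameter
# lives inside the Student's-t likelihood itself, so other likelihood
# classes are not forced to carry a field they never use.
class StudentsTLikelihood:
    def __init__(self, df=3.0, sigma=1.0):
        self.df = df        # degrees of freedom (the moved m_df field)
        self.sigma = sigma  # scale parameter

    def log_pdf(self, y, f):
        """log p(y | f) under a Student's-t observation model."""
        nu, s = self.df, self.sigma
        z = (y - f) / s
        return (lgamma((nu + 1) / 2) - lgamma(nu / 2)
                - 0.5 * log(nu * pi) - log(s)
                - (nu + 1) / 2 * log1p(z * z / nu))
```

With the parameter on the subclass, using the likelihood with a different number of degrees of freedom is just a constructor argument, e.g. `StudentsTLikelihood(df=5.0)`.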
I have no prior experience with open-source development, but I don't think this is a problem: I'm already familiar with the SHOGUN development workflow and have a good knowledge of version control.
I would like to implement Gaussian Processes for classification, finish the basic GP framework in the SHOGUN Toolbox during GSoC 2013, and possibly improve it with more advanced features in the future.
I think I can successfully finish this project because I have a good knowledge of linear algebra, probability theory, and statistics, and I'm familiar with both the existing GP regression framework and the theoretical foundations of Gaussian Process regression from Rasmussen & Williams (2006). I also have experience with LaTeX, which is required for writing documentation.
Gaussian Processes for Classification project (Proposal)
I chose this project because I'm interested in probabilistic approaches to machine learning (in particular, Gaussian Processes).
I enjoy working in a team, sharing knowledge, and encouraging the development of others to achieve team goals. I'm enthusiastic about solving problems and have the flexibility and skills needed to handle a challenging job.
I can work full time (40 h/week) on this project during GSoC, and approximately 15-20 h/week before and after.
- Improving and debugging the existing GP regression framework, writing examples and tests; discussing the proposal and project architecture - 3-4 weeks (*)
- Implementing binary classification based on the classical logit likelihood and other likelihoods (error function, uniform) - 1-2 weeks
- Extending the existing Laplace inference method to binary classification - 1 week
- Implementing the expectation propagation (EP) method - 2 weeks
- Writing integration and unit tests for the GP classification framework - 1 week
- Extending the GP framework to multiclass classification, with integration and unit tests - 1 week
- Extending the existing model-selection methods to the classification setting - 1 week
- Writing an interactive graphical demo for the full GP framework - 1 week
- Writing documentation for the full GP framework - 1-2 weeks
I plan to spend the time marked with (*) on the corresponding subtask before GSoC 2013 starts.
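To make the Laplace-inference and logit-likelihood items in the plan concrete, here is a minimal NumPy sketch of Newton mode finding for binary GP classification with the logistic likelihood, following Algorithm 3.1 of Rasmussen & Williams (2006). It is illustrative only and deliberately ignores SHOGUN's actual class structure; the function name `laplace_mode` is mine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_mode(K, y, n_iter=50, tol=1e-9):
    """Find the posterior mode f_hat of p(f | X, y) by Newton's method
    (Rasmussen & Williams 2006, Algorithm 3.1 -- illustrative sketch).

    K : (n, n) prior covariance matrix; y : labels in {-1, +1}.
    Returns the mode f_hat and the Laplace approximation to the
    log marginal likelihood.
    """
    n = len(y)
    f = np.zeros(n)
    obj_old = -np.inf
    for _ in range(n_iter):
        pi = sigmoid(f)
        grad = (y + 1) / 2 - pi            # d/df log p(y | f), logistic likelihood
        W = pi * (1 - pi)                  # -d^2/df^2 log p(y | f)
        sqrt_W = np.sqrt(W)
        B = np.eye(n) + sqrt_W[:, None] * K * sqrt_W[None, :]
        L = np.linalg.cholesky(B)
        b = W * f + grad
        # Numerically stable Newton step: a = b - W^{1/2} B^{-1} W^{1/2} K b
        v = np.linalg.solve(L, sqrt_W * (K @ b))
        a = b - sqrt_W * np.linalg.solve(L.T, v)
        f = K @ a
        # Objective: -1/2 f^T K^{-1} f + log p(y | f), using a = K^{-1} f
        obj = -0.5 * a @ f + np.sum(np.log(sigmoid(y * f)))
        if abs(obj - obj_old) < tol:
            break
        obj_old = obj
    # Approximate log marginal likelihood: obj - 1/2 log|B|
    # (W barely changes between the last two iterations at convergence)
    log_Z = obj - np.sum(np.log(np.diag(L)))
    return f, log_Z
```

The same `W`, `grad`, and Cholesky machinery is what the EP and model-selection items build on, which is why the likelihood interface (values and first/second derivatives of log p(y | f)) is the natural extension point of the framework.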
I also plan to write a blog about my progress on the Gaussian Processes for Classification project and my SHOGUN experience.
Rasmussen, C. E., & Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press.
Special thanks to the SHOGUN mentors (in particular Heiko Strathmann and Sergey Lisitsyn) for their help during the preparation of this proposal.