Google Summer of Code 2012: R Project for Statistical Computing

Biganalysis: A robust, general-purpose R package for large scale classification

by Xiaolin Yang for the R Project for Statistical Computing

We propose to develop a robust, general-purpose R package for large-scale classification. Our aim is to implement several novel and reliable rank-based classification and feature selection methods, including linear discriminant comparison analysis (LDCA), pairwise-comparison-based classification and regression trees (TSP-CART), a TSP-CART-based random forest, and a TSP-CART-based gradient boosting algorithm. We will also consider structured versions of these algorithms, which make it easy to incorporate prior structural information into the data analysis. Targeted applications of this package include large-scale scientific data analysis, marketing data analysis, and web data analysis.