GSoC/GCI Archive
Google Summer of Code 2010

Open Bioinformatics Foundation

The OBF is a nonprofit, volunteer-run organization focused on supporting open source programming in bioinformatics. It acts as the umbrella organization for the BioPerl, Biopython, BioJava, BioRuby, BioSQL, and BioLib projects.


  • Application on (BioPerl) Alignment Subsystem Refactoring
    Bio::Align::AlignI and Bio::Assembly, which handle sequence alignment and assembly files, were developed during the early stages of BioPerl. Since then, several problems have been reported with these two packages, especially in light of the growing number of BioPerl packages and the huge amount of sequencing data generated by next-generation sequencing projects. This project aims to refactor the packages to improve their compatibility and efficiency.
  • BioJava Packages for Identification, Classification, and Visualization of Posttranslational Modification of Proteins
    In this project, three BioJava packages will be developed in order to identify PTMs in 3D protein structures, to generate sequence diagrams with PTM annotations, and to generate 2D tree images of carbohydrate (glycan) structures, respectively. These packages will allow for better annotations of PTMs on 3D structures and hence facilitate structural studies of PTMs.
  • Extending Bio.PDB: broadening the usefulness of Biopython's Structural Biology module
    Biopython is a widely used bioinformatics library. Its Bio.PDB module offers structural biologists several useful functions. However, there are several simple, biologically relevant questions that Bio.PDB still cannot answer. This project aims to extend Bio.PDB with methods (e.g. S-S bridge probing, polar hydrogen addition) that will make it even more attractive to the biological community.
  • Implementing Speciation & Duplication Inference Algorithm for Binary and Non-binary Species tree
    Implementing a speciation vs. duplication inference algorithm as an extension to BioRuby would make use of existing code infrastructure and also make the tool readily available to a community of phylogenetics researchers. Project Goals: 1) Implement Zmasek and Eddy's speciation/duplication inference algorithm for binary trees as an extension to BioRuby. 2) Extend the above speciation/duplication inference algorithm to support non-binary species trees as in Vernot et al.
  • Improvements to BioJava including Implementation of Multiple Sequence Alignment Algorithms
    Biologists infer evolutionary, structural, and functional relationships between biopolymers from similarities and divergences of primary structure in multiple sequence alignments. I plan to code a module for BioJava which manages an alignment and offers several implementation options. Addition of this framework to BioJava would assist Java-based bioinformatics tools as a reference version of current techniques and as a foundation on which to research variations of multiple sequence alignment.
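
To make the speciation/duplication inference project listed above more concrete, here is a minimal Python sketch of the underlying idea for binary trees. It is an illustration under stated assumptions, not the BioRuby extension or Zmasek and Eddy's implementation: it uses the standard criterion that each gene-tree node is mapped to the smallest species-tree clade containing all of its species, and that a node is a duplication when it maps to the same clade as one of its children. The tree encoding (nested tuples with string leaves), the function names, and the `gene_to_species` dictionary are all hypothetical choices made for this example. The mapping is computed naively by comparing sets rather than with an efficient traversal.

```python
def clades(tree):
    """Return (leaf set of this subtree, list of every clade in the subtree)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    members = frozenset()
    found = []
    for child in tree:
        child_members, child_clades = clades(child)
        members |= child_members
        found.extend(child_clades)
    found.append(members)
    return members, found


def smallest_clade(species_clades, species):
    """Smallest species-tree clade containing every species in `species`."""
    return min((c for c in species_clades if species <= c), key=len)


def _walk(node, species_clades, gene_to_species, events):
    """Post-order walk that returns the clade a gene-tree node maps to."""
    if isinstance(node, str):
        return smallest_clade(species_clades, frozenset([gene_to_species[node]]))
    left, right = node  # binary gene trees only in this sketch
    m_left = _walk(left, species_clades, gene_to_species, events)
    m_right = _walk(right, species_clades, gene_to_species, events)
    m_node = smallest_clade(species_clades, m_left | m_right)
    # Duplication: the node maps to the same clade as one of its children.
    kind = "duplication" if m_node in (m_left, m_right) else "speciation"
    events.append((node, kind))
    return m_node


def classify(gene_tree, species_tree, gene_to_species):
    """Label each internal gene-tree node, in post-order, with its event type."""
    _, species_clades = clades(species_tree)
    events = []
    _walk(gene_tree, species_clades, gene_to_species, events)
    return events


if __name__ == "__main__":
    # Hypothetical toy data: species A, B, C and five genes.
    species_tree = (("A", "B"), "C")
    gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
    gene_to_species = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
    for node, kind in classify(gene_tree, species_tree, gene_to_species):
        print(kind, node)
```

In the toy data, the node joining the two (A, B) gene pairs is the one where both children already span the same species clade, which is the situation the criterion flags as a duplication. Non-binary species trees, the second project goal, need a different treatment and are not covered by this sketch.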