GSoC/GCI Archive
Google Summer of Code 2012

Python Software Foundation

Web Page: http://wiki.python.org/moin/SummerOfCode/2012

Mailing List: http://mail.python.org/mailman/listinfo/soc2012-general

The mission of the Python Software Foundation is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.

For Summer of Code, the Python Software Foundation serves as an umbrella organization for a number of community-based Python projects. Browse our ideas page for links to these projects and information on how to apply.

Projects

  • A historical time-line for Tryton client Development of a widget that displays, in a user-friendly way, historical time-line data fields stored on the Tryton server, filtered by the date and time at which users submitted them.
  • Add a calendar view in the GTK client The project goal is to add a calendar view to Tryton in which some records will be displayed. Furthermore, this calendar will allow the user to directly modify a record's date, time, and duration as well as its content.
  • Autogenerate Colander schema from SQLAlchemy metadata Implement an autogenerated Colander[0] schema from SQLAlchemy[1] metadata. The main use case is to easily generate forms out of SQLAlchemy models while keeping the DRY principle in mind. An admin interface for an SQL database may not need customization, but in an end-user application the developer may need to change any of the attributes reflected from the SQLAlchemy model. Colander's design makes it easy to provide such functionality. Needless to say, I will do test-driven development and deliver 100% test coverage and documentation.
  • Create a new version of the Pickle serializer for Python The purpose of this project is to improve Pickle for Python 3 by adding 64-bit compatibility, hard-coding the pickling and unpickling of sets and similar types, more compact pickling of small strings, and so on. This has been proposed in PEP 3154.
  • Cython Pxd generation using gcc-python-plugin Cython uses pxd files like headers: they declare types and functions so that Cython can generate the C wrapper code needed to call into other libraries. Creating pxd files for large libraries is very time-consuming and entirely manual work. A tool that parses the necessary headers to find declarations automatically would be a great help to the community.
  • Easy networking in PyGame Based on an idea from the PyGame GSoC 2012 list, I would like to propose creating an easy networking module for PyGame. The basic idea is to write something similar to the PodSixNet library, but better integrated with PyGame.
  • Empirical Likelihood in Statsmodels In 1990, Art Owen published “Empirical Likelihood Ratio Confidence Regions” in The Annals of Statistics, which ignited a flurry of research exploring the techniques and possibilities of empirical likelihood estimation. In 2009, statsmodels was released as its own package for statistical computing in the Python language and has subsequently grown to include, among others, linear, nonlinear, and time series regression models. In a way, empirical likelihood estimation and statsmodels are similar: both are relatively new in their respective fields but are packed with unexploited opportunities that can benefit researchers, financial analysts, and policymakers alike. That is why I propose implementing empirical likelihood estimation in statsmodels for my Google Summer of Code 2012 project.
  • Fast Numerical Computing with Cython This project proposes to support fast array expressions for Cython through efficient elementwise traversal that maximizes cache reuse and lends itself to auto-vectorization on the CPU, as well as to provide optional OpenCL code specializations for GPU execution. Partial reductions and elementwise user functions will be supported in array expressions, and boolean index assignment (and possibly evaluation) will be implemented. Optionally, we propose enhancements to the current parallel support to allow OpenCL as a backend. This project will also investigate code reuse with similar projects such as Theano, NumExpr, and Numba.
  • GNU Mailman: Improving archives by extending HyperKitty My project will focus on implementing new features in the HyperKitty archiver.
  • GNU Mailman: Metrics Mailman doesn't have any built-in statistical capabilities. The idea is to build list metrics and report them via a pluggable dashboard.
  • GSoC proposal for Twisted - Automatic Coding Standard Enforcement Twisted applies certain naming and style standards to all contributed code. Currently, a human reviewer needs to check all of these things. The purpose of this project is to develop a tool that can make these simple, mechanical checks automatically, freeing up human reviewer time to focus on more important aspects of proposed changes. It will also speed up the review process.
  • Implement fast generic Path finding in PySoy. I will attempt to implement a generic method of path-finding within the PySoy framework allowing one to simply give a Body a start and end point and know that the Body will find its way there along a reasonably optimal path.
  • Improve the bug tracker and the Rietveld integration The goal of this project is to improve the Python bug tracker at bugs.python.org and its integration with Rietveld.
  • Improving memory management of extension modules Using the embedded Python interpreter between Py_Initialize() and Py_Finalize(), or in the form of a sub-interpreter, should not leak memory inside the current process, nor should it share the state of objects inside imported modules with other interpreters. Because of the way the majority of Python extension modules are currently implemented, this is not the case. This proposal presents a possible refactoring of all the affected extension modules in the Python standard library to solve this deficiency.
  • More Controller Support for PySoy The main goal of the project is to complete the X11 input driver for Wiimotes, implement another driver for PS3 controllers (Sixaxis), and build a controller API encapsulating all controllers.
  • Need for scikit-learn speed The scikit-learn library has become very successful due to its complementary goals of being comprehensive and efficient. Many core pieces of code are highly efficient and compete with famous implementations. This project aims to bring a uniform standard of speed through quality iterations over the code of the less optimized corners of the codebase. We will also add a continuous benchmark in the spirit of speed.pypy.org that will make it easy to ensure that what's fast stays fast, and what's slow is easy to find.
  • Optimizing sparse linear models using coordinate descent and strong rules. Scikit-learn is a Python machine learning library that aims to be easy to use through a well-designed interface. Dependencies are kept to a minimum, and the extensive use of NumPy, SciPy, and Matplotlib gives great computing power and makes the codebase easier to understand. Data sets with far more features than samples are rapidly gaining in popularity and bring the class of linear models back into focus. This project will bring additional state-of-the-art optimization routines for sparse linear models to scikit-learn and reduce dependencies even further. All code accepted to scikits.learn must include high test coverage, documentation, examples, and intensive benchmarking. Linear models are used for regression as well as for classification tasks. In both cases, penalty terms can be used to obtain an implicit feature selection. Solving these penalized linear models efficiently is the main focus of this project. The improvements proposed here will be beneficial for a wide range of general problems and specialized domains such as gene expression analysis, compressed sensing, and sparse encoding.
  • Plots in pandas If successfully completed, the proposed changes would give pandas users access to advanced plotting techniques and nice graphical output through matplotlib.
  • Pygame: GUI toolkit I would like to reach a stable version of this GUI toolkit, meaning that it will be ready for general use and will be in a good, maintainable state.
  • Pygame: Improved Sprite and Scene system A re-design of the sprites module, creating a generalized code base to make it more flexible, and a new scene/director system to encapsulate sprites into distinct game sections and control their workflow.
  • Pylint improvements Pylint is widely used by the Python community, but it is quite old (about 10 years) and the need for modernization is beginning to emerge. First, many tickets are open (currently 184), some of them for several years. Moreover, parsing Pylint output is not easy, and there have been several requests for JSON output. Finally, improving Python 3000 compatibility will help Python developers move to Python 3000.
  • PySoy: Improve testing, documentation and examples PySoy currently lacks proper test coverage, which harms the development of the engine and makes it easier to reintroduce past bugs while developing. Furthermore, the recent merger of the experimental branch has left some of the documentation outdated and most of the example programs non-functional. This proposal is a project to write a set of unit tests and doctests, write new example programs based on the old ones to showcase PySoy's capabilities, and, overall, to fix minor bugs, clean up the codebase, and improve documentation.
  • Soy Client for Android This project involves porting libsoy and all of its dependencies to the Android platform using Android's Native Development Kit. Along the way, modifications to libsoy and its dependencies may be required. The end result will be a fully functional PySoy game client on Android.
  • statsmodels : estimating system of equations Statsmodels provides classes and functions for the estimation of many different statistical models. It currently has many features but no support for estimating systems of structurally related equations. Since many statistical analyses (e.g., in econometrics and biostatistics) are based on systems of equations, my proposal is to provide the capability to estimate systems of linear equations within the statsmodels module and to provide tools for statistical tests.
  • statsmodels nonparametric estimation Statsmodels is a pure Python-based statistics and econometrics package that has drawn significant attention from applied practitioners in the fields of finance, economics, and the social sciences. Many of the basic econometric methods have been developed. In addition, some impressive work has been done in developing time series methods, VARs, and DSGE models, to name a few. This GSoC I intend to develop the nonparametric capabilities of statsmodels by focusing in particular on data-driven bandwidth selection procedures, conditional and unconditional multivariate probability and cumulative density estimation, and the implementation of popular nonparametric and semiparametric regression models.
  • Twisted : Expanded Endpoints Support Recently, two new APIs, IStreamServerEndpoint and IStreamClientEndpoint, were added to Twisted for specifying which address a server should listen on for connections and which address a client should connect to, respectively. But not all of the address types that Twisted supports have endpoint support; at present, endpoints have been implemented for TCP, SSL, and UNIX domain sockets. My project deals with adding more endpoint implementations to Twisted, some involving wrappers around existing APIs (e.g. serial ports, standard I/O), others involving fresh APIs for cases where setting up connections was difficult before the addition of endpoints (e.g. SOCKS and HTTPS proxies).