GSoC/GCI Archive
Google Summer of Code 2015

Python Software Foundation

License: Python Software Foundation License

Web Page: https://wiki.python.org/moin/SummerOfCode/2015

Mailing List: https://mail.python.org/mailman/listinfo/soc2015-general

Python is an interpreted, general-purpose high-level programming language whose design philosophy emphasizes code readability. Python aims to combine "remarkable power with very clear syntax", and its standard library is large and comprehensive. Its use of indentation for block delimiters is unique among popular programming languages.

The Python Software Foundation serves as an "umbrella organization" to a variety of Python-related projects, as well as sponsoring projects related to the development of the Python language.

Projects

  • "Astropy: Observation Planning and Scheduling Toolbox" Planning telescope observations using Python presently requires non-trivial amounts of coding in order to calculate rise/set times, airmass, moon separation, etc. I'd like to build an observational tool set using existing Astropy modules that would allow the user to quickly produce tables and plots with just a few lines of code. Users could modify the toolbox to suit their own needs with optional profiles. Other products may include IPython notebook examples with sample data and a simple GUI.
  • [Scrapinghub] Splash: Modernize Splash Port Splash to run on both Python 2.7 and Python 3.x, and on PyQt5 and Qt 5.x. Make it run on a more recent OS (Ubuntu 14.04).
  • Astropy: Adding indexing capability to Table object The Table class is a central data structure in Astropy and is useful for manipulating tabular data. Therefore, adding indexing capability to Table would allow for increased efficiency in searching for individual rows in a table. My proposal will involve implementing indexing functionality for Table while ensuring that adding and removing records from each table works correctly in terms of the internal indexing system and that addition/removal operations suffer minimal slowdown due to indexing.
  • Astropy: background modeling for Gammapy Objective: implement the most successful background modeling methods largely in use by the gamma-ray community in the Astropy/Gammapy framework in two steps: 1. Implement tools to create background model templates from observations with no or only a few gamma-ray sources in the field of view. 2. Develop algorithms to estimate the background in observations containing gamma-ray sources using the model templates from the first step.
  • Astropy: Efficient and Precise Model Rasterization Fitting models to sources in images is at the heart of astronomical research. However, evaluating complex models precisely on large scales is computationally expensive. The primary goals of this project are to 1) Develop a more efficient model rasterization method for Astropy, and 2) Develop a "showcase" web app demo using pyJS9 to interface Astropy with JS9 in the browser.
  • Astropy: Observing planning/scheduling tools We propose a new astropy-affiliated package for planning and scheduling astronomical observations. This package will allow observers to enter a table of astronomical targets and a range of observing dates in order to retrieve (1) the sky-coordinates for their targets, (2) rise/set times, moon/sun separation angles, airmass ephemerides, and other essential positional criteria necessary for determining target observability, and (3) a roughly optimized observing schedule for the list of targets.
  • Binjitsu: Multi-arch ROP&&SIGRET Assistance&&Moar Exploits&&Format Strings The pwn challenges in CTF events are becoming more and more difficult, and challenges for more kinds of architectures keep appearing. Traditional ROP code may no longer be suitable, so new methods for ROP or exploitation should be presented. Finding bugs manually in a CTF binary is also painful work. So besides the required work, I also plan to take this opportunity to do some exciting things, such as Automatic Exploit Generation, Automatic Binary Vulnerability Hunting, etc.
  • Core-Python: bugs.python.org improvements This proposal emphasizes some representative feature requests on b.p.o. We categorize the features into four sets, namely patch, workflow, ui and cli. For each set, a week-based iterative development strategy is introduced to ensure steady progress. In the long term, this proposal will not only save the community an incalculable amount of time, but also encourage more and more developers to contribute to CPython thanks to an easier development cycle.
  • CPython: RESTful API for Roundup A RESTful API is a simple way to provide communication for web services, which are used widely across the Internet. Our project aims to implement a RESTful API for Roundup, thus enabling developers to create and develop new services and features that can easily keep track of the data for users.
  • Generalized linear mixed models for Statsmodels I propose to implement Generalized Linear Mixed Models for Statsmodels. The ultimate goal would be to have something comparable to glmer in R. This will utilize existing GLM and MLE frameworks in Statsmodels. The project will be a success if stable results matching other packages can be obtained for commonly used models. Performance optimization and post-estimation will not be completed but the results of this project will serve as a foundation for future high-capability mixed GLM procedures.
  • GNS3 Docker support GNS3 is a network simulator that uses Dynamips, VirtualBox and QEMU to simulate network nodes. Docker is a highly flexible VM platform that uses Linux namespaces and cgroups to isolate processes inside what are effectively virtual machines. This would enable GNS3 users to create their custom virtual machines and move beyond the limitations of nodes that are network oriented and because of its lightweight implementation, would make it possible to run thousands of standalone Linux servers on GNS3.
  • Implementing advanced diffusion kurtosis imaging techniques in Dipy Diffusion-weighted MRI (dMRI) is a non-invasive technique that allows mapping of brain connections and quantification of tissue properties in vivo. Diffusion Kurtosis Imaging (DKI) is a new dMRI modality that overcomes the limitations of the dMRI techniques most widely used by the medical community. The objective of this project is to implement DKI modules in Dipy. My vision is to improve the reliability and reproducibility of DKI studies by providing the first open source DKI modules.
  • Improve nonlinear least squares minimization functionality in SciPy The main goal is to implement a new solver for nonlinear least squares minimization based on a modified dogleg approach. The solver will be able to work with large sparse Jacobian matrices and efficiently handle bound constraints on some of the variables. The second goal is to implement a separable nonlinear least squares algorithm. This algorithm is applicable when an approximating function is sought as a linear combination of basis functions, and it usually enjoys much faster convergence.
  • Improve VisPy The aim of the project is to improve VisPy, a rendering and plotting library for Python. This can be divided into three broad sub-tasks: (1) port interesting examples from other graphics libraries like Glumpy to VisPy; (2) bring high-level plotting constructs to VisPy; (3) improve performance by implementing batching (called collections).
  • IMS - ERAS : Virtual-Reality based Telerobotics Virtual European Mars Analog Station (V-ERAS) is based on immersive real-time virtual simulations running on top of the Blender Game Engine (BGE). This V-ERAS project has two distinct components. First, it entails the teleoperative control of the robot rover's motion via human body-tracking. Second, it involves streaming the 3-D camera video feed from the rover to BGE over the network and processing it into an Augmented Reality experience through a head-mounted Virtual Reality device.
  • Italian Mars Society: Enhancement of Kinect integration in V-ERAS The available virtual reality simulation of the ERAS Station allows users to interact with a simulated Martian environment using Aldebran Motivity, Oculus Rift and MS Kinect. However, the integration of the latter technology is still not complete, and this project aims to enhance it in order to: increase manageability of multiple Kinects; improve user navigation; reproduce users’ movements in real time; reduce data transfer latency by enhancing Tango integration; and support touchless gestures.
  • Italian Mars Society: Development of a Monitoring Front-End for ERAS station This project aims at developing a generic top-level monitoring/alarming interface for the ERAS Habitat module to manage all relevant information. It will also have a sub-GUI for the health monitor where different data from V-ERAS will be shown on a preferential basis and the user will be able to modify the GUI as needed. It will also aim to enable the GUI to properly interact with the Tango server and leverage the functionality of the existing Tango Alarm Systems.
  • Italian Mars Society: Utilization of the EUROPA Scheduler/Planner. The main aim of this project is to build an Astronaut’s Digital Assistant that takes into account all the constraints and rules that have been defined and produces a plan of action. It will also schedule all the tasks for the astronaut so that the astronaut's job becomes easier. (Read further in the proposal)
  • Jython: Add full gc support to JyNI (and support ctypes) JyNI is a compatibility layer with the goal of enabling Jython to use native CPython extensions like NumPy. It already supports a fair part of Python's C-API and is, for instance, capable of running basic Tkinter code. A main show-stopper for a production release is its lack of garbage collection. I propose a concept for how native gc can be emulated for JyNI in a way that is almost 100% compatible with CPython's native gc. With gc support given, I assume support for ctypes to be straightforward.
  • Kivy-Designer development, mobile integration and project maintainer When developing any kind of project, teams seek the most productive, easy and powerful tools. IDEs like Visual Studio and Qt spread rapidly as a consequence of the easy UI designers they provide. With Kivy, it won't be different. Currently, kivy-designer is still a simple tool, with some bugs and few features, but it's a really useful tool. My task is to improve kivy-designer with new features, integrate it with buildozer, fix some bugs and become the maintainer of the project.
  • Kivy: Matplotlib Integration and algorithms for Ink support and processing Visualization is the best way to understand information from documents. Traditional interaction techniques are typically used to do so, but new means of interaction are now available. Therefore, I propose to take advantage of Kivy's multi-touch support to develop an API that will handle the generation of, and interaction with, matplotlib widgets. In addition, I propose an additional class to handle strokes over a canvas. This will allow developers to instantiate graphs and interact directly with gestures over them.
  • MNE-python. Improve interactive data exploration and visualization. This GSoC project aims to improve interactive data exploration and visualization in MNE-python. The improvements would be made to plotting functions, including plotting raw data, epochs, TFR and evoked responses. Some of the plotting functions would have interactive features for users to select/deselect plots, events, etc. The visualization will rely mainly on matplotlib libraries.
  • MNE-Python: Signal space separation for denoising neural MEG data Signal space separation (SSS) is a powerful algorithm used to denoise neural signals acquired with magnetoencephalography (MEG). It uses Maxwell’s equations and spherical harmonics to separate measurements that originate from a space occupied by the MEG sensors (containing neural signals) from those originating outside this space (environmental noise). SSS has had a positive impact despite its purely proprietary implementation, so I plan to incorporate it into the open MNE-Python library.
  • MoinMoin: Improve existing themes MoinMoin is a wiki engine written in Python. MoinMoin 2.0 is currently in development and there's a lot to be improved in the UI/UX of the wiki. A wiki interface can sometimes be intimidating to new users. UI plays an important part in helping newcomers understand our interfaces. The aim is to make controls easier to understand for a user unfamiliar with the interface. The tweaks will provide a better UI and a better user experience for the wiki users.
  • MoinMoin: Improving the Issue Tracker MoinMoin 2.0 has an existing implementation of a simple issue tracker. The aim of this project is to add more features to the issue tracker and improve its UI/UX.
  • MyHDL: SDRAM Controller The SDRAM controller is intended to offload memory read/write operations from the CPU. In this project, an SDRAM controller will be implemented with MyHDL and Python. It would illustrate how the powerful features of Python can be used to write compact and human-readable hardware designs for practical situations.
  • MyHDL: Refactor conversion MyHDL relies on parsing the abstract syntax tree (AST) of various objects in order to generate HDL. The target HDLs (Verilog/VHDL) have various lexical and semantic differences which make conversion a tricky process. In its current state, a large part of MyHDL’s conversion logic is duplicated across the various ast.NodeVisitors. The primary goal of this project is to make MyHDL’s conversion modules more robust.
  • NetworkX: Implementing Add-on system My proposal aims at designing and implementing an add-on system for networkx. Until now, networkx has been a pure Python library. But there exist similar software packages written in compiled languages, and they achieve greater performance than their equivalents in networkx. Along with the implementation of two non-trivial software packages, my proposal also includes moving the graph drawing package out of the networkx core package and implementing it as an add-on.
  • NetworkX: NetworkX 2.0 and API The first part of this project deals with cleaning up the existing graph API. The functions returning both lists and iterators will be changed to return iterators and thus the _iter suffix will be retired. This will be followed by cleaning up and reorganising the existing codebase as required and decided, and getting it ready for the 2.0 release. The next part will be to design and implement a new unified graph API.
  • pgmpy: Parsing from and writing to standard PGM file formats pgmpy is a Python library for the creation, manipulation and implementation of probabilistic graphical models. There are various standard file formats for representing PGM data. pgmpy needs functionality to read networks from and write networks to these standard file formats. The project aims to improve the existing implementation of the file formats and implement a UAI file format during the course of GSoC.
  • pgmpy: Adding feature for working with state names Right now pgmpy doesn’t have a feature to add state names for random variables. Therefore users have to work with some randomly assigned state indexes for the states of the random variables. In the case of huge networks it gets really difficult to keep track of the indexes of the states. So, this project would focus on implementing a framework that would enable users to work entirely with state names.
  • pgmpy: Implementation of Approximate Algorithms Exact probabilistic inference is generally intractable in complex models and often yields a hard optimization problem. Here we will try to implement two of the most famous approximate algorithms, namely the Linear Programming Relaxation method and the Cutting Plane method, for inference in probabilistic graphical models.
  • pgmpy: Implementation of sampling algorithms for approx. inference in PGMs Currently pgmpy supports algorithms like Variable Elimination and Belief Propagation for exact inference. In large graphs, exact inference becomes computationally intractable. Thus, there's a need for approximate algorithms which answer the inference query with a lower time complexity. In this project I will implement the two most popular sampling algorithms for inference which fall under the general class of Markov Chain Monte Carlo methods: Gibbs sampling and the Metropolis–Hastings algorithm.
  • pgmpy: Implementing Dynamic Bayesian Networks in pgmpy One of the developing areas of artificial intelligence is building software that has the capacity to draw conclusions based on external data. An interesting way to draw conclusions is based on probabilistic dependencies embedded in the data set, which are modelled via a graph called a Bayesian Network (BN). A Dynamic Bayesian Network (DBN) is a BN that couples time measurement with uncertainty. This project plans to implement the DBN, along with inference properties.
  • Plone Foundation | Improved Text Transformation The proposal is divided into two parts: 1. Create a safe_html transform for HTML filtering. 2. Improve TinyMCE integration for filtering.
  • Pwntools : Sigret Exploitation Assistance and Porting shellcode to Pwntools As part of the project, I will be adding in support to Pwntools for generating valid SROP frames. I will also be adding in more shellcode to the Pwntools collection. Finally, I will also work on adding in more exploit samples to the pwntools writeup repository.
  • PyDy - Interactive Generation of System Create an interface for PyDy that will allow specifications of "Body and Joints" and "Points and Velocities" simultaneously. Users will be able to use whichever way is more natural, and the interface will transform between the two. It will also cover calculating inertial forces from rigid bodies, determining the effects of forces on the coordinates, adding constraints, etc. It will work a layer above PyDy and generate SymPy Mechanics code on the fly, and thus the equations of motion. GUI tools will follow.
  • PyPy: Exploit superword parallelism on Array and NumPy traces The goal of this project is to enhance the trace optimizer of PyPy. By definition NumPy arrays are unboxed, homogeneously typed and contiguous in memory (despite certain exceptions). The same is true for the array module in the standard library. The new optimizer will use this opportunity and exploit a SIMD instruction set (such as SSE or AVX) present on modern CISC processors (e.g. x86). This should lead to enhanced execution speed for arithmetic-intensive applications that run on top of PyPy.
  • PyPy: Work on Python 3.x support PyPy is almost feature-compatible with Python 2.7. Where it's lacking is Python 3.x support. PyPy cannot provide compatibility with Python versions newer than 3.2. This is one of the reasons why PyPy's Python 3.x support hasn't seen much adoption among users. Missing adoption also means that there's little interest in contributing to PyPy's 3.x support and it's getting worse because Python 3.x is constantly evolving. A concentrated effort is needed to break this cycle.
  • Qtile: Better layout serialization Qtile currently does not serialize layout state across restarts and does not maintain order and focus between layout changes. The project aims to rectify this by modifying the layout class to store additional information to maintain order and focus between layout changes. The ability to save and restore layout state will also be added to the layout class.
  • Revamping astropython.org for Astropy The main components of astropython are forums, tutorials, a wiki of useful resources and storage of code snippets. It is currently slow, outdated and difficult to maintain. I wish to outline a comprehensive plan to revamp the website from its roots up, making the application fast, easy to maintain and scalable across all devices. In addition to redesigning and revamping each of its original components, I propose certain new features that would be of tremendous productive value to the community.
  • scikit-image: Implementing a patent-free Face Detection algorithm. The Viola-Jones face detection algorithm is one of the most successful algorithms in its field, but it was patented and can’t be safely used. This is why I propose to implement an algorithm according to a popular paper that has slightly better performance and accuracy than the original Viola-Jones algorithm. It is similar to the Viola-Jones algorithm but avoids the patent by using different features.
  • Scikit-Image: Rewriting scipy.ndimage in Cython Scikit-Image depends on scipy.ndimage for various operations (filters for example). scipy.ndimage has several advantages: it works for ndarrays and the performance is really good; however, it is written using the Python C API, which makes the code hard to read and hence to maintain. Rewriting the module in Cython would improve its maintainability. A possible challenge is to keep the good performance of the current implementation.
  • scikit-learn: Enhance cross validation and related modules. 1. Make CV iterators data independent and provide a clean API with deprecations. 2. Group together/clean up/organize the model_selection module. 3. Multiple metric support to grid_search et al. 4. Extend sample_weight support to grid_search et al. 5. Generalized CV and early stopping 6. Introducing additional CV strategies for non trivial cv tasks 7. Make an extensive tutorial on CV. 8. Improve contributor documentation / Better docstrings.
  • scikit-learn: Improve GMM module The current implementations of the Gaussian mixture model (GMM), variational Bayesian Gaussian mixture model (VBGMM) and Dirichlet process Gaussian mixture model (DPGMM) in scikit-learn have many problems. They suffer from interface incompatibility, incorrect model training and incomplete testing. In this project, I propose to clean up the APIs of GMM, re-implement the updating functions of VBGMM and DPGMM based on well-acknowledged literature, and pass various test cases.
  • SciPy: scipy.stats improvements scipy.stats is one of the largest and most heavily used modules in Scipy. With the upcoming release of “Scipy 1.0” it must be ensured that the quality of this module is up to par and while the efforts to improve it have been ongoing, there are still some milestones to be reached in order to accomplish the goal. These milestones include a number of enhancements and addressing several maintenance issues.
  • Simplified Scrapy Add-ons Implementing a simplified interface to managing Scrapy extensions will give users a more “plug and play” experience. They will be able to enable and configure extensions in a simple and intuitive manner at a single entry point, while developers will gain easier control over configuration and dependency management.
  • statsmodels: GAM toolbox for statsmodels The aim of this project is to provide the Python statsmodels library with a GAM toolbox similar to the one available for R in the MGCV library. In addition, GAM-related methods from recent research results will be included in the library. The work will be mainly focused on statistical inference and will make statsmodels and Python an open source and more powerful alternative for practitioners who often rely on software like SAS or Stata.
  • Statsmodels: Time Series Models using State Space Methods This project would extend time series analysis in Statsmodels by making use of the new Statespace module, which currently allows creation and estimation of arbitrary models by end-users but offers only SARIMAX as a built-in class. It would add: Unobserved components, Vector autoregression, Dynamic factors, and either the Fractionally integrated autoregressive moving average or Minnesota / Litterman priors for VARs. It will improve standard errors and add diagnostic statistics and linear constraints.
  • Sunpy - Support for analysis of Solar Energetic Particles This project implements a common class SEPLightCurve and its three subclasses (ERNE/ACE/STEREO-LightCurve) for the three types of instruments that provide SEP data. It reads in SEP data as a lightcurve object and implements basic analytical, visualization and comparison methods, such as visualisation as time series, as energy spectrum, and of intensity ratios of different particle species, as well as comparing the SEP observations with other light-curve type data. It adds capability for solar event recognition.
  • SunPy – Full support for IRIS, 4D Cubes implemented using Ginga as the GUI The satellite 'IRIS' and other telescopes use a 4D cube data structure to save data. Currently, SunPy does not have support for this structure which this project aims to remedy. Proposed plan - 1) Tweak and merge the Cube module. 2) Create a GUI plugin to view and perform basic operations on 3D/4D data for Ginga using CRISPEX as a reference. The primary goal will be to merge the Cube module which will bring the support for IRIS, SST, EIS and every other 3D+ data format to SunPy.
  • SymPy - Implementing polynomial module in CSymPy CSymPy currently lacks a polynomial module, which is essential to achieving CSymPy's goal of being the fastest CAS ever. Having a polynomial module is a core concern, and implementing a fast module also helps in achieving a fast series module and other modules. Once implemented, CSymPy will be more capable as a fast optional SymPy core, which I think is good to ship before 1.0, while at the same time being a powerful CAS on its own.
  • Sympy - Improving the series package and limits Sympy currently lacks a proper structure for handling and manipulating series. All the series related functionality is defined as methods in Expr. I plan to give the series package a concrete structure for future development and improvement. I plan to do the following over the summer: 1. Sequence classes for defining sequences of coefficients. 2. Classes to represent series in general. 3. Implement Formal Power Series, using the above implemented structure. 4. Computing limits of sequences.
  • SymPy: Make Sage use CSymPy as a symbolic engine CSymPy is a fast symbolic manipulation library written in C++. Sage uses Pynac as its main symbolic engine. CSymPy is much faster than what is already there in Sage. Several benchmarks from http://wiki.sagemath.org/symbench run on CSymPy suggest that CSymPy is at least 6 times faster. A big long-term goal would be to have CSymPy become the default symbolic manipulation engine of Sage, and this GSoC project proposes to have it as an alternative symbolic manipulation engine for Sage.
  • SymPy Improving Solvers : Extending Solveset The project aims at improving the current equation solvers in SymPy by: 1) making the code more robust, modular, and approachable for developers; 2) improving the mathematical robustness of the new solvers module; 3) implementing complex sets, representing infinite solutions in the Argand plane; 4) implementing solvers for linear systems of equations, transcendental solvers, etc.; 5) implementing differential calculus methods.
  • SymPy: Fast Sparse Series Expansion My proposal is to implement a fast, class-based, sparse series expansion infrastructure in Sympy and CSymPy, along with a fast Polynomial module for CSymPy, that can handle series related operations. The proposed series and Polynomials module for CSymPy will be among the fastest and comparable in speed with Mathematica, Maple and Ginac. The new Series class in Sympy will significantly improve the performance of series expansion.
  • SymPy: Improving assumptions in SymPy This project aims to remove the old assumptions system, replacing it with the new one. I will pick up the work that has already been done by Aaron on the assumptions. Here's a brief overview of both assumption systems: Old Assumptions: Here attributes are bound to variables. The expression and the attributes are tied into the same object. New Assumptions: Here the variables and attributes are maintained separately. The separation of facts from expressions enables rich logical inference.
  • Theano: Allow re-generating compiled function and improving OpFromGraph This proposal will improve Theano in the following aspects. Firstly, we will provide users with an interface to re-generate and slightly modify a compiled function, saving the time needed to recompile an almost identical function. Secondly, by improving OpFromGraph, Theano's optimizer will be able to reuse an optimized FunctionGraph, which will save time.
  • Theano: Interactive visualization of Computational Graphs Theano is a popular library for defining and differentiating mathematical expressions. Currently, Theano only supports static text and image output for debugging, which makes it hard to analyze complex models. Here, I propose to extend Theano with a module to interactively visualize graphs, which will assist debugging by dynamically a) arranging, collapsing, and editing nodes, b) panning and zooming to different regions, and c) highlighting additional information on mouseover events.
  • Theano: Lower theano function call overhead When using Theano, a function is converted to C code and then compiled by a C compiler in order to improve performance. But calling the compiled functions incurs overhead, which is slow, especially when the code works on a small graph. I’ll try to solve this problem and lower the overhead.
  • Tissue classification to improve tractography. Diffusion Magnetic Resonance Imaging (dMRI) is used to create visual representations of the connections of the brain also known as tractography. Research has shown that by using a tissue classifier more accurate representations of the underlying connections are calculated. The goal of this project is to generate tissue classifiers using dMRI or a different MRI modality e.g. T1-weighted MRI (T1). I will implement popular segmentation algorithms using T1 as well as new ones using dMRI data.
  • Visualizing static and dynamic networks in Vispy Vispy is a high performance 2D/3D visualization library which uses the GPU intensively through OpenGL. This Google Summer of Code project proposes to add an API to Vispy to visualize static and dynamic (boolean, linear and Bayesian) networks. Additionally, I would like to add support for calculating projections onto a given 2D or 3D subspace on the GPU, which helps when you want to visualize higher dimensional data.
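
As an illustration of the last abstract's idea of projecting higher dimensional data onto a 2D or 3D subspace, here is a minimal CPU-side sketch in plain Python. It is not Vispy's API and does not run on the GPU; the function names `dot`, `orthonormalize` and `project` are hypothetical helpers chosen for this example. It assumes the subspace is given by linearly independent spanning vectors, which are orthonormalized with Gram-Schmidt before each point is reduced to its coordinates along the resulting axes.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(basis):
    """Turn linearly independent vectors into an orthonormal set
    (modified Gram-Schmidt). Hypothetical helper for this sketch."""
    ortho = []
    for vec in basis:
        w = [float(x) for x in vec]
        # Remove the components along the axes found so far.
        for q in ortho:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("basis vectors are linearly dependent")
        ortho.append([wi / norm for wi in w])
    return ortho


def project(points, basis):
    """Return each point's coordinates in the subspace spanned by `basis`.

    With two basis vectors the result is 2D, with three it is 3D.
    """
    axes = orthonormalize(basis)
    return [[dot(p, axis) for axis in axes] for p in points]


if __name__ == "__main__":
    # Four-dimensional points projected onto the plane of the first two axes.
    data = [[1, 2, 3, 4], [0, -1, 5, 2]]
    plane = [[1, 0, 0, 0], [0, 1, 0, 0]]
    print(project(data, plane))
```

The output coordinates can be handed to any 2D or 3D plotting routine. A GPU version would presumably perform the same dot products per vertex in a shader, but that is an assumption and not something the abstract specifies.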