GSoC/GCI Archive
Google Summer of Code 2014

Twitter

License: Apache License, 2.0

Web Page: https://github.com/twitter/twitter.github.com/wiki/Google-Summer-of-Code-2014

Mailing List: https://groups.google.com/forum/#!forum/twitter-gsoc

The Twitter Open Source office is responsible for maintaining a healthy relationship with the open source development community. We open source a ton of projects and participate in a variety of communities: http://twitter.github.io/

Projects

  • [Netty] Pluggable Event Loop Algorithm and Channel Migration: In this proposal I discuss an approach that lets developers implement custom algorithms for choosing which event loop handles an incoming connection, and for migrating existing connections to another event loop when one becomes too busy.
  • Android Support For Pants: To add Android support, we need to determine which new Targets must be added to Pants and construct new Tasks to support those targets. In this proposal I briefly summarize the Android build lifecycle to show the steps we are going to automate. From there I list and briefly explain the tasks and targets, indicating dependencies where necessary.
  • Implementing SMTP client for Finagle: Finagle currently supports many protocols in a fully asynchronous way. However, it still has no asynchronous email support, which forces users onto blocking Java libraries such as JavaMail and reduces efficiency. My proposal is to implement an asynchronous SMTP client that would allow Finagle users to build applications that send email without blocking, which would benefit Finagle as a whole.
  • Pure Zookeeper client with Finagle: Finagle is an RPC system written in Java and Scala that focuses on high concurrency and asynchronous communication and is designed for distributed systems. Zookeeper is a distributed coordination system that exposes services such as naming, service discovery and synchronization. It can be used to define specific mechanisms, for example leader election, barrier or queue algorithms. Many important projects are built on top of it, such as Hadoop and Kafka. Full abstract in the Proposal.
  • Use zero-copy read path in new Hadoop APIs: The idea is to use the new Hadoop API to avoid unnecessary byte copies in the read path of Parquet. This will bring a performance gain in the client when scanning a file. To work with older versions of Hadoop, an abstraction layer will be implemented.
  • Various compression codecs for Netty: Netty is an asynchronous event-driven network application framework for rapid development of high-performance protocol servers and clients. Compression codecs will reduce traffic and make it possible to build applications that transfer large amounts of data over the network more efficiently and faster. Offering a variety of compression codecs will let users make the optimal choice for a specific problem. Documentation and examples will help users apply any of the compression codecs in their projects.
  • Wikipedia pages analysis using Cassovary: An analysis of various measures of the Wikipedia page graph, including node centralities, clustering coefficients, triangle counts and node similarities. Development of the Cassovary library to support this functionality. Use of the Wikipedia graph and Cassovary for entity extraction from text.