Google Summer of Code 2015 Apache Software Foundation

Wider spectrum of data consumers/producers for Apache Samza

by R M for Apache Software Foundation

Apache Samza is a distributed stream-processing framework that can be deployed on top of Apache Hadoop YARN and uses Apache Kafka as its main messaging system. The motivation for this project is to give Samza the ability to consume data from and produce data to two very popular messaging systems, Apache ActiveMQ and Amazon Kinesis. Although the two systems are very different, Samza has a well-defined API, and its requirements are concretely reflected in the Samza-Kafka module, which will be used as the reference for this project.