Google Summer of Code 2011 Apache Software Foundation

Implementation of Nested Cross for Pig Latin

by Zhijie Shen for Apache Software Foundation

I am a graduate student at the National University of Singapore and a fan of both open source and big data. This motivates me to apply to GSoC'11 with Apache Pig, an OLAP system based on Hadoop. Currently, Pig Latin does not support a nested "cross" statement inside "foreach", which would be a useful feature. One typical use case is flattening the records of the "cogroup" of two relations for each enumerated item in the nested block. This application details my plan to implement the nested cross.
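
To make the use case concrete, the following Python sketch emulates what a "cogroup" followed by a nested "cross" would be expected to produce. It is not Pig code and not the planned implementation; the function names cogroup and nested_cross and the sample relations users and visits are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import product


def cogroup(left, right, key_left, key_right):
    """Group two relations by key; each output row is (key, left_bag, right_bag)."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[key_left(row)][0].append(row)
    for row in right:
        groups[key_right(row)][1].append(row)
    # Sort by key only to make the output order deterministic for this example.
    return [(key, lbag, rbag) for key, (lbag, rbag) in sorted(groups.items())]


def nested_cross(cogrouped):
    """For each group, emit the cross product of its two bags as flat tuples."""
    return [
        (key, [l + r for l, r in product(lbag, rbag)])
        for key, lbag, rbag in cogrouped
    ]


# Hypothetical sample relations: (user_id, name) and (user_id, url).
users = [(1, "alice"), (2, "bob")]
visits = [(1, "a.com"), (1, "b.com"), (3, "c.com")]

grouped = cogroup(users, visits, lambda r: r[0], lambda r: r[0])
result = nested_cross(grouped)
```

In this sketch, a group whose bag is empty on either side yields an empty crossed bag, which is the usual behavior of a cross product; whether the final Pig implementation treats empty bags the same way is an assumption here.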