Google Summer of Code 2014

CCExtractor development

License: GNU General Public License version 2.0 (GPLv2)

If you've ever downloaded a .srt file to get subtitles for anything, most likely CCExtractor was used to produce it. To be clear, while CCExtractor's job is to produce these transcript files from video streams, this year we need students interested in a number of more generic things, such as multithreading and networking. An interest in subtitles in particular, or in video encoding, is a plus but not strictly required. The goal of our GSoC application is to make the best possible closed captioning tools available to everyone. Closed captioning used to be a niche area, but now that everyone can create and distribute content, subtitles are more important than ever, yet most tools are proprietary.


  • CCExtractor bigger, better and developer friendly: Making CCExtractor usable by other developers is very important. Why reinvent the wheel if someone else has already built a nice library or program for it? The main goal of this proposal is to improve CCExtractor in three ways: making it easy to verify that it works correctly (through the Testing program), making it embeddable in other programs (refactoring plus “librarizing”), and filling a gap in the supported standards (DVB subtitles).
  • Integration with FFmpeg and support of DVB subtitles: FFmpeg integration: CCExtractor has built-in parsers (demuxers and decoders) that separate out the subtitle streams, but it currently has only a few of them. This project idea aims to integrate FFmpeg with CCExtractor so that subtitle streams from all FFmpeg-supported file formats are available to CCExtractor for processing. Support for DVB: CCExtractor does not currently support DVB subtitles; after adding DVB support, it would be able to extract subtitles in DVB format as well.
  • Proposal: The proposal describes how I'd approach each idea.