Hadoop Indexing and Concept-Space Disambiguation Models for DBpedia Spotlight
Chris Hokamp
Abstract
My project proposal is divided into two sections: (1) creating a Hadoop indexing system for DBpedia Spotlight and (2) implementing three novel approaches to disambiguation: Latent Semantic Analysis (LSA), Explicit Semantic Analysis (ESA), and Salient Semantic Analysis (SSA). These concept-space disambiguation modules will be used to rank the possible URIs for spotted entities based on context.
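As a rough illustration of ranking candidate URIs by context in a concept space, the following is a minimal ESA-style sketch, not the project's actual code. The concept articles, candidate descriptions, and all function names are hypothetical toy assumptions. Each term is mapped to a weighted vector over concepts, a text is represented as the sum of its terms' concept vectors, and candidates are ordered by cosine similarity to the context.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy concept space (stand-ins for Wikipedia articles).
CONCEPT_ARTICLES = {
    "Programming": "code compiler class object software language",
    "Indonesia": "island jakarta volcano indonesia population",
    "Coffee": "coffee bean arabica drink plantation",
}

# Hypothetical description text for each candidate URI of the surface form "Java".
CANDIDATE_TEXTS = {
    "Java_(programming_language)": "object oriented language with a compiler and class files",
    "Java": "island of indonesia with a volcano near jakarta",
    "Java_coffee": "coffee bean grown on a plantation",
}


def tokenize(text):
    return text.lower().split()


def build_term_index(articles):
    """Map each term to {concept: tf-idf weight} over the concept articles."""
    n = len(articles)
    counts = {concept: Counter(tokenize(text)) for concept, text in articles.items()}
    doc_freq = Counter(term for tf in counts.values() for term in tf)
    index = defaultdict(dict)
    for concept, tf in counts.items():
        for term, freq in tf.items():
            index[term][concept] = freq * (math.log(n / doc_freq[term]) + 1.0)
    return index


def esa_vector(text, index):
    """Represent a text as the sum of the concept vectors of its terms."""
    vec = defaultdict(float)
    for term in tokenize(text):
        for concept, weight in index.get(term, {}).items():
            vec[concept] += weight
    return dict(vec)


def cosine(a, b):
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(context, candidate_texts, index):
    """Return (uri, score) pairs sorted by similarity to the context, best first."""
    context_vec = esa_vector(context, index)
    scored = [
        (uri, cosine(context_vec, esa_vector(text, index)))
        for uri, text in candidate_texts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    term_index = build_term_index(CONCEPT_ARTICLES)
    context = "the compiler rejected the class because of a language error"
    for uri, score in rank_candidates(context, CANDIDATE_TEXTS, term_index):
        print(uri, round(score, 3))
```

With a realistic concept space the vectors would have many more dimensions and far less clear-cut scores; the toy data is chosen only to make the ranking step visible.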
Additional Information
This project implemented an indexing system for DBpedia Spotlight using Apache Pig, along with two approaches to disambiguation: Latent Semantic Analysis (LSA) and Explicit Semantic Analysis (ESA).
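The summary does not describe what the index contains, so the following is only a sketch under the assumption that indexing means aggregating token counts per URI from the text surrounding each entity occurrence. It mimics in plain Python the tokenize, group, and count steps that a Pig script would typically express; the records and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (uri, context) occurrence records; illustrative data only.
OCCURRENCES = [
    ("Java", "the island of java lies in indonesia"),
    ("Java_(programming_language)", "java code runs on the jvm"),
    ("Java", "java is the most populous island"),
]


def map_phase(records):
    """Emit ((uri, token), 1) pairs, as a mapper would."""
    for uri, context in records:
        for token in context.lower().split():
            yield (uri, token), 1


def reduce_phase(pairs):
    """Sum the counts for each (uri, token) key and group them by URI."""
    index = defaultdict(Counter)
    for (uri, token), count in pairs:
        index[uri][token] += count
    return index


index = reduce_phase(map_phase(OCCURRENCES))
```

On a cluster the grouping would be distributed across reducers; here it happens in memory only to show the shape of the data flow.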
Code samples
| File name | Size | Date submitted |
|---|---|---|
| Chris_Hokamp.tar.gz | 28.9 KB | September 13 2012 23:01 UTC |
