GSoC/GCI Archive
Google Summer of Code 2013

Scaffold Hunter

Web Page: http://scaffoldhunter.sourceforge.net/wiki/doku.php?id=project_ideas

Mailing List: mailto:scaffoldhunter-users@lists.sourceforge.net

Scaffold Hunter is a tool for the visual analysis of data sets with a focus on data from the life sciences, aiming to provide intuitive access to large and complex data sets. The tool offers a variety of views, e.g. graph, dendrogram, and plot views, as well as analysis methods, e.g. for clustering and classification. Scaffold Hunter is meant to be a reusable open source platform for different application areas, and offers flexible plugin and data integration mechanisms to allow adaptation to new fields and data sets. Scaffold Hunter is used worldwide in research, both academic and commercial, and for teaching at the Technical University of Dortmund, Germany, and the University of Sydney, Australia. Example applications include drug discovery and medical image retrieval. The software is implemented in Java.

Projects

  • Lappenschaar: Ramachandran Plots, Treemaps, and Heatmaps
    Scaffold Hunter is a tool used in the life sciences for the analysis of (large) data sets. It currently offers the following views: the scaffold tree, which hierarchically classifies molecular data sets; a table view; a dendrogram; and a simple 2D or 3D plot. This proposal describes a plan to extend Scaffold Hunter with two more views. The first part of the proposal explains how this will be done; the second part covers the proposer's existing knowledge.
  • Visual Feature Selection/Dynamic Visualization
    Current state of the art: Scaffold Hunter (SH) was designed as a Visual Analytics tool for chemical space. Recently, a plugin was developed that leverages SH for the visual analysis of content-based medical image retrieval (VAMIR). Unlike traditional approaches, which rely on textual annotations and queries, content-based image retrieval (CBIR) exploits visual features. In the medical domain these may comprise segmented anatomical or pathological regions, their spatial relationships, volume, and texture, as well as wavelet and Fourier transforms. A typical CBIR framework presents the user with a result list sorted according to some similarity metric that accumulates all features used for image comparison. This gives no insight into which features contributed to the ranking, or how. In a novel approach, the VAMIR plugin allows SH to visualize the outcome of a CBIR algorithm in a way that enables the user to make judgements about a query’s similarity to other database entries with respect to selected features.
    Motivation for this project: The tool’s performance is promising, but VAMIR is still at a prototype stage. One current shortcoming is the static nature of the visualized feature set, which must be specified in a config file before execution. Also, no hint is given as to which feature set is likely to profit from a visual analysis (with respect to the Visual Analytics core mantra of “discovering the unexpected”). Implementing methods to dynamically select the set of features to be visualized, together with computerized reasoning to suggest a meaningful starting set of features, would be an essential step from VAMIR’s prototype stage towards a deployable application.
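
To illustrate the point about accumulated similarity metrics, here is a minimal Java sketch of how a single accumulated distance hides the per-feature contributions that a visual analysis would expose. It is not taken from Scaffold Hunter or VAMIR: the class name, method names, feature names, and the weighted absolute-difference metric are all assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not part of Scaffold Hunter or VAMIR.
 * Assumes each image is described by a fixed-length vector of
 * normalized feature values and compares two such vectors.
 */
public class FeatureContributionSketch {

    /** Returns the weighted absolute difference for each feature, keyed by feature name. */
    public static Map<String, Double> contributions(String[] names, double[] weights,
                                                    double[] query, double[] candidate) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            result.put(names[i], weights[i] * Math.abs(query[i] - candidate[i]));
        }
        return result;
    }

    /** Accumulates all per-feature contributions into the single value a ranked list would sort by. */
    public static double totalDistance(Map<String, Double> contributions) {
        double sum = 0.0;
        for (double c : contributions.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Illustrative feature names and values only (assumptions).
        String[] names = {"volume", "texture", "regionCount"};
        double[] weights = {1.0, 1.0, 1.0};
        double[] query = {0.50, 0.20, 0.75};
        double[] candidate = {0.25, 0.20, 0.25};

        Map<String, Double> parts = contributions(names, weights, query, candidate);

        // The ranked list only sees this one number ...
        System.out.println("total distance = " + totalDistance(parts));
        // ... while the breakdown shows which features drive it.
        parts.forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Keeping the per-feature map separate from the accumulated total is what would let a view display or hide individual features at run time instead of fixing the set in a config file beforehand.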