GSoC/GCI Archive
Google Code-in 2014 Wikimedia Foundation

Citoid: Add PMID to all citation objects with a DOI using PubMed API

completed by: ☃ unicodesnowman

mentors: Mvolz

Citoid is a Node.js application (written in JavaScript) that retrieves information about a webpage, book, journal article, etc., given a URL to the webpage or some other identifier, such as a DOI (digital object identifier).

It uses another open source project, Zotero's translation-server, also written in JavaScript, to do a lot of the work. Completing this task may involve reading both citoid and translation-server code. To get citoid working on your computer, you'll need to download both Node version 10.0 (for citoid) and xpcshell version 29.0 (for Zotero). Citoid is a very new project, so the code is rough around the edges and may change a lot, but that means there's lots of code to write! Installation instructions and more information are available at https://www.mediawiki.org/wiki/Citoid

See https://phabricator.wikimedia.org/T1088 for the corresponding bug report. Currently, the addPMID() method in the file lib/zotero.js extracts the PMID from the extra field of the citation object. However, the PMID appears in the extra field only when the metadata is extracted from a link directly from http://www.ncbi.nlm.nih.gov/ , not when a different URL to the paper is used, such as one on the publisher's website. If a citation object contains a DOI, you can use the DOI to look up the PMID with the PubMed API and add it to the citation: http://www.ncbi.nlm.nih.gov/pmc/tools/id-converter-api/
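
The DOI-to-PMID lookup described above could be sketched roughly as follows. This is a minimal illustration, not citoid's actual code. The endpoint URL, the response shape ({ records: [ { doi, pmid } ] }), the citation field names DOI and PMID, and the helper names buildLookupUrl, extractPMID and addPMIDFromDOI are all assumptions; check the API page linked above before relying on any of them. The HTTP call is passed in as a fetchJSON(url, callback) function so the logic can be tried without network access.

```javascript
'use strict';

// Assumed endpoint of the ID converter service (not taken from the task text).
var ID_CONVERTER_URL = 'http://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/';

// Build the lookup URL for a single DOI, escaping characters such as "/".
function buildLookupUrl(doi) {
  return ID_CONVERTER_URL + '?format=json&ids=' + encodeURIComponent(doi);
}

// Return the first PMID found in a parsed JSON response, or null if none.
// Assumes a response shaped like { records: [ { doi: '...', pmid: '...' } ] }.
function extractPMID(response) {
  var records = (response && response.records) || [];
  for (var i = 0; i < records.length; i++) {
    if (records[i] && records[i].pmid) {
      return String(records[i].pmid);
    }
  }
  return null;
}

// Hypothetical helper: add a PMID to a citation object that has a DOI.
// fetchJSON(url, cb) is supplied by the caller and must call cb(err, parsedBody).
function addPMIDFromDOI(citation, fetchJSON, callback) {
  // Nothing to do without a DOI, or if a PMID is already present.
  if (!citation.DOI || citation.PMID) {
    return callback(null, citation);
  }
  fetchJSON(buildLookupUrl(citation.DOI), function (err, response) {
    if (!err) {
      var pmid = extractPMID(response);
      if (pmid) {
        citation.PMID = pmid;
      }
    }
    // A failed lookup leaves the citation unchanged instead of failing the request.
    callback(null, citation);
  });
}
```

A failed or empty lookup is deliberately treated as non-fatal here, since a citation without a PMID is still usable; whether citoid should behave the same way is a design decision for the task.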

Students are required to read Wikimedia's general instructions at https://www.mediawiki.org/wiki/Google_Code-in_2014#Instructions_for_GCI_students first.