Citoid: Create dataset in JSON of different possibilities for user-entered IDs to use in testing
completed by: Anish V.
You don't need to get citoid running to do this work. However, you should use git and Gerrit to add your file to the /test_files directory in the citoid repository; see https://www.mediawiki.org/wiki/Citoid#Get_the_code
Users may try entering a URL, ISBN, ISSN, PMC, PMID, MID, DOI, and possibly other IDs not listed here, to identify a magazine, article, book, or webpage. We need an array of possibilities to make sure that a) we correctly identify which kind of ID it is and b) we correctly extract the ID.
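To make the two checks concrete, here is a minimal Python sketch of what "identify" and "extract" could mean for a few ID types. It is not citoid's actual logic: the patterns, the `PATTERNS` table and the `identify` function are hypothetical names chosen for illustration, and the PMC number used in the usage note is a placeholder.

```python
import re

# Hypothetical patterns for illustration only; citoid's real rules may differ.
# Order matters: the first pattern that matches wins.
PATTERNS = [
    # DOI: "10.", a 4-9 digit registrant code, "/", then a suffix.
    # Note that \S+ is greedy and would also swallow trailing punctuation.
    ("doi", re.compile(r"10\.\d{4,9}/\S+")),
    # PMC ID: "PMC" followed by digits, in any capitalisation.
    ("pmcid", re.compile(r"PMC\d+", re.IGNORECASE)),
    # ISSN: four digits, a hyphen, three digits and a check character.
    ("issn", re.compile(r"\b\d{4}-\d{3}[\dXx]\b")),
]


def identify(raw):
    """Return (id_type, extracted_id) for the first matching pattern, else None."""
    text = raw.strip()
    for id_type, pattern in PATTERNS:
        match = pattern.search(text)
        if match:
            return id_type, match.group(0)
    return None
```

With these assumptions, `identify("https://doi.org/10.1000/182")` returns `("doi", "10.1000/182")`, `identify("ISSN 0378-5955")` returns `("issn", "0378-5955")`, and `identify("hello world")` returns `None`. A test dataset lets each input be paired with the type and the extracted value it should produce.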
Create a JSON file containing examples of the following IDs (ISBN, ISSN, PMC, PMID, DOI) as users might enter them, e.g. with different capitalisation and spacing (true positives). You might try thinking about where a user might be copying them from (e.g. amazon.com or sciencedirect.com) or how they might be typing them in manually (e.g. from a book). Also make an array of identifiers that look like your chosen ID but are not valid (true negatives). You might want each ID in a separate file, or you might choose to put them all into one JSON object.
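As one possible shape for such a file, the Python sketch below builds a small ISBN dataset and prints it as JSON. The key names ("isbn", "positives", "negatives") and the helper `isbn_is_valid` are assumptions made for illustration, not a layout the task prescribes. The helper checks the standard ISBN-10 and ISBN-13 check digits so that each entry can be confirmed to sit in the right list.

```python
import json
import re


def isbn_is_valid(raw):
    """Return True if raw holds an ISBN-10 or ISBN-13 with a correct check digit."""
    text = raw.strip().upper()
    # Drop an optional "ISBN", "ISBN-10" or "ISBN-13" label and its separator.
    text = re.sub(r"^ISBN(-1[03])?[:\s]*", "", text)
    # Remove the hyphens and spaces users commonly type between groups.
    body = re.sub(r"[\s-]", "", text)
    if re.fullmatch(r"\d{9}[\dX]", body):
        # ISBN-10: weights 10 down to 1, total divisible by 11; X stands for 10.
        values = [10 if c == "X" else int(c) for c in body]
        return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0
    if re.fullmatch(r"\d{13}", body):
        # ISBN-13: weights alternate 1 and 3, total divisible by 10.
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(body)) % 10 == 0
    return False


# Hypothetical layout: one key per ID type, each with two arrays.
dataset = {
    "isbn": {
        "positives": [
            "978-3-16-148410-0",
            "9783161484100",
            "ISBN 978 3 16 148410 0",
            "isbn:0-306-40615-2",
            " 0306406152 ",
        ],
        "negatives": [
            "978-3-16-148410-1",  # wrong check digit
            "0-306-40615-3",      # wrong check digit
            "978316148410",       # only 12 digits
        ],
    }
}

print(json.dumps(dataset, indent=2))
```

The same structure could be repeated for ISSN, PMC, PMID and DOI, either under further keys in this object or in one file per ID type.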
Students are required to read Wikimedia's general instructions at https://www.mediawiki.org/wiki/Google_Code-in_2014#Instructions_for_GCI_students first.