Generating the Big Mechanisms Evaluation Corpus

Preliminary work for the program: downloading the full text of documents from the PMC lists.

  1. BigMech Wiki Instructions + Redundancy

    This link on the Big Mechanisms wiki provides a list of 1,741 PMC ID values. Because some articles are returned by more than one query, only 840 of these are unique. Here is a list of these unique PMC ID values.
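    The deduplication described above can be sketched as follows. This is a minimal illustration and not the script actually used to produce the list; the function name `unique_ids` and the sample ID values are hypothetical.

```python
def unique_ids(pmc_ids):
    """Return PMC IDs with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for raw in pmc_ids:
        pmc_id = raw.strip()
        # Skip blank entries and anything already recorded.
        if pmc_id and pmc_id not in seen:
            seen.add(pmc_id)
            unique.append(pmc_id)
    return unique


# Hypothetical sample: the same article returned by two different queries.
sample = ["PMC0000001", "PMC0000002", "PMC0000001", "PMC0000003"]
print(len(sample), "listed,", len(unique_ids(sample)), "unique")
```

    Keeping first-seen order means the unique list stays easy to compare with the original wiki list.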

  2. Preliminary Corpora

    Working from this list, we attempted to download these documents as a shared resource for the community. We now provide the resulting corpus, pending additional bug checking.

Given the latest lists of the open access XML from PMC, we were able to download 812 of these 840 documents.

  1. Corpus Organization

    • Corpus directories are organized as Journal/Year/Volume.
    • Space characters are replaced with _ in Journal and Volume names.
    • Each article’s files are named according to its PubMed ID (pmid):
    • [pmid].pdf - The PDF file
    • [pmid]_pmc.xml - The XML full text
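
The layout above can be sketched as a small path-building helper. This is only an illustration under stated assumptions: the function name `article_paths` and the sample journal, year, volume, and pmid values are hypothetical, and the real corpus may treat edge cases (for example, missing volume numbers) differently.

```python
from pathlib import PurePosixPath


def article_paths(journal, year, volume, pmid):
    """Build the relative PDF and XML paths for one article."""
    # Spaces in journal and volume names become underscores.
    directory = PurePosixPath(
        journal.replace(" ", "_"),
        str(year),
        str(volume).replace(" ", "_"),
    )
    return directory / f"{pmid}.pdf", directory / f"{pmid}_pmc.xml"


# Hypothetical article metadata, for illustration only.
pdf_path, xml_path = article_paths("J Biol Chem", 2010, "285", "12345678")
print(pdf_path)
print(xml_path)
```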

We host the corpus files on Amazon, so they may be downloaded only from links on this site.