The SciKnowMine Triage Application

We present here a user manual for running and maintaining a web-based system for performing document triage on a corpus of PDF files. We describe the processes for installing, running and maintaining the system.

  1. Installation Manual
  2. System Organization
  3. Command Line - Set up
  4. Command Line - Working with Data
  5. Command Line - Reporting Functions
  6. Command Line - Deleting Data
  7. Command Line - Machine Learning
  8. Command Line Tools - Running Experiments
  9. Web Application - Running the System
  10. Web Application - Extracting text using LAPDF-Text
  11. Web Application - Performing the triage task
  12. Web Application - The Base Digital Library

6. Command Line - Deleting Data

Three commands delete data from the system.

The deleteTargetCorpus command removes all traces of a given target corpus from the system.

deleteTargetCorpus -db DBNAME -l LOGIN -p PASSWD -targetCorpus TARGET

 -db DBNAME           : Database name
 -l LOGIN             : Database login
 -p PASSWD            : Database password
 -targetCorpus TARGET : Target Corpus Name

The deleteTriageCorpus command removes all traces of a given triage corpus from the system.

deleteTriageCorpus -db DBNAME -l LOGIN -p PASSWD -triageCorpus TRIAGE

 -db DBNAME           : Database name
 -l LOGIN             : Database login
 -p PASSWD            : Database password
 -triageCorpus TRIAGE : Triage Corpus Name
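
Because both corpus-deletion commands remove all traces of a corpus, it can help to print the exact command line and review it before running it. The sketch below does only that: it assembles the command from the command names and options documented above and prints it without executing anything. The helper name print_delete_cmd and the corpus names in the example calls are hypothetical, and DBNAME, LOGIN and PASSWD are placeholders to replace with your own values.

```shell
#!/bin/sh
# Sketch only: prints the delete command so it can be reviewed before it is run.
# print_delete_cmd is a hypothetical helper, not part of the system.

# Usage: print_delete_cmd KIND NAME DB LOGIN PASSWD
#   KIND is "target" or "triage"; command and option names follow the
#   option lists documented above.
print_delete_cmd() {
    kind="$1"; name="$2"; db="$3"; login="$4"; passwd="$5"
    case "$kind" in
        target) cmd="deleteTargetCorpus"; opt="-targetCorpus" ;;
        triage) cmd="deleteTriageCorpus"; opt="-triageCorpus" ;;
        *) echo "unknown corpus kind: $kind" >&2; return 1 ;;
    esac
    printf '%s -db %s -l %s -p %s %s %s\n' \
        "$cmd" "$db" "$login" "$passwd" "$opt" "$name"
}

# Placeholder values for illustration only.
print_delete_cmd target my_target_corpus DBNAME LOGIN PASSWD
print_delete_cmd triage my_triage_corpus DBNAME LOGIN PASSWD
```

Once the printed line looks right, it can be copied to the prompt and run as is.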

The deleteTriageScoresBasedOnCodefile command uses a code file (a list of file names formatted as pmid_A.pdf) to remove each listed paper's association with a given triage corpus.

deleteTriageScoresBasedOnCodefile -codeList CODES -db DBNAME -l LOGIN -p PASSWD -triageCorpus CORPUS

 -codeList CODES      : Encoded files
 -db DBNAME           : Database name
 -l LOGIN             : Database login
 -p PASSWD            : Database password
 -triageCorpus CORPUS : Triage Corpus name
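
As a rough illustration of preparing a code file, the sketch below turns a list of PMIDs into file names of the form pmid_A.pdf. It assumes that the code file holds one file name per line and that "_A" is a literal suffix; neither detail is confirmed here, so check an existing code file before relying on it. The helper name make_code_list, the PMIDs and the output file name codes.txt are hypothetical.

```shell
#!/bin/sh
# Sketch only: build a code file from a list of PMIDs.
# Assumes one pmid_A.pdf name per line; the real layout may differ.

# Usage: make_code_list PMID [PMID ...]
make_code_list() {
    for pmid in "$@"; do
        printf '%s_A.pdf\n' "$pmid"
    done
}

# Placeholder PMIDs for illustration only.
make_code_list 12345678 23456789 > codes.txt
```

The resulting file would then be passed as the -codeList argument.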