sameer-pradhan/semeval-2015-task-14-scoring

#############################################################################
#
# These are scripts for the SemEval 2015 evaluation metrics.
#
# Authors: Noemie  Elhadad, Sharon R. L Gorman 
#          Columbia University
#          Sameer Pradhan, Guergana Savova
#          Harvard University
#
##############################################################################

scripts:

# semeval-2015-task-14-task-1-eval.pl

  Script that evaluates the F-score for the Disorder Identification Task.
  A single execution of the script calculates two scores: 1) strict F-score
  and 2) relaxed F-score.
  
  * Input Parameters:

    -input (prediction directory)
    -gold (gold standard directory)
    -n (specify name of team)
    -r (specify 1, 2 or 3 for which run)
    -trace (optional: 1 turns trace on; 0 or omit option 
                      to turn trace off )
    
  * Output 

    file reporting evaluation metrics for run
    
    Format of output file name: 
      team_task1_run1.out 
      team_task1_run2.out
      
  
    If the -trace 1 option is given, a trace file is created with the
    same name as the output file but with a .trace file extension.
    Example: team_task1_run1.trace
      
      

  * Example usages:

    to specify run 1:
      ./semeval-2015-task-14-task-1-eval.pl  -n team -r 1  -input team_dir -gold gold_dir
    
    to specify run 2:
      ./semeval-2015-task-14-task-1-eval.pl  -n team -r 2  -input team_dir -gold gold_dir
    
    to specify run 3:
      ./semeval-2015-task-14-task-1-eval.pl  -n team -r 3  -input team_dir -gold gold_dir
  
    to execute run 1 with trace on:
      ./semeval-2015-task-14-task-1-eval.pl  -n team -r 1  -input team_dir -gold gold_dir -trace 1
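
The strict and relaxed F-scores reported by this script can be illustrated
with a small Python sketch. This README does not define the matching
criteria, so the sketch ASSUMES that a strict match requires identical span
text and that a relaxed match requires only overlapping offsets. The names
parse_spans, spans_overlap and f_score are hypothetical, CUI comparison is
left out, and none of this is the script's actual code.

```python
def parse_spans(text):
    """Turn '1006-1010,1027-1028' into [(1006, 1010), (1027, 1028)]."""
    return [tuple(int(n) for n in part.split("-")) for part in text.split(",")]


def spans_overlap(a, b):
    """True if any segment of a shares an offset with any segment of b.
    ASSUMPTION: end offsets are treated as inclusive."""
    return any(s1 <= e2 and s2 <= e1 for s1, e1 in a for s2, e2 in b)


def f_score(gold, pred, relaxed=False):
    """gold and pred are lists of (doc_name, span_text) pairs."""
    def matches(p, g):
        if p[0] != g[0]:          # document names must be identical
            return False
        if relaxed:               # ASSUMPTION: relaxed = overlapping spans
            return spans_overlap(parse_spans(p[1]), parse_spans(g[1]))
        return p[1] == g[1]       # ASSUMPTION: strict = identical span text

    matched_pred = sum(any(matches(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(matches(p, g) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example data: the second prediction overlaps gold but is not identical.
gold = [("doc.txt", "10-20"), ("doc.txt", "30-40")]
pred = [("doc.txt", "10-20"), ("doc.txt", "35-45")]
strict = f_score(gold, pred)
relaxed = f_score(gold, pred, relaxed=True)
```

With this made-up data, only one of the two predictions counts under the
strict criterion, while both count under the relaxed one.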
  

  
  
# semeval-2015-task-14-task-2-eval.pl

  Script that evaluates the slot filling task for the two subtasks.
  The computed metrics are per-slot overall accuracy, overall weighted
  accuracy, and overall unweighted accuracy.
  
  * Input parameters
    -input (prediction directory)
    -gold (gold standard directory)
    -n (specify name of team)
    -r (specify 1, 2 or 3 for which run)
    -t (specify A or B for task)
    -trace (optional: 1 turns trace on; 0 or omit 
            option to turn trace off)
    
  * Output
    File reporting evaluation metrics for the run and task.
    The output includes F*Accuracy, F*Wt_Accuracy, and
    slot Weighted Accuracy.
    
    Format of output file name: 
      team_task2A_run1.out   (for: task A run 1)
      team_task2A_run2.out   (for: task A run 2)
      team_task2A_run3.out   (for: task A run 3)
      team_task2B_run1.out   (for: task B run 1)
      team_task2B_run2.out   (for: task B run 2)
      team_task2B_run3.out   (for: task B run 3)
    
    If the -trace 1 option is given, a trace file is created with the
    same name as the output file but with a .trace file extension.
    Example: team_task2A_run1.trace
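
The output and trace file names listed above follow a regular pattern,
which can be sketched in Python as follows. The function name output_names
is hypothetical. The pattern is inferred only from the example names in
this README, and it is an ASSUMPTION that the scripts build the names in
exactly this way.

```python
def output_names(team, run, subtask=None):
    """Return (output_file, trace_file) for a run.

    subtask is None for task 1, or "A" / "B" for task 2.
    """
    task = "task1" if subtask is None else f"task2{subtask}"
    stem = f"{team}_{task}_run{run}"
    return stem + ".out", stem + ".trace"


task1_names = output_names("team", 1)
task2_names = output_names("team", 2, subtask="B")
```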
      
  * Example usage:

    to specify run 1 task A:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 1 -t A  -input team_dir -gold gold_dir
      
    to specify run 1 task B:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 1 -t B  -input team_dir -gold gold_dir
      
    to specify run 2 task A:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 2 -t A  -input team_dir -gold gold_dir
      
    to specify run 2 task B:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 2 -t B  -input team_dir -gold gold_dir
    
    to specify run 3 task A:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 3 -t A  -input team_dir -gold gold_dir
      
    to specify run 3 task B:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 3 -t B  -input team_dir -gold gold_dir
      
    
    
    to execute run 1 task A with trace on:
      ./semeval-2015-task-14-task-2-eval.pl -n team -r 1 -t A  -input team_dir -gold gold_dir -trace 1
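
When all runs and both tasks have to be scored, the command lines above can
be generated instead of typed by hand. The following is a minimal Python
sketch. The helper build_command is hypothetical and uses only the options
documented in this README. The directory names team_dir and gold_dir are the
same placeholders used in the examples.

```python
def build_command(script, team, run, input_dir, gold_dir, task=None, trace=False):
    """Assemble the argument list for one scorer invocation."""
    cmd = [script, "-n", team, "-r", str(run)]
    if task is not None:          # -t applies only to the task 2 script
        cmd += ["-t", task]
    cmd += ["-input", input_dir, "-gold", gold_dir]
    if trace:
        cmd += ["-trace", "1"]
    return cmd


TASK2_SCRIPT = "./semeval-2015-task-14-task-2-eval.pl"

# One command per (run, task) pair: 3 runs x 2 tasks.
commands = [
    build_command(TASK2_SCRIPT, "team", run, "team_dir", "gold_dir", task=task)
    for run in (1, 2, 3)
    for task in ("A", "B")
]
```

Each list in commands could then be passed to subprocess.run(cmd, check=True)
from the Python standard library to execute the scorer.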
    
  
  
  

#############################################################################
!!!!! Important information regarding input file format and naming !!!!!
#############################################################################
1) The file names of both gold standard and prediction input files must contain '.pipe'.
  
   Example:
   correct:   00235-001028-DISCHARGE_SUMMARY.pipe 
              00235-001028-DISCHARGE_SUMMARY.pipe.txt
              
   incorrect: 00235-001028-DISCHARGE_SUMMARY.txt
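
The naming rule can be checked before running the scorer. This is a minimal
Python sketch with a hypothetical helper is_pipe_name. It ASSUMES that the
rule is a plain substring test on the file name, which is consistent with
the examples above but is not confirmed against the scripts.

```python
def is_pipe_name(filename):
    """True if the file name contains '.pipe' anywhere."""
    return ".pipe" in filename


names = [
    "00235-001028-DISCHARGE_SUMMARY.pipe",
    "00235-001028-DISCHARGE_SUMMARY.pipe.txt",
    "00235-001028-DISCHARGE_SUMMARY.txt",
]
accepted = [name for name in names if is_pipe_name(name)]
```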
   
2) The document name of a prediction must be identical to the document name in
   the gold standard for a span match to be tested. This is obvious, but it is
   stated here for emphasis.
 
   Example: 
   correct:

   disorder 1 Prediction file, DocName field: 00235-001028-DISCHARGE_SUMMARY.txt
   disorder 1 gold standard file, DocName field: 00235-001028-DISCHARGE_SUMMARY.txt
   
   incorrect:

   disorder 1 Prediction file,     DocName field: 00235-001028.txt
   disorder 1 gold standard file,  DocName field: 00235-001028-DISCHARGE_SUMMARY.txt

3) Discontinuous disorder spans must be listed in increasing order to be considered
   an exact match to the gold standard.

   Example:
   correct: 1006-1010,1027-1028

   incorrect: 1027-1028,1006-1010
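
A system that produces span segments in arbitrary order can sort them before
writing its output. The following Python sketch uses a hypothetical helper
sort_spans and ASSUMES that each segment has the form start-end, as in the
example above.

```python
def sort_spans(span_text):
    """Reorder comma-separated 'start-end' segments by increasing offset."""
    segments = [
        tuple(int(n) for n in part.split("-"))
        for part in span_text.split(",")
    ]
    return ",".join(f"{start}-{end}" for start, end in sorted(segments))


fixed = sort_spans("1027-1028,1006-1010")
```

Segments are compared as integers, so offsets with different digit counts
are ordered numerically and not alphabetically.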
   
4) Script semeval-2015-task-14-task-1-eval.pl expects the following input format for both
    gold standard and prediction files:
    
    DocName|Diso_Spans|CUI|
  
    There is a single pipe field delimiter.
  
5) Script semeval-2015-task-14-task-2-eval.pl expects the following input format for both
    gold standard and prediction files:
    
    DocName|Diso_Spans|CUI|Neg_value|Neg_span|Subj_value|Subj_span|Uncertain_value|Uncertain_span|Course_value|Course_span|Severity_value|Severity_span|Cond_value|Cond_span|Generic_value|Generic_span|Bodyloc_value|Bodyloc_span|
 
    As shown above there is a single pipe field delimiter with an extra pipe after the last field.
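
The 19-field layout with its trailing pipe can be read with a few lines of
Python. In this sketch the field names are copied from the format line
above. The function parse_task2_line is hypothetical, and the sample record
contains made-up placeholder values.

```python
SLOTS = ("Neg", "Subj", "Uncertain", "Course", "Severity",
         "Cond", "Generic", "Bodyloc")

# DocName, Diso_Spans, CUI, then a value and a span column per slot: 19 fields.
TASK2_FIELDS = ["DocName", "Diso_Spans", "CUI"] + [
    f"{slot}_{kind}" for slot in SLOTS for kind in ("value", "span")
]


def parse_task2_line(line):
    """Split one pipe-delimited record into a dict keyed by field name."""
    parts = line.rstrip("\n").split("|")
    # The extra pipe after the last field yields one empty trailing element.
    if len(parts) != len(TASK2_FIELDS) + 1 or parts[-1] != "":
        raise ValueError(f"expected {len(TASK2_FIELDS)} fields and a trailing pipe")
    return dict(zip(TASK2_FIELDS, parts[:-1]))


# Placeholder record: every slot column is filled with the dummy value "x".
sample = "|".join(["doc.txt", "1006-1010,1027-1028", "CUI_PLACEHOLDER"] + ["x"] * 16) + "|"
record = parse_task2_line(sample)
```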
 
6) Reminder: both the -input and -gold parameters expect a directory, not a single file.


# validate-semeval-2015-task-14.py

This script validates that the packaged system outputs meet the
required directory structure and that the .pipe files meet the
criteria for the scorer. It performs the following checks:

For the directory structure:
---------------------------
1. There are exactly 100 .pipe files in the leaf directories (which is the 
   total number of files in the test set)
2. The TEAM-ID matches the one specified on command-line
3. The run IDs and versions match the required convention:
   - There can be at most 3 runs
   - There can be at most 99 versions per run.
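
The file-count check can be approximated as follows. This is a Python sketch
under stated assumptions, not the validator's own code: a leaf directory is
taken to be any directory without subdirectories, a .pipe file is any file
whose name contains '.pipe', and the helper names are hypothetical.

```python
import os


def leaf_pipe_counts(root):
    """Map each leaf directory under root to its number of .pipe files."""
    counts = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if not dirnames:  # ASSUMPTION: no subdirectories means leaf directory
            counts[dirpath] = sum(".pipe" in name for name in filenames)
    return counts


def wrong_sized_leaves(root, expected=100):
    """Return the leaf directories whose .pipe count differs from expected."""
    return {d: n for d, n in leaf_pipe_counts(root).items() if n != expected}
```

An empty result from wrong_sized_leaves means every leaf directory holds the
expected number of files.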

For the .pipe files:
-------------------
1. The filename inside the .pipe file is the same as the filename of the .pipe file
2. The character-offsets have the right signature and are in increasing order
3. The attributes have one of the legal, possible normalized values
4. There are a total of 19 columns
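
Two of these checks, the column count and the character offsets, can be
sketched in Python as shown here. The function check_line is hypothetical.
It is an ASSUMPTION that the "right signature" means comma-separated
start-end pairs of digits. The attribute-value check is omitted because this
README does not list the legal values.

```python
import re

# ASSUMPTION: a valid span field looks like "1006-1010" or "1006-1010,1027-1028".
SPAN_RE = re.compile(r"^\d+-\d+(,\d+-\d+)*$")


def check_line(line, expected_columns=19):
    """Return a list of problems found in one .pipe record (empty if none)."""
    problems = []
    parts = line.rstrip("\n").split("|")
    if parts and parts[-1] == "":
        parts = parts[:-1]  # drop the empty element after the trailing pipe
    if len(parts) != expected_columns:
        problems.append(f"expected {expected_columns} columns, found {len(parts)}")
    spans = parts[1] if len(parts) > 1 else ""
    if not SPAN_RE.match(spans):
        problems.append("malformed character offsets")
    else:
        starts = [int(seg.split("-")[0]) for seg in spans.split(",")]
        if starts != sorted(starts):
            problems.append("character offsets not in increasing order")
    return problems
```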
