data-lessons/OpenRefine-nhcdata-lesson

OpenRefine-nhcdata-lesson

OpenRefine lesson for Natural History collection data

Data set notes.

  • This data set is derived from iDigBio natural science collections specimen data. This data file was modified specifically for use with OpenRefine.
    • Some taxon names have errors introduced.
  • These modifications were made to illustrate some features of OpenRefine.
    • Errors were added to the taxon names (scientificName field) to demonstrate OpenRefine's ability to find data that was likely mis-entered.
    • These errors can be found by running OpenRefine's clustering algorithms on the scientificName column. Clustering surfaces discrepancies quickly and makes it simple to fix every issue it finds.
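
To give a rough feel for what clustering on the scientificName column does, here is a minimal Python sketch of a key-collision approach using a simplified "fingerprint" key. It is an approximation for illustration only, not OpenRefine's actual implementation. The function names and the sample taxon names are made up and do not come from the lesson data file.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Simplified fingerprint key (an approximation, not OpenRefine's exact code)."""
    # Fold accented characters to their closest ASCII form.
    ascii_text = (
        unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase and strip punctuation.
    cleaned = ascii_text.lower().translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, deduplicate and sort the tokens, then rejoin.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return {key: sorted(vs) for key, vs in groups.items() if len(vs) > 1}


# Hypothetical sample values, not taken from the lesson data file.
names = ["Quercus alba", "quercus alba ", "Quercus  alba.", "Acer rubrum", "Acer rubrun"]
clusters = cluster(names)
for key, variants in clusters.items():
    print(key, "->", variants)
```

In this sketch the three spellings of "Quercus alba" collapse into one cluster because they differ only in case, spacing, and punctuation. The misspelling "Acer rubrun" is not grouped with "Acer rubrum", since a changed letter produces a different key. Catching that kind of error requires a distance-based method, such as the nearest-neighbor options OpenRefine also offers.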

Options.

  • Anyone already familiar with OpenRefine can easily substitute a different data set, as desired.