[Nhcoll-l] collector & determiner identities

Paul J. Morris mole at morris.net
Fri Feb 19 14:13:18 EST 2016


On Fri, 19 Feb 2016 16:05:19 +0000
"Macklin, James" <James.Macklin at AGR.GC.CA> wrote:
> Outside of the various flavours of collection management systems
> there really needs to be a standardized controlled vocabulary of
> these various "agents" which is maintained and some form of web
> services available. 

Multiple lists are also being developed in the library world.  Where
biologists have also been authors of works cataloged in libraries,
lists like VIAF may be able to provide identifiers for those people.

> In botany, we have had at least one publically
> accessible list of collectors and authors which has been maintained
> and curated for decades, the Harvard University Herbaria Index of
> Botanists: http://kiki.huh.harvard.edu/databases/botanist_index.html
> This list has many of the attributes you mention. The agent fields
> included are quite comprehensive and include reference to more than
> the "who" including "what, where, when" See the Carl Linnaeus example
> here...
> 
> http://kiki.huh.harvard.edu/databases/botanist_search.php?mode=details&id=92

Very similar functionality is also embedded in Symbiota.  Any Symbiota
instance is capable of maintaining a curated list of agents.  For an
example, see: 

http://symbiota4.acis.ufl.edu/scan/portal/agents/agent.php?uuid=dd3d4140-1b18-4956-9b0a-82d47bad402e

The underlying data structures start with an agent table, which can
take agents typed as individuals, teams, or organizations, each with
different properties.  The agent table is linked to a names table that
holds an arbitrary number of names of known type for each agent, to a
table of typed links to external resources about the same agent, to a
table of typed relationships (e.g. parent/child, teacher/student)
amongst agents, and to a table that can hold patterns for the
collector/field number series used by that agent.  For teams, there are
also links between the agent record for the team and the individual
records for each member agent.  Agents can, among other attributes,
have a biography, taxonomic specialties, and collections where they are
known to have deposited material.
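As a rough illustration of the structure described above, here is a
minimal sketch in Python.  The class and field names are illustrative
assumptions, not the actual Symbiota table or column names, and the
name and link type strings are made-up examples.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class AgentType(Enum):
    INDIVIDUAL = "individual"
    TEAM = "team"
    ORGANIZATION = "organization"


@dataclass
class AgentName:
    name: str
    name_type: str  # hypothetical type label, e.g. "full name"


@dataclass
class AgentLink:
    url: str
    link_type: str  # e.g. "ORCID", "VIAF", "Wikipedia"


@dataclass
class AgentRelation:
    relation_type: str  # e.g. "parent/child", "teacher/student"
    related_agent_uuid: str


@dataclass
class Agent:
    uuid: str
    agent_type: AgentType
    names: List[AgentName] = field(default_factory=list)
    links: List[AgentLink] = field(default_factory=list)
    relations: List[AgentRelation] = field(default_factory=list)
    # Patterns for collector/field number series (assumed here to be strings).
    number_patterns: List[str] = field(default_factory=list)
    # Only meaningful for teams: uuids of the member agent records.
    member_uuids: List[str] = field(default_factory=list)
    biography: Optional[str] = None
    taxonomic_specialties: List[str] = field(default_factory=list)
    deposited_in: List[str] = field(default_factory=list)


# One individual agent with one name and one typed external link.
franz = Agent(
    uuid="244d4fba-bc75-4cc3-a1ea-aa037d55e532",
    agent_type=AgentType.INDIVIDUAL,
    names=[AgentName("Nico Franz", "full name")],
    links=[AgentLink("http://orcid.org/0000-0001-7089-7018", "ORCID")],
    biography="Weevil (Coleoptera: Curculionoidea) evolutionary "
              "biologist and systematist.",
)
```

Keeping names, links, and relationships as separate typed lists is what
lets each agent carry an arbitrary number of them.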

Drawing from Arctos/MCZbase, agent records can be tagged as bad
duplicates of another agent record and merged, and agent records can
be flagged as curated (allowing for deduplication of agent records into
a marked authoritative record).  Agents can also be flagged as Not
Otherwise Specified, allowing a "Sowerby" not otherwise specified record
to be linked to data records where it is known that one of the Sowerby
family was the collector, but it isn't known if it is Sowerby I,
Sowerby II, Sowerby III, etc.
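The flags described above could be modelled along these lines.  This is
only a sketch under assumed names (AgentRecord, bad_duplicate_of,
resolve, and the numeric ids are all hypothetical); it is not how
Arctos, MCZbase, or Symbiota actually implement merging.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AgentRecord:
    agent_id: int
    name: str
    curated: bool = False
    not_otherwise_specified: bool = False
    # Id of the record this one is a bad duplicate of, if any.
    bad_duplicate_of: Optional[int] = None


def resolve(agents: Dict[int, AgentRecord], agent_id: int) -> AgentRecord:
    """Follow bad-duplicate pointers to the record that should be used."""
    seen = set()
    record = agents[agent_id]
    while record.bad_duplicate_of is not None:
        if record.agent_id in seen:
            raise ValueError("cycle in bad-duplicate pointers")
        seen.add(record.agent_id)
        record = agents[record.bad_duplicate_of]
    return record


agents = {
    1: AgentRecord(1, "Sowerby I", curated=True),
    2: AgentRecord(2, "Sowerby I", bad_duplicate_of=1),
    3: AgentRecord(3, "Sowerby", not_otherwise_specified=True),
}

authoritative = resolve(agents, 2)  # the curated record with id 1
unspecified = resolve(agents, 3)    # the NOS record resolves to itself
```

A data record linked to agent 2 ends up at the curated record, while a
record linked to the "Sowerby" NOS agent stays there until someone can
say which member of the family was meant.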

> You will also note that GUIDs are assigned to each agent. The
> AppleCore group are recommending this Index as a controlled
> vocabulary for authors and collectors associated with specimen
> records (note that this group is actively working on the AppleCore
> best practice again; new home to be announced soon).

The Harvard List of Botanists is currently capable of pointing at one
external link for an agent (e.g. a VIAF id, or an ORCID id, or a
Wikipedia page).  Symbiota agents can be linked to an arbitrary number
of typed external identifiers.  

Here's a record in Symbiota with a link to an ORCID id:
http://symbiota4.acis.ufl.edu/scan/portal/agents/agent.php?uuid=244d4fba-bc75-4cc3-a1ea-aa037d55e532

If you request this record with an Accept header of text/turtle or
application/rdf+xml you will get a machine-readable RDF response
instead of the human-readable HTML.

<urn:uuid:244d4fba-bc75-4cc3-a1ea-aa037d55e532>
   foaf:isPrimaryTopicOf <http://orcid.org/0000-0001-7089-7018> ;
   a foaf:Person ; 
   bio:biography "Weevil (Coleoptera: Curculionoidea) evolutionary biologist and systematist." ;
   foaf:name "Dr. Nico Franz *" .
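For anyone who wants to try the content negotiation described above,
here is a minimal sketch using only the Python standard library.  The
helper names rdf_request and fetch_rdf are hypothetical, and the sketch
assumes the server is reachable at the record address given above and
returns UTF-8 text; neither is checked here.

```python
from urllib.request import Request, urlopen

AGENT_URL = (
    "http://symbiota4.acis.ufl.edu/scan/portal/agents/agent.php"
    "?uuid=244d4fba-bc75-4cc3-a1ea-aa037d55e532"
)


def rdf_request(url: str, media_type: str = "text/turtle") -> Request:
    """Build a GET request that asks for RDF instead of HTML."""
    return Request(url, headers={"Accept": media_type})


def fetch_rdf(url: str, media_type: str = "text/turtle",
              timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urlopen(rdf_request(url, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_rdf does.
request = rdf_request(AGENT_URL, "application/rdf+xml")
```

Calling fetch_rdf(AGENT_URL) would ask for Turtle; passing
"application/rdf+xml" as the second argument asks for RDF/XML instead.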

Both the Harvard List of Botanists and an indexed snapshot of a (short)
curated list of entomologists in the SCAN Symbiota instance are used in
the Kurator project as data sources for a workflow actor that compares
collecting event dates in occurrence records with birth/death or
start/end dates for collectors.  

http://wiki.datakurator.net/web/FP-Akka_User_Documentation#Date_Validator
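The kind of comparison such an actor makes can be sketched as follows.
This is not the Kurator code; the function name and the three-way
return value are assumptions made for illustration, and the collecting
event dates in the example are invented.

```python
from datetime import date
from typing import Optional


def collected_within_lifespan(event_date: date,
                              birth: Optional[date],
                              death: Optional[date]) -> Optional[bool]:
    """Check a collecting event date against a collector's known dates.

    Returns True if the event is consistent with the known dates, False
    if it falls before birth or after death, and None if neither date
    is known, so no judgement can be made.
    """
    if birth is None and death is None:
        return None
    if birth is not None and event_date < birth:
        return False
    if death is not None and event_date > death:
        return False
    return True


# Carl Linnaeus: born 23 May 1707, died 10 January 1778.
LINNAEUS_BIRTH = date(1707, 5, 23)
LINNAEUS_DEATH = date(1778, 1, 10)

plausible = collected_within_lifespan(
    date(1732, 6, 1), LINNAEUS_BIRTH, LINNAEUS_DEATH)
suspect = collected_within_lifespan(
    date(1800, 1, 1), LINNAEUS_BIRTH, LINNAEUS_DEATH)
unknown = collected_within_lifespan(date(1800, 1, 1), None, None)
```

Returning None rather than True when no dates are known keeps "not
checked" distinct from "checked and consistent".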

Building, curating, linking, and sharing lists of agents who have been
involved in natural science collections has lots of potential.
Curating this data, however, takes a lot of effort.  

-Paul
-- 
Paul J. Morris
Biodiversity Informatics Manager
Harvard University Herbaria/Museum of Comparative Zoölogy
mole at morris.net  AA3SD  PGP public key available

