[Nhcoll-l] collector & determiner identities

Shorthouse, David davidpshorthouse at gmail.com
Thu Feb 18 09:46:42 EST 2016


All,

For the past several months, I have been experimenting with the
reconciliation of collector and determiner names in digitized specimen
data, using the Canadensys network of aggregated records as a small
case study.

As you may well know, the Darwin Core terms recordedBy, identifiedBy,
and indeed scientificNameAuthorship contain people names. To my
knowledge, no one has tried to reveal the human effort and the
implicit social networks using content in these terms, perhaps because
the expected content and format of these terms are so under-specified.
Maiden and married names, nicknames, variously abbreviated given
names, etc. are but a few examples that make ad-hoc reconciliation
heuristics & algorithms very difficult to do well. You can explore my
first, albeit naive, experiments at http://collector.shorthouse.net .
There's potential here, but the next logical step is to lift this up
to something like a Darwin Core extension such that data managers at
the source have a mechanism to unambiguously link & share each of the
one-to-many, specimen-to-name pairs to human identity.
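
To illustrate why ad-hoc heuristics struggle with these strings, here
is a minimal Python sketch. It is not the code behind the experiment
linked above; the function name_key and the sample names are
hypothetical, chosen only for illustration. It reduces each name to a
crude (family name, first initial) key, which is roughly what a first
attempt at reconciliation tends to do.

```python
import re
import unicodedata


def name_key(raw):
    """Reduce a person name to a crude (family name, first initial) key.

    Hypothetical helper for illustration only; real reconciliation
    needs far more context than the name string itself.
    """
    # Strip accents so that "Émile" and "Emile" compare equal.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))

    # Accept either "Family, Given" or "Given Family".
    if "," in text:
        family, given = [part.strip() for part in text.split(",", 1)]
    else:
        parts = text.split()
        if not parts:
            return ("", "")
        family, given = parts[-1], " ".join(parts[:-1])

    # Keep only letters of the family name and the first given initial.
    family = re.sub(r"[^a-z]", "", family.lower())
    initial = given[:1].lower()
    return (family, initial)


# Variant spellings of one (made-up) person collapse to the same key.
variants = ["Smith, Jane A.", "J.A. Smith", "Jane Smith"]
print({name_key(v) for v in variants})

# Two different (made-up) people are wrongly merged by the same key.
print(name_key("Smith, Jane") == name_key("Smith, John"))

# A maiden/married name pair is missed entirely.
print(name_key("Jane Smith") == name_key("Jane Jones"))
```

The key merges abbreviated and spelled-out forms of one name, but it
also merges different people who share a family name and initial, and
it cannot connect a maiden name to a married name. Errors of both
kinds are why an explicit link from each name to a human identity,
made at the source, would be more reliable than inference from the
strings alone.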

I am writing to inquire if anyone knows of any best practices guides
on how museum staff *ought* to record the names of collectors,
determiners, and other agents. Do any of you have an *agents* table in
your database? Have you attempted to link people names in your
specimen databases to their unambiguous identities and, by extension,
to their scientific outputs such as published datasets & papers? Does
anyone yet record ORCIDs (http://orcid.org/) for this purpose? Last,
does the ordering of determiner or collector names on labels carry
any semantic meaning, as it does for papers? That's something I have
not yet considered and, quite frankly, it scares me if it is important.
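
For anyone who does store ORCIDs alongside agent names, a small check
at data entry can catch mistyped identifiers. The sketch below is an
assumption about how one might do this, not an established practice
in any collection database; the function names are hypothetical. It
applies the ISO 7064 MOD 11-2 checksum that ORCID uses for the final
character of an iD.

```python
import re


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the string is a well-formed iD with a valid checksum.

    Hypothetical helper: expects the bare form 0000-0000-0000-0000,
    not the full URL.
    """
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


# Sample iD taken from ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
# The same iD with its last digit mistyped.
print(is_valid_orcid("0000-0002-1825-0098"))
```

A passing checksum only shows that the identifier is well formed; it
does not show that it belongs to the collector or determiner named on
the label.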

Hope this generates some discussion,

David P. Shorthouse

