[Nhcoll-l] collector & determiner identities

Bentley, Andrew Charles abentley at ku.edu
Thu Feb 18 13:50:16 EST 2016


David

Here at KU we use Specify software, in which a single agent table handles all people, organizations and groups in a similar manner.  There are individual fields for first, middle and last name as well as suffix, prefix and numerous other fields that describe an agent (http://specify6.specifysoftware.org/schema/Agent.html).  There is also the facility to add attachments (images etc.) to an agent record, and a variant table that allows synonymy across the agent table (Andy Bentley, Andrew Bentley etc.).  All fields in the database that refer to people - collectors, determiners, catalogers, loaners, gifters etc. - link to this single agent table, so a person's name is entered once and linked to all these individual tables.  Multiple agents can be entered for a number of these relationships.
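
As a rough illustration of the single agent table plus variant table idea, here is a minimal Python sketch.  It is not Specify's actual schema: the class, field and function names (Agent, build_variant_index, resolve) are made up for the example, and the specimen identifier is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Agent:
    """One person, organization or group (hypothetical, simplified fields)."""
    agent_id: int
    first_name: str = ""
    middle_name: str = ""
    last_name: str = ""
    prefix: str = ""
    suffix: str = ""
    # Alternative spellings that should resolve to this same agent.
    variants: List[str] = field(default_factory=list)

    def full_name(self) -> str:
        parts = [self.prefix, self.first_name, self.middle_name,
                 self.last_name, self.suffix]
        return " ".join(p for p in parts if p)


def build_variant_index(agents: Iterable[Agent]) -> Dict[str, int]:
    """Map every known spelling (case-insensitively) to one agent_id."""
    index: Dict[str, int] = {}
    for agent in agents:
        for name in [agent.full_name(), *agent.variants]:
            index[name.casefold()] = agent.agent_id
    return index


def resolve(index: Dict[str, int], name: str) -> Optional[int]:
    """Return the agent_id for a spelling, or None if it is unknown."""
    return index.get(name.strip().casefold())


agents = [
    Agent(1, first_name="Andrew", middle_name="Charles", last_name="Bentley",
          variants=["Andy Bentley", "Andrew Bentley"]),
]
index = build_variant_index(agents)

# Every role points at agent ids rather than free-text names, and each
# relationship can hold more than one agent.
collectors = {"specimen-001": [1]}
determiners = {"specimen-001": [1]}
```

With this layout, resolve(index, "Andy Bentley") and resolve(index, "Andrew Bentley") both return the same agent id, while an unknown spelling returns None and can be flagged for review.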

We too do not have any external linkage to an agent reference table.  The only group I am aware of that has such a thing is herbaria, but it would be a great idea to have an external reference table of agent names that could be used to resolve issues of synonymy and ambiguity - of which there are probably many.  Some of this ambiguity is impossible to resolve, as there is simply not enough data to validate the agent in question, but other cases could be fairly easily "repaired".  Aggregators, I think, could play a vital role in this kind of endeavor, either for individual disciplines or for the enterprise as a whole.  I for one would love to see something like this developed for fishes - along with an indication of where the field notes for each person are held, which would be even more useful for numerous reasons.

In terms of best practices, I think the general databasing best practice of only including one piece of information in each field is a good one - so that you do not have a string of information in a general collector field, for instance.  However, when sharing your data using Darwin Core, these one-to-many relationships require that providers concatenate all this information into a string to comply with the single-field model of Darwin Core.  This, I fear, is what is causing some of these issues.
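
To make the concatenation step concrete, here is a small Python sketch.  It assumes the " | " separator that the Darwin Core term documentation recommends for lists of values; the function names and sample names are made up for the example, and real exports may well use other delimiters.

```python
from typing import List

SEPARATOR = " | "  # assumed delimiter; providers do not all use the same one


def to_recorded_by(names: List[str]) -> str:
    """Flatten several collector names into one recordedBy-style string."""
    return SEPARATOR.join(n.strip() for n in names if n.strip())


def from_recorded_by(value: str) -> List[str]:
    """Split a flattened string back into names (agent ids are not recoverable)."""
    return [part.strip() for part in value.split("|") if part.strip()]


# In the source database each collector is a separate, linked agent record.
collectors = ["A. C. Bentley", "D. P. Shorthouse"]

flat = to_recorded_by(collectors)
print(flat)                    # A. C. Bentley | D. P. Shorthouse
print(from_recorded_by(flat))  # ['A. C. Bentley', 'D. P. Shorthouse']
```

The names survive the round trip only if everyone agrees on the delimiter, and the link back to a specific agent record is gone once the string is built.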

Hope that helps

Andy

    A  :             A  :             A  :
 }<(((_°>.,.,.,.}<(((_°>.,.,.,.}<)))_°>
    V                V                V
Andy Bentley
Ichthyology Collection Manager
University of Kansas
Biodiversity Institute
Dyche Hall
1345 Jayhawk Boulevard
Lawrence, KS, 66045-7561
USA

Tel: (785) 864-3863
Fax: (785) 864-5335 
Email: abentley at ku.edu  
http://ichthyology.biodiversity.ku.edu

SPNHC President
http://www.spnhc.org


-----Original Message-----
From: nhcoll-l-bounces at mailman.yale.edu [mailto:nhcoll-l-bounces at mailman.yale.edu] On Behalf Of Shorthouse, David
Sent: Thursday, February 18, 2016 8:47 AM
To: nhcoll-l at mailman.yale.edu
Subject: [Nhcoll-l] collector & determiner identities

All,

For the past several months, I have been experimenting with the reconciliation of collector and determiner names in digitized specimen data, using the Canadensys network of aggregated records as a small case study.

As you may well know, the Darwin Core terms recordedBy, identifiedBy, and indeed scientificNameAuthorship contain people names. To my knowledge, no one has tried to reveal the human effort and the implicit social networks using content in these terms, perhaps because the expected content and format of these terms is so under-specified.
Maiden and married names, nicknames, variously abbreviated given names, etc. are but a few examples that make ad-hoc reconciliation heuristics & algorithms very difficult to do well. You can explore my first, albeit naive, experiments at http://collector.shorthouse.net.
There's potential here, but the next logical step is to lift this up to something like a Darwin Core extension such that data managers at the source have a mechanism to unambiguously link & share each of the one-to-many, specimen-to-name pairs to human identity.

I am writing to inquire if anyone knows of any best practices guides on how museum staff *ought* to record the names of collectors, determiners, and other agents. Do any of you have an *agents* table in your database? Have you attempted to link people's names in your specimen databases to their unambiguous identities and, by extension, to their scientific outputs like datasets & papers published? Does anyone yet record ORCIDs (http://orcid.org/) for this purpose? Last, does the ordering of determiner or collector names on labels carry any semantic meaning, as it does for papers? That's something I have not yet considered & quite frankly it scares me if this is important.

Hope this generates some discussion,

David P. Shorthouse
_______________________________________________
Nhcoll-l mailing list
Nhcoll-l at mailman.yale.edu
http://mailman.yale.edu/mailman/listinfo/nhcoll-l

_______________________________________________
NHCOLL-L is brought to you by the Society for the Preservation of Natural History Collections (SPNHC), an international society whose mission is to improve the preservation, conservation and management of natural history collections to ensure their continuing value to society. See http://www.spnhc.org for membership information.
Advertising on NH-COLL-L is inappropriate.

