[Nhcoll-l] global unique identifiers and natural history collections

Robert Guralnick Robert.Guralnick at colorado.edu
Sun Oct 14 22:37:19 EDT 2012


  Couple quick thoughts, Jim:  1)  Yes I am a fan of opacity - simply
a unique identifier.  2)  DOIs work now.  We don't have to wait for a
magical solution to appear.  3) Persistence is more than the unique
identifier itself, it's also about the services, so uuids are no
guarantee of anything and do not meet the guidelines we mention in the
blog post --- <10 year persistence.  4)  More than that, the solutions
we are considering from the California Digital Library, EZIDs (which
utilize DOIs and ARKs), were built around data life cycle management
needs, and there are very interesting, new approaches being developed
that aren't simply "unique id and be done".  There are ways to point
to updating content, to related content, even to embed metadata.  We
shouldn't reinvent solutions and we should use well understood, well
used, and pragmatic solutions, now.

Best, Rob


On Sun, Oct 14, 2012 at 8:18 PM, Jim Croft <jim.croft at gmail.com> wrote:
> Ah, resolvability. This is where the LSID thing came unstuck in an
> attempt to embed human readability and resolvability into the
> identifier, to the detriment of the basic principle of identifier
> opacity.
>
> So Rob, are you after a unique human readable resolvable identifier,
> with extra embedded wholesome goodness, or just a unique identifier
> that can be used for machine to machine communication? If the latter,
> why not just grab a random UUID and take the risk of hitting a
> duplicate sometime in the next millennium?  Persistent, unique (well,
> more or less), unambiguous, secure, butt-ugly and unintelligible to
> humans, free, open, organizationally agnostic, non-proprietary and
> importantly, a standard in computing circles.
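
A minimal sketch of the "grab a random UUID" option described above, using
Python's standard uuid module.  The helper name mint_identifier is
hypothetical, and nothing here makes the identifier resolvable or
persistent; that would still depend on a separate service.

```python
import uuid


def mint_identifier() -> str:
    """Return a random (version 4) UUID in its canonical 36-character form."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    # The value is opaque: it says nothing about institution or collection.
    print(mint_identifier())
```

Each call returns a different value, so the output is not reproducible by
design.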
>
> If you want resolvability, it is not hard to imagine a googlesque
> megaharvester building the mother of all indexes to tell us where
> stuff is.
>
> If you want pretty and human readability, well, that is what metadata
> is for...  :)
>
> jim
>
> On Mon, Oct 15, 2012 at 11:36 AM, Robert Guralnick
> <Robert.Guralnick at colorado.edu> wrote:
>>   Hi all --- Others did a great job of covering the problem with using
>> a triple of institution code, collection code and collection number.
>>  The issues of resolvability and persistence are really important.
>> When you get a unique identifier for a record, you want to be able to
>> _do_ something with it.   Something really simple.  Digital Object
>> Identifiers are now minted for a couple different purposes.  They are
>> associated with published papers, and they are associated with
>> datasets in repositories such as Dryad.  For example, here is one:
>> 10.3897/zookeys.209.3699, which resolves to a paper by Vince Smith and
>> Vladimir Blagoderov titled "Bringing collections out of the dark".
>> If you go here:  http://dx.doi.org/ and enter that DOI, voila!  You
>> get back to the paper.
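
The lookup described above can be sketched in a few lines of Python.  This
assumes only that the resolver accepts the DOI appended to the address given
in the message; the function name doi_to_url and the sanity check are
illustrative, not part of any DOI tooling.

```python
from urllib.parse import quote

RESOLVER = "http://dx.doi.org/"  # resolver address quoted in the message


def doi_to_url(doi: str) -> str:
    """Build a resolver URL by appending the DOI to the resolver address."""
    doi = doi.strip()
    # Rough sanity check only: DOIs start with "10." and have a "/" between
    # prefix and suffix.  This is not a full syntax validation.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separators.
    return RESOLVER + quote(doi, safe="/")


if __name__ == "__main__":
    print(doi_to_url("10.3897/zookeys.209.3699"))
    # prints http://dx.doi.org/10.3897/zookeys.209.3699
```

The sketch only builds the link; following it is left to a browser or HTTP
client.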
>>
>> What is handy is DOIs associated with published journal articles are
>> now incorporated into electronic "references cited" sections, so you
>> can immediately go from a cited reference to the document at the
>> journal home.  There are similar use cases we see in the short term
>> and long term where immediately resolving a DOI to a specimen or its
>> specimen record would be _very_ handy indeed.  All that is needed is
>> that DOI placed in a specimen record.  It doesn't replace any other
>> information!  It just rides along with the other data you have in a
>> database.
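
One way to picture the DOI "riding along" is as one extra, optional field on
an existing record.  The sketch below is an assumption-laden illustration:
the class, field names, codes, and the DOI value are all made-up
placeholders, not a real schema or a registered identifier.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SpecimenRecord:
    # Hypothetical field names standing in for the traditional triple.
    institution_code: str
    collection_code: str
    catalog_number: str
    doi: Optional[str] = None  # extra field; empty until an identifier is assigned


# Placeholder values for illustration only.
record = SpecimenRecord("XYZ", "Herps", "12345")

# Adding the identifier leaves every existing field untouched.
with_doi = replace(record, doi="10.9999/placeholder.12345")

if __name__ == "__main__":
    print(record)
    print(with_doi)
```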
>>
>> What is also great about DOIs is that they are probably a long term
>> solution.  The publishing industry uses them and the services for
>> resolving are there.   Although it has taken me a long time to really
>> get (or get into) this whole business of unique identifiers in the
>> context of the World Wide Web, they are beginning to really become
>> pervasive as they get more and more often attached to datasets, and
>> ultimately, I think, to data records.
>>
>> So what still might be confusing is what is an EZID versus a DOI or
>> other identifiers.  Simply put, there is a value to work with service
>> providers that have expertise with global unique identifiers.  EZIDs
>> help with redistribution of DOIs (and also another form of identifier
>> called ARKs that have some useful properties).    Those service
>> providers might be closer to the community and be able to help work
>> out some issues for community X that are different from community Y.
>> Documents aren't datasets or data records, so what works for
>> publishing journal articles doesn't necessarily work for us in the
>> collections community.  The value of working with a chain of providers
>> is that maybe we can leverage the good parts of their services while
>> tailoring how it works to be community specific.  At least that is the
>> hope.
>>
>> I am still working out how best to explain a lot of this, but
>> appreciate the great followups and excellent questions.  I feel like
>> we might get someplace if discussions keep happening.  Thanks for
>> taking the time to look stuff over and sending along great questions.
>>
>> Best, Rob
>> https://sites.google.com/site/robgur/
>>
>> On Sun, Oct 14, 2012 at 4:20 PM, Cellinese,Nico
>> <ncellinese at flmnh.ufl.edu> wrote:
>>> Indeed, uniqueness is crucial! Additionally, IDs need to be persistent and
>>> resolvable.  None of these can be guaranteed by self-minted GUIDs that are
>>> created using institution acronyms.
>>>
>>> Nico
>>>
>>>
>>> On Oct 14, 2012, at 6:09 PM, <CSTURMJR at pitt.edu>
>>>  <CSTURMJR at pitt.edu> wrote:
>>>
>>> Curtis,
>>>
>>> One problem that comes to mind is CMNH as an identifier.
>>> I have seen this used for:
>>> Carnegie Museum of Natural History ( my preference!)
>>> Cleveland Museum of Natural History
>>> Cincinnati Museum of Natural History
>>>
>>> It could also be used for:
>>> Colorado (University) Museum of Natural History
>>> Canadian Museum of Natural History.
>>>
>>> Thus, one would have to standardize museum acronyms.
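
The collision described above can be shown with a short Python sketch.  The
three institution names come from the message; the catalog number 1234 and
the function names are made up for illustration.

```python
from collections import defaultdict

# Institutions the message reports as having used the acronym "CMNH".
CMNH_USERS = [
    "Carnegie Museum of Natural History",
    "Cleveland Museum of Natural History",
    "Cincinnati Museum of Natural History",
]


def acronym_id(acronym: str, catalog_number: int) -> str:
    """Traditional identifier: museum acronym followed by a number."""
    return f"{acronym} {catalog_number}"


def find_collisions(records):
    """Return identifiers that are claimed by more than one institution.

    `records` is an iterable of (institution, acronym, catalog_number).
    """
    by_id = defaultdict(list)
    for institution, acronym, number in records:
        by_id[acronym_id(acronym, number)].append(institution)
    return {ident: owners for ident, owners in by_id.items() if len(owners) > 1}


if __name__ == "__main__":
    records = [(name, "CMNH", 1234) for name in CMNH_USERS]
    for ident, owners in find_collisions(records).items():
        print(ident, "->", owners)
```

All three museums produce the same string "CMNH 1234", so the acronym-based
identifier alone cannot say which specimen is meant.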
>>>
>>> Forgive my ignorance, as I'm new to the collections world, but could
>>> someone please provide more detail about what you are talking about
>>> exactly? What is wrong with the use of museum acronyms followed by
>>> numbers? Or...am I missing something? Aren't these "global unique
>>> identifiers"? What are the drawbacks to using these in the traditional
>>> manner? Also, how feasible would it be for all the collections to
>>> essentially renumber their entire collections to participate in this new
>>> system? Please help me understand what this discussion is about.
>>>
>>> Thanks!
>>>
>>> Curtis
>>>
>>> ______________________________
>>> Curtis J. Schmidt
>>> Zoological Collections Manager
>>> Sternberg Museum of Natural History
>>> Fort Hays State University
>>> 3000 Sternberg Drive
>>> Hays, KS  67601
>>> (785) 628-5504 (collections)
>>> (785) 650-2447 (cell)
>>> ______________________________
>>>
>>> -----nhcoll-l-bounces at mailman.yale.edu wrote: -----
>>> To: "Bentley, Andrew Charles" <abentley at ku.edu>
>>> From: Robert Guralnick <robert.guralnick at colorado.edu>
>>> Sent by: nhcoll-l-bounces at mailman.yale.edu
>>> Date: 10/14/2012 01:33PM
>>> Cc: tomc at cs.uoregon.edu, "NH-COLL listserv
>>> (nhcoll-l at mailman.yale.edu)" <nhcoll-l at mailman.yale.edu>,
>>> John Deck <jdeck at berkeley.edu>, Nico Cellinese
>>> <ncellinese at flmnh.ufl.edu>
>>> Subject: Re: %5
>>>
>>>
>>>
>>> <><><><><><><><><><><><><><><><><><><><><><><><>
>>>
>>> Nico Cellinese, Ph.D.
>>> Assistant Curator, Botany & Informatics
>>> Joint Assistant Professor, Department of Biology
>>>
>>> Florida Museum of Natural History
>>> University of Florida
>>> 354 Dickinson Hall, PO Box 117800
>>> Gainesville, FL 32611-7800, U.S.A.
>>> Tel. 352-273-1979
>>> Fax 352-846-1861
>>> http://cellinese.blogspot.com/
>>>
>>>
>>>
>> _______________________________________________
>> Nhcoll-l mailing list
>> Nhcoll-l at mailman.yale.edu
>> http://mailman.yale.edu/mailman/listinfo/nhcoll-l
>
>
>
> --
> _________________
> Jim Croft ~ jim.croft at gmail.com ~ +61-2-62509499 ~ http://about.me/jrc
> 'Without the freedom to criticize, there is no true praise.'
> - Pierre Beaumarchais
> 'Whenever you find yourself on the side of the majority, it's time to
> pause and reflect.'
> - Mark Twain
> 'A civilized society is one which tolerates eccentricity to the point
> of doubtful sanity.'
>  - Robert Frost

