[NHCOLL-L:2027] Re: Lack of voucher numbers for sequences.

Una Smith una at lanl.gov
Mon Aug 11 11:34:31 EDT 2003


A subscriber to NHColl-L wrote:
>No, all they need to do is make it easier to be able to enter this
>information and have specific fields to enter it

It already is easy to enter this information, in specific fields!


>If Genbank could make these fields available we could amend this
>to include voucher information too.

Each research community requires different custom fields.  Should
GenBank provide a different custom submission interface for each
one?  As I said before, GenBank already provides for custom fields,
in the source sub-block of the FEATURES block.

GenBank already requires unique source modifiers, such as voucher
names.  However, the relevant field names and other such details
are determined by the submitter.  For examples, see 
	http://www.ncbi.nlm.nih.gov/BankIt/examples/requirements.html
On the left side of the page are links to examples for many types
of data.  The example for HIV-1 includes these source fields:
	/organism=""
	/clone=""
	/isolate=""
	/country=""
These fields are USER-defined, NOT pre-defined by GenBank;  these
fields are defined by convention among HIV researchers, who are
highly motivated to use these fields so that their sequences will
be richly annotated in the HIV Sequence Database (for one view, 
see http://hiv-web.lanl.gov/content/hiv-db/combined_search/search).
The HIV Sequence Database receives batch updates from GenBank and
parses them, including any custom fields in the FEATURES block,
into a database that serves the entire HIV research community.
(The parsing requires a lot of attention by us at LANL!)
Some HIV researchers include fields to report patient sex, patient
age, patient infection risk factor(s), etc.:  any type of data
that we now annotate in the HIV Sequence Database.  If many HIV
researchers begin to include a new type of data in their GenBank
records, we will begin to parse and annotate it into the HIV
Sequence Database.
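
To make the parsing step concrete, here is a minimal sketch in
Python of pulling the source fields out of a record.  This is not
the code we run at LANL.  The function name source_qualifiers, the
sample record, and all of its values are invented for illustration;
only the four field names come from the HIV-1 example above.  The
sketch assumes each qualifier fits on one line in the form
/name="value", and it skips anything that does not.

```python
import re

# Hypothetical record fragment; every value here is invented.
SAMPLE = '''FEATURES             Location/Qualifiers
     source          1..1200
                     /organism="Human immunodeficiency virus 1"
                     /clone="c12"
                     /isolate="X01"
                     /country="Kenya"
     gene            1..1200
                     /gene="env"
'''

# A feature key line: exactly five spaces, the key, then a location.
FEATURE_KEY = re.compile(r'^ {5}(\S+)\s+\S')
# A single-line qualifier of the form /name="value".
QUALIFIER = re.compile(r'^\s+/(\w+)="([^"]*)"\s*$')


def source_qualifiers(flatfile_text):
    """Return the /name="value" pairs of the first source feature."""
    fields = {}
    in_source = False
    for line in flatfile_text.splitlines():
        key = FEATURE_KEY.match(line)
        if key:
            if in_source:
                break  # the next feature begins, so source is done
            in_source = key.group(1) == "source"
            continue
        if in_source:
            found = QUALIFIER.match(line)
            if found:
                fields[found.group(1)] = found.group(2)
    return fields


if __name__ == "__main__":
    for name, value in source_qualifiers(SAMPLE).items():
        print(name, "=", value)
```

Because the function returns whatever field names it finds, a new
field adopted by a research community shows up in the result with
no change to the parser.  Deciding what each new field means is
the part that still needs human attention.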

Perhaps it is time for other research communities to consider
developing their own conventions for custom fields in GenBank
accession records.

	Una Smith

Los Alamos National Laboratory, MS K-710, Los Alamos, NM  87545

