[Yulcat-l] Report on the Cataloger's Desktop problem 29-30 Sept.

Steven Arakawa steven.arakawa at yale.edu
Fri Oct 14 09:21:34 EDT 2005


Here's some information on the Cataloger's Desktop outage Sept. 29th. Note 
that should an outage occur in the future, if you are not on the Desktop 
list, you may still be able to get status information from the URL given at 
the end of Bruce Johnson's note. Since I'm on the Desktop list, I'll also 
forward any announcements that come in if I'm at my desk. I was out of town 
the 29th but it looks like there would have been nothing to forward then 
with LC e-mail service out of commission. In the future, if there is no 
information from the CDS website or from the list, contact either me or 
Pedro Soto and one of us will get in touch with CDS Technical Support.

Steven Arakawa

Date:         Tue, 11 Oct 2005 11:16:26 -0400
Reply-To: "Cataloger's Desktop Forum" <DESKTOP at loc.gov>
Sender: "Cataloger's Desktop Forum" <DESKTOP at loc.gov>
From: Bruce C Johnson <bjoh at loc.gov>
Subject: Report on September 29-30 service interruption
Comments: To: desktop at loc.gov
To: DESKTOP at sun8.LOC.GOV
Precedence: list

Colleagues:

    A week ago on Thursday, September 29th at approximately 10:15 AM 
eastern time, service from the web version of Cataloger's Desktop was 
interrupted.  Service was partially restored at 9 PM eastern time that 
evening.  Full service was not restored until the following morning.

    A number of events happened concurrently that caused the problem and 
inhibited our ability to keep you up-to-date about the product's 
status.  The root cause was a disk drive failure at our contractor's ISP 
(through which Cataloger's Desktop is distributed over the web).  The drive 
failure bombarded the ISP's name router, which eventually caused the router 
to fail.  Because the ISP had not anticipated this problem, they were not 
immediately prepared to replace the name router, which delayed 
repairs.  Unfortunately, the ISP also failed to communicate with our 
contractor until our contractor diagnosed the problem, at which time the 
ISP admitted that they were the cause of the problem.  Once the name router 
was replaced, service was restored.

    At the same time that this was happening, LC was experiencing a problem 
with its email system.  Most messages were not being received, and none 
were making it out of the Capitol Hill complex for several hours.  Our 
contractor's first email message to us about the problem didn't arrive 
until a full six hours after the problem began.  In the meantime our 
contractor began updating us by telephone with what limited information 
he had, but lack of hard information and over-reliance on our part on email 
and Listserv technology combined to create a situation where you were left 
not knowing what was happening and what was being done about it.

    We had an extended debriefing this morning with our contractor to 
identify causes and solutions and here is what is being done to resolve 
this problem:

1.  Our contractor's ISP now has a fully configured back-up name router 
that they can put into production should the causative problem recur.  They 
have also implemented software that will allow them to track the name 
router's performance and reliability, and they should be able to detect 
this sort of problem nearly the instant that it happens.

2.  Our contractor's ISP has been made to understand that it is essential 
that they do a better job of keeping our contractor informed about system 
performance.  That way, should something of this sort happen again, the 
contractor will know what is causing the problem and what is being done to 
resolve it.

3.  Our contractor now has a fully configured name router that can be 
implemented should their ISP's router fail.

4.  Our contractor will both email and call CDS to alert us as soon as 
there is a service interruption of any sort, and will update us as often as 
they can regarding the progress for resolving the interruption.

5.  CDS will communicate system interruption information both by way of the 
Desktop discussion list ("Listserv") and through a notice on the 
Cataloger's Desktop page on the CDS website <www.loc.gov/cds/desktop/>.

    Should you experience a system interruption in the future, please 
consult the Cataloger's Desktop page on the CDS website first, and also 
check your email inbox.  If you find nothing in either place, please 
contact CDS Technical Support at desktop-info at loc.gov or 202-707-1260.

    We truly regret any inconvenience the service interruption of last 
month may have caused and hope that through these changes we will avoid 
future interruptions, and keep you better informed.  Please let us know if 
you have any questions or concerns about this.

Thank you very much for your support,

Bruce

Bruce Chr. Johnson
President-Elect, Association for Library Collections & Technical Services 
(ALCTS)
a division of the American Library Association

and

Library of Congress
Cataloging Distribution Service
Washington, DC 20540-4911 USA

202-707-1652 (voice)  bjoh at loc.gov
202-707-3959 (fax)

----------------------------------------------------------
Steven Arakawa
Catalog Librarian for Training & Documentation
Catalog Dept. Sterling Memorial Library. Yale University.
P.O. Box 208240 New Haven, CT 06520-8240
(203)432-8286 steven.arakawa at yale.edu
    



