Re: [IUG] URL Checker - who does the work, how long does it take, etc?
- Date: Wed, 28 Jan 2009 15:13:06 -0500
- From: Martha Sanders <msanders at etal dot uri dot edu>
- Subject: Re: [IUG] URL Checker - who does the work, how long does it take, etc?
I do the bulk of the work with the URL checker for the HELIN
Consortium, which consists of 10 academic and 14 hospital
libraries. I work in the Consortium office supporting the libraries
in cataloging, authority control, ERM and other activities.
Over time, you learn which categories of URLs are flagged every week
as problems but are, in fact, fine. A very useful tool in the
Millennium URL Checker module is the 'URL block', which prevents
certain URLs from being checked. Here I list those servers which
ALWAYS show up on the report, such as purl.gpo.gov (because they all
move to another URL), and resources going through our proxy server
that are subscribed to by HELIN member libraries, including ebook
collections (ebrary.com, hdl.handle.net for the ACLS collection,
oxfordreference.com, xreferplus.com, and so on). I currently have
about 150 entries in this file. This is, of course, not ideal, since
some URLs that no longer work may be blocked from being checked, but
it allowed me to reduce the number of flagged URLs from over 10,000 to
a couple of hundred to actually work on. Since most of the
member libraries subscribe to these paid resources through a process
I manage, I know when a new resource is added or cancelled.
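A block file like this can be approximated outside Millennium with a short script. This is only a sketch of the idea, not the Millennium feature itself: I'm assuming the block file amounts to a list of host names to skip, and the host names below are simply the ones mentioned above.

```python
from urllib.parse import urlparse

# Hosts that always appear on the weekly report but are known to be fine.
# (Names taken from the message above; the matching rule is an assumption.)
URL_BLOCK = {
    "purl.gpo.gov",
    "ebrary.com",
    "hdl.handle.net",
    "oxfordreference.com",
    "xreferplus.com",
}

def is_blocked(url: str) -> bool:
    """Return True if the URL's host is on the block list, so it is skipped."""
    host = urlparse(url).hostname or ""
    return host in URL_BLOCK or any(host.endswith("." + h) for h in URL_BLOCK)

urls = [
    "http://purl.gpo.gov/GPO/LPS12345",      # invented example paths
    "http://www.example.edu/finding-aid",
]
to_check = [u for u in urls if not is_blocked(u)]
```

Filtering by host rather than full URL is what makes one entry suppress a whole server's worth of weekly false positives.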
For us, the checker runs automatically in the early hours of Tuesday
morning. Each week, I move the file of bibliographic records
generated by this process into a review file for further work. I
sort this file a couple of different ways (by cat. date, material
type, bib. level) to see what shows up for the first time. I can
often resolve several of the ones for bibs. on order, videos, and
serials without going into the URL Checker module itself. Libraries
may create new review files from this one for their own bib. records
to check them. Some do, some don't.
Most of the questionable URLs in my report are for US government
documents, probably not surprisingly. Several of our libraries, but
not all, use Marcive to receive bib. records for their
repositories. I asked Marcive to send us only the PURLs when a bib.
record contains both a PURL and the actual URL; the latter changes often.
About once a month I use the URL Checker module itself, where I look
most closely at bibs. with 404 (not found) and 301 (moved
permanently) errors. For those web pages where I find an exact match
for the resource cataloged, I correct the URL myself. For resources
where I cannot reasonably and easily identify the correct new URL, I
send an email to the head of cataloging for that library. That
person may consult with the selector/liaison, find the new site for
the resource themselves, or delete their holdings from OCLC and the
bib. record from the catalog. I spend about 30 minutes weekly and 2
hours once a month doing this work.
For several of the reports (moved permanently, moved temporarily, not
found, etc.) the program identifies a possible new URL. If that URL
is the correct one, you can check it off and use the program to
replace the URL in the bib. record. One issue I have is that the
bib. record itself isn't displayed unless you highlight it and click
Edit or bring it up on the web; it's crucial to check the
bibliographic information before making any changes to the URL.
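The triage described above — flagging 404s and 301s, and capturing the server's suggested new URL — can be sketched as a generic HTTP check. To be clear, this is not the Millennium module; the stand-in server, paths, and replacement URL below are invented purely to show how a 301's Location header plays the role of the report's "possible new URL".

```python
import http.server
import threading
import urllib.error
import urllib.request

class StandInHandler(http.server.BaseHTTPRequestHandler):
    # Invented stand-in server: /moved answers 301 with a Location
    # header (the "possible new URL"); anything else answers 404.
    def do_GET(self):
        if self.path == "/moved":
            self.send_response(301)
            self.send_header("Location", "http://example.org/new-home")
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, *args, **kwargs):
        return None  # surface 301s as errors instead of silently following them

opener = urllib.request.build_opener(NoRedirect)

def check_url(url):
    """Return (status, suggested_new_url), the way the weekly report does."""
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status, None
    except urllib.error.HTTPError as e:
        return e.code, e.headers.get("Location")

server = http.server.HTTPServer(("127.0.0.1", 0), StandInHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_address[1]
moved = check_url(base + "/moved")   # (301, "http://example.org/new-home")
gone = check_url(base + "/gone")     # (404, None)
server.shutdown()
```

Refusing to follow redirects is the point: a checker that silently followed the 301 would report success and never surface the new URL for a human to verify against the bibliographic record.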
Currently the program only looks at URLs in the bib. record, not in
attached records (item and holdings records) or ERM resource
records. As a Functional Expert for the Innovative Users Group
responsible for reviewing enhancement requests for the URL Checker, I
can say that many users want III to expand the reach of this program to
URLs wherever they appear.
I hope this helps,
Martha Rice Sanders
Knowledge Management Librarian,
The HELIN Consortium
msanders at etal dot uri dot edu
At 07:28 PM 1/7/2009, you wrote:
The UC Berkeley library is halfway through a Millennium implementation
and we are interested in the URL checker module, but we have some
questions. If your library uses the URL checker, we'd be curious to know
a few things about how you use it:
- Who resolves the dead links? Reference staff? Technical Services
staff? Other? It seems like it would take reference skills to hunt down
the new replacement link.
- How do you resolve them? Web searching and seeing how many you can
find? Some systematic approach? Or do you simply remove the dead link
without an attempt to replace it? I know that if there's a redirect, the
product gives you that information in the report.
- Any sense of how many you process (resolve successfully or just
delete) in a given month, as compared to the number of URLs in your
catalog? Any sense of how long this takes (i.e., how many hours a month
your staff devote to this task)?
We've bought but not yet implemented the ERM product also - that comes
at the end of our implementation project. So if there are wrinkles that
pertain to ERM sites I'd be glad to know those too.
UC Berkeley Library Integrated System Manager
This message was distributed through the Innovative Users Group INNOPAC list
Public replies: INNOPAC at innopacusers dot org