For a separate activity I had coded up a dummy finding aid that points to
objects distributed over the web. It is intended to highlight that the
objects representing digital reproductions (with their own behaviors) are
independent of the finding aids. If you have Panorama, you
can pick up the finding aid from
http://lcweb.loc.gov:8081/ndlint/eaddemos/demo1/demo1.sgm
and see what I mean. [I apologize for only having a straight SGML
version, but it actually makes the point about independence better.]
I wondered whether a relatively concrete example might help as an
illustration. Since the URL is in our test area, please do not disseminate
it widely. If it would be useful, I could find a spot on the production
site for it.
Caroline Arms caar@loc.gov
National Digital Library Program
&
Information Technology Services
Library of Congress
==========
Comments from various LC staff on MOA2 White Paper. Comments followed by
an asterisk were made more than once.
Some general comments
=====================
A glossary would have helped (Method, Behavior, Tool, etc.). *
How does this relate to search or navigation of descriptive metadata (MARC
records & finding aids) and to access information for the original items? *
[I had not given people a copy of the whole proposal, and my brief
verbal introduction had obviously not conveyed well enough that this was
largely out of scope for the paper, which concentrates on what you can
do with digital objects once you have found them. However, this
reaction may suggest that material from the page 20 section on
descriptive metadata be brought to the front and "dismissed."]
Go for more generalization and keep mandatory elements to a minimum.
Tools may need to tolerate less than ideal information. [One of the keys
to the success of the web may have been the ability of browsers to
tolerate HTML that is not technically valid.]
A diagram representing the relationships between concepts graphically
could help. The technical note on page 8 could benefit from a picture.
Comments on the concepts
========================
Distinction between the Service Layer and the Tools Layer not understood.
Perhaps need to relate an example of a "digital library object" to what
might today be implemented as separate "files" -- to make it clearer that
you are dealing at a logical level rather than a "physical" implementation
level. [Page 12 might be a place to touch on this.]
Not clear enough that structural metadata is the key to supporting
behaviors and to defining or distinguishing classes of objects. *
Object-oriented focus as mechanism for user-centered design is helpful,
particularly the correspondence (but distinction) between behaviors and
methods. An interface development project at LC ran into problems that
might have been resolved if this model had been understood and the
conceptual relationships between behaviors, methods, and structural
metadata had been used.
Unwittingly, American Memory has actually stumbled onto a process that
corresponds to starting with the behaviors that users want to see and
moving back from there to implementation. In your model, these behaviors
must lead to the methods and to the structural metadata that has to be
captured and represented. In our case, the "users" in the design process
have been proxies for the real users (curatorial and reference staff in
special format divisions and NDLP staff from a scholarly background) and
the focus has historically been on what had to be captured at scan time to
stand a chance of supporting desired behaviors. The methods and tools to
implement those behaviors have probably not been given enough thought up
front. Recently, the need to involve the programmers building the tools
(e.g. for presenting a book-length work) before the structural metadata
(in this particular case, tagging structures in SGML) is encoded has been
recognized.
LC's experience with a very heterogeneous set of materials (and trying to
apply experience and models to other applications) leads to a wish to be
VERY GENERAL and distinctly MINIMALIST as far as classes of objects and
mandatory metadata elements are concerned. To the extent that the need to
categorize digital objects into classes has been accepted, the wish is to
make these classes independent of format and genre of original materials,
and as general as possible. As we struggle with generalizing our own
procedures (including workflow and contractor requirements as well as the
user/access end of things), we find ourselves wanting to exploit the
commonalities rather than make small distinctions. We hope you would see
the "continuous tone photograph" as an example of a much more generic
class (say, of a single still pictorial item). We have the concept of a
"page-turning" object, although we don't use the term. We use it for
pamphlets, folders of manuscript items, sheet music, etc. Some of us
have found ourselves believing that audio and video can both be considered
as examples of a class of "time-based" objects.
[As I write this, I realize that this view emphasizes a distinction
between administrative and structural metadata. Clearly, the metadata
about how material was digitized would be very different for audio and
video. The tool that handled things at the level of MIME-type would
also be different. However, the metadata that described the structure
of different digital versions and sequential segments would be
essentially the same.]
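
[A rough sketch may make the idea of a generic "time-based" class more
concrete. It is only an illustration under stated assumptions: the class
names, field names, MIME types, and locations below are all hypothetical
and come neither from the white paper nor from any LC system. The point is
that the same structural description (versions plus sequential segments)
serves both audio and video, with only the MIME type differing.]

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    """One sequential segment of a time-based object (hypothetical fields)."""
    label: str
    start_seconds: float
    end_seconds: float


@dataclass
class DigitalVersion:
    """One digital version of the object; only the MIME type is media-specific."""
    mime_type: str
    location: str


@dataclass
class TimeBasedObject:
    """Generic class intended to cover both audio and video (an assumption)."""
    title: str
    versions: List[DigitalVersion] = field(default_factory=list)
    segments: List[Segment] = field(default_factory=list)

    def segment_labels(self) -> List[str]:
        # The same "list the segments" behavior works for any time-based item.
        return [s.label for s in self.segments]

    def duration_seconds(self) -> float:
        # Taken as the latest segment end; 0.0 if no segments are recorded.
        return max((s.end_seconds for s in self.segments), default=0.0)


# Illustrative data only: an audio item and a video item share one structure.
speech = TimeBasedObject(
    title="Sample speech (audio)",
    versions=[DigitalVersion("audio/x-wav", "speech.wav")],
    segments=[Segment("Introduction", 0.0, 30.0), Segment("Address", 30.0, 300.0)],
)
film = TimeBasedObject(
    title="Sample film (video)",
    versions=[DigitalVersion("video/mpeg", "film.mpg")],
    segments=[Segment("Reel 1", 0.0, 600.0), Segment("Reel 2", 600.0, 1150.0)],
)
```

[Metadata about how each item was digitized would be kept separately and
would differ between the two; nothing in the structure above depends on it.]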
Image Capture
=============
Seems biased towards pictorial images and positive prints. Misses
distinction between reproduction of the artifact and extraction of the
information (e.g. legibility), which can lead in different directions for
textual manuscript materials. Should there be more discussion of when
bitonal might be appropriate vs. grayscale, or color vs. grayscale?
Perhaps worth emphasizing in general recommendations (in addition to all
the other excellent points) that scanners from different manufacturers may
have different color or tonality biases and other characteristics.
Scanner hardware choice can be important.
Although it is undeniable that detailed information about scanning and
about the manipulations used to generate derivative images is a good thing, any
implementation should allow for many elements to be optional (even if
strongly recommended).
Other detailed concerns or suggestions
======================================
LC has problems with establishing reverse linkages from content objects
to related descriptive metadata managed separately from the objects,
particularly to MARC records in the Library's catalog. This requires a
persistent identifier for the descriptive item, which hasn't seemed
feasible in LC's overall environment to date. LCCNs are not seen as
immutable by LC, particularly for the older materials where catalog
records may date from before the days of MARC. Materials where
item-level records have been created for American Memory but not
incorporated into the main Library catalog also pose problems here.
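
[To illustrate why a persistent identifier matters for the reverse
linkage, here is a minimal sketch. It assumes a simple lookup table that
maps a persistent identifier to whatever key the descriptive record
currently has. The resolver class, the identifier, and the record keys are
all hypothetical; this does not describe any existing LC service.]

```python
class RecordResolver:
    """Maps a persistent identifier to the current key of a descriptive record.

    Hypothetical sketch: the content object stores only the persistent
    identifier, so the record key can change without breaking the link.
    """

    def __init__(self):
        self._current_key = {}

    def register(self, persistent_id, record_key):
        # Record the key under which the descriptive record is found today.
        self._current_key[persistent_id] = record_key

    def reassign(self, persistent_id, new_record_key):
        # Called when the catalog changes the record's key.
        if persistent_id not in self._current_key:
            raise KeyError(persistent_id)
        self._current_key[persistent_id] = new_record_key

    def resolve(self, persistent_id):
        # Raises KeyError if the identifier was never registered.
        return self._current_key[persistent_id]


# Illustrative values only; neither the identifier nor the keys are real.
resolver = RecordResolver()
resolver.register("pid:demo-0001", "record-key-A")

# The content object carries the persistent identifier, not the record key.
content_object = {"title": "Sample item", "described_by": "pid:demo-0001"}

# The catalog later changes the record key; the object itself is untouched.
resolver.reassign("pid:demo-0001", "record-key-B")
current_record = resolver.resolve(content_object["described_by"])
```

[Without such an identifier, every content object that pointed at
"record-key-A" would have to be updated when the key changed.]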