1) Our major concern is that this model does not appear applicable to
archival collections. Throughout the document the authors refer to book
examples and TEI, but the kinds of materials we will be contributing to
this project neither look nor act like books. Archival materials are not
described or handled piece by piece but rather as aggregate collections.
Given the examples used in the White Paper, we question whether we are
using the right framework for archival collections.
2) This model doesn't follow user behavior. Archival patrons rely on
subject access to find relevant collections, parts of collections, and
items within collections, so we need to provide subject access in context,
which the White Paper does not address. Having a search mechanism point
a user to a specific date in a minute book or diary does not help them find
what the author wrote about on that date. And this level of page-by-page
subject analysis is far too labor-intensive to provide.
3) The item-level description required for each image is not scalable.
Encoding the metadata for each image is much more labor-intensive than
regular processing of archival collections. The White Paper does not
address how much time it will take to provide metadata for each image at
the funding level we are receiving. The funding provided is inadequate for
the amount of time it will take to process the collections at the proposed
level of detail, so the size of the testbed should be reduced to match the
amount of work required and the reduced level of funding. The project
seems too ambitious.
The testbed should be scaled back to a workable number of images, and the
project should instead concentrate on creating the architectural structure
of the metadata for the objects. Scanning in thousands of images doesn't
answer the question of whether this kind of digital archives (not library!)
is feasible given the nature of the collections.
4) Does software exist for the level of description expected, or will
it need to be developed?
5) There needs to be a shared template so everyone is providing the
same type of information. Is there a template in place? Has Berkeley
designed a template or will each repository have to design its own?
6) The intellectual piece of analysis needs to be addressed. At what
level will correspondence, minute books, and diaries, for example, be
described in the metadata? If there is to be a shared template, who will
have input into the design and level of analysis?
7) Assuming the standard is 24-bit for certain images, can the Internet as
it exists handle transferring files of this size? What assumptions are we
making about document delivery?
8) How will all five repositories reach agreement on the metadata for
each object? Will only those repositories contributing photographs discuss
and agree on the depth of description that will be used for these images,
or will all repositories discuss and come to an agreement on every object?
9) Do we understand correctly that the images will reside locally but the
metadata and search engine will be at Berkeley? In that case, is there a
search engine sophisticated enough to do what is proposed in the White Paper?
10) Scanned images of handwritten documents are not searchable. Using OCR
increases the cost, and technology that can accurately read handwriting
does not yet exist. Short of rekeying the text of handwritten documents,
how will the search engine seek and find the objects in context?
Since three of the five participating institutions are on the East Coast,
it might be cost-effective to hold the summer meeting in New York or
Pennsylvania. However, we might benefit from a visit to Berkeley to
observe the operations there.