[MOA2-WP:17] Re: Penn State comments on White Paper

Merrilee Proffitt (mproffit@library.berkeley.EDU)
Tue, 28 Apr 1998 16:59:58 -0700

Susan, thanks very much for kicking off the discussion. I will be
discussing these issues with Bernie and Howard and John, and we will
respond in detail. Others should feel free to join in the discussion, and
I look forward to comments from the rest of the participants.

As for your last comment, YES, we are looking into the possibility of
having a meeting on the East Coast--I think that will be important in
making group progress.

Merrilee

On Tue, 28 Apr 1998, Susan Hamburger wrote:

> On April 23, 1998, representatives from Penn State met to discuss the
> MoA2 White Paper. We have many concerns about issues either not raised in
> the White Paper or raised but not fully addressed.
>
> 1) Our major concern is that this model does not appear applicable to
> archival collections. Throughout the document the authors refer to book
> examples and TEI, but the kinds of materials we will be contributing to
> this project neither look nor act like books. Archival materials are not
> described or handled on an individual piece basis but rather as aggregate
> collections.
> We question if we are using the right framework for archival collections
> given the examples used in the White Paper.
>
> 2) This model doesn't follow user behavior. For archival patrons who rely
> on subject access to find relevant collections, parts of collections, and
> items within collections, we need to provide subject access in context,
> which isn't addressed in the White Paper. Having a search mechanism point
> a user to a specific date in a minute book or diary does not help them find
> what the author has written about on that date. And this level of
> page-by-page subject analysis is far too labor-intensive to provide.
>
> 3) The individual level of description required for each image is not
> scalable for a model. Encoding the metadata for each image is much more
> labor-intensive than regular processing of archival collections.
> The White Paper does not address how much time it will take to provide
> metadata for each image at the funding level we are receiving. The size of
> the testbed should be reduced for the amount of work required and the
> concomitant reduced level of funding. The funding provided is inadequate
> for the amount of time it will take to process the collections at the
> proposed level of detail. The project seems too ambitious.
> The testbed should be scaled back to a workable number of images and
> instead concentrate on creating the architectural structure of the metadata
> for the objects. Scanning in thousands of images doesn't answer the
> question of whether this kind of digital archives (not library!) is
> feasible given the nature of the collections.
>
> 4) Does the software exist for the level of description expected or will
> this need to be developed?
>
> 5) There needs to be a shared template so everyone is providing the
> same type of information. Is there a template in place? Has Berkeley
> designed a template or will each repository have to design its own?
>
> 6) The intellectual piece of analysis needs to be addressed. At what
> level will correspondence, minute books, and diaries, for example, be
> described in the metadata? If there is to be a shared template, who will
> have input into the design and level of analysis?
>
> 7) Assuming the standard is 24-bit for certain images, can the Internet as
> it exists handle importing this size of data? What assumptions are we
> making about document delivery?
>
> 8) How will all five repositories reach agreement on the metadata for
> each object? Will only those repositories contributing
> photographs discuss and agree on the depth of description that will be used
> for these images or will all repositories discuss and come to an agreement
> on every object?
>
> 9) Do we understand correctly that the images will reside locally but the
> metadata and search engine will be at Berkeley? In that case, is there a
> search engine sophisticated enough to do what is proposed in the White Paper?
>
> 10) Scanned images of handwritten documents are not searchable. Using OCR
> increases the cost, and technology that can accurately read handwriting
> does not yet exist. Short of rekeying the text of handwritten documents,
> how will the search engine seek and find the objects in context?
>
>
> Since three of the five participating institutions are on the East Coast,
> it might be cost-effective to have the summer meeting in New York or
> Pennsylvania. However, we might benefit from a visit to Berkeley to
> observe the operations there.