Wednesday, March 26, 2014

Digitizing photographs - End of Phase 1


           Well, we’ve almost finished the first 2000 photographs to be sent off to the University of North Texas for digitization. It wasn’t so bad with three of us on the project. What have I learned? First off, you need to edit and re-edit your work. Metadata and inventories are monotonous, so mistakes are almost inevitable; at least mine seem to be. If you’re working in Excel you have to be especially careful because, of course, there is no spell check as you type. That’s another reason to dislike Excel. Another is some of the automatic features. They drive me crazy. It’s fine for spreadsheets, but tough for inventories. Just because the first two words are the same doesn’t mean the rest of an entry is. I’ve done all of the inventories before this in Word. It has its problems too. Numbering and indenting are two of the most annoying. All of a sudden the numbers change and the indents won’t align. What’s up with that? I try to remember to disable the automatic features, but I don’t always. Trying to do an outline is often a nightmare. Even with that, I prefer Word to Excel for inventories.
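One way to catch some of these slips is to export the inventory from Excel as CSV and run a quick check over it. What follows is only a minimal sketch: the column names (`item_number`, `title`) are made up for illustration and are not our actual spreadsheet headings. It flags blank required cells and entries that differ only in capitalization or spacing, which are often typos.

```python
import csv
import io
from collections import defaultdict

REQUIRED = ["item_number", "title"]  # hypothetical column names


def check_inventory(csv_text):
    """Return a list of human-readable warnings for an inventory exported as CSV."""
    warnings = []
    variants = defaultdict(set)  # normalized title -> spellings actually seen
    rows = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(rows, start=2):  # row 1 is the header
        for col in REQUIRED:
            if not (row.get(col) or "").strip():
                warnings.append(f"row {line_no}: '{col}' is blank")
        title = (row.get("title") or "").strip()
        if title:
            # Lowercase and collapse runs of spaces so near-identical entries match.
            variants[" ".join(title.lower().split())].add(title)
    for spellings in variants.values():
        if len(spellings) > 1:
            warnings.append("inconsistent spellings: " + " | ".join(sorted(spellings)))
    return warnings
```

It will not catch a misspelled name, so it is no substitute for re-reading the work, but it does take some of the monotony out of the second pass.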

          What other things might I consider doing differently? Well, it would be nice to have an expert on the collection around to help us identify individuals. We did have a book on the history of the school, and that was great, and, of course, the internet is wonderful if you have enough of a name. One area where we needed more assistance was with dates. Most photographs were not dated, and many were not identified. We had to guess dates based on clothing, the absence of computers, and that sort of thing, or leave the date cell blank.

          The other thing that I might consider next time is to expand the metadata. We had no column for the type of photograph (black and white, color, slide, or whatever). Another category could have been photograph size: 8x10, 5x7, or the metric equivalent. That’s often helpful for a researcher. UNT may do that. I am not that familiar with their metadata schema or how much, if anything, they add to what is sent to them. I guess I need to find that out.
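To make the idea concrete, here is a rough sketch of what an expanded item record might look like. The field names and the list of photograph types are my own assumptions for illustration; they are not UNT's metadata schema, which I have not checked.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical controlled list of photograph types (an assumption, not UNT's).
PHOTO_TYPES = {"black and white", "color", "slide", "negative", "other"}


@dataclass
class PhotoRecord:
    item_number: str
    description: str
    date: Optional[str] = None          # None when the date is unknown
    date_is_estimate: bool = False      # True when guessed from clothing, etc.
    photo_type: Optional[str] = None
    size_inches: Optional[Tuple[float, float]] = None  # (width, height)

    def problems(self):
        """Return a list of issues found in the optional fields."""
        issues = []
        if self.photo_type is not None and self.photo_type not in PHOTO_TYPES:
            issues.append(f"unknown photo_type: {self.photo_type!r}")
        if self.size_inches is not None and any(d <= 0 for d in self.size_inches):
            issues.append("size must be positive")
        return issues

    def size_cm(self):
        """Convert the size to centimeters, rounded to one decimal place."""
        if self.size_inches is None:
            return None
        return tuple(round(d * 2.54, 1) for d in self.size_inches)
```

Storing the size once in inches and converting on demand avoids keeping two columns that can drift out of agreement.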

           The last problem I see is an ongoing one. I have mentioned this in other posts. How do you choose the photographs to digitize objectively? I still don’t have an answer for that one. Portraits are the most difficult. They are already online in the yearbooks, so if you have a date and name you can find someone’s photograph. Often the scanned yearbook or newspaper image is not of very high quality, so that may be one reason to digitize the portrait separately. I guess what concerns me most is that not every faculty and staff member at the university had photographs in the archives. Actually, coverage was pretty haphazard. For example, the seventies and the nineties were fairly well represented, but not the eighties or the earlier years. The university, or any creator of a collection, needs to impose more order on its collections to avoid this problem. Like I said before: identify your photographs, date them, and put them in some organizational scheme. You will make an archivist very happy.

        Off to a conference next week.

Sunday, March 23, 2014

Digitizing Photographs - Week 3


            Well, we are slowly making progress with item-level description (metadata) for our photograph collection. Yay!!! Our new approach of a mini-assembly line seems to be working. I’m still choosing photographs and re-housing as needed, and I started doing a little research on the individuals in the photographs. That’s making the metadata go quicker and allows my partners time to do more in-depth research on some of the photographs. That’s paid off because they have found name errors and were able to indicate the correct name (Jorge, not George, for example).

            The problem as I see it is the subjective nature of picking photographs that should be digitized. As I talked about before, we have criteria for choosing what to digitize, but whether that will meet the needs of researchers we have no way of knowing. In some respects we are doing an online exhibit that gives a sample of what the collection contains. At this point we have no way of determining the possible uses this collection might have: genealogists, alumni looking for friends, history researchers. Perhaps in the future we can devise some test of what photographs are used and why, the way a museum tests to see the value of an exhibit to its patrons.

            I do think one important consideration is the precision and accuracy of the search engine. At first we did not choose a photograph if it had been in the campus newspaper or yearbook. The exception was the older photographs from the twenties. We are rethinking that. Although the photographs may be up online, unless you know exactly what publication they are in and where in that publication, the search engine is not able to find them. For example, college catalogs have photographs of activities around campus. Rarely do these have the individuals identified. When we digitize the same photographs we can add names, dates, and locations as part of the metadata so the search engine can see them. An example is a group photo of a singing group. In the university catalog the group has no identifying information. Our metadata does. It’s something to consider if you have a similar situation. If it’s up online but a search engine can’t find it, that doesn’t help anyone.
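A toy illustration of why the metadata matters: a search can only match words that are actually attached to an image. This is a simplified sketch, not how any real search engine works, and the records and names in it are made up.

```python
def search(records, query):
    """Return the records whose metadata contains every word of the query."""
    words = query.lower().split()
    hits = []
    for rec in records:
        # Combine every metadata value into one lowercase string to search in.
        text = " ".join(str(value) for value in rec.values()).lower()
        if all(word in text for word in words):
            hits.append(rec)
    return hits


# Made-up records: the same image as it appears in a catalog (no caption)
# and as a separately digitized photograph with item-level metadata.
records = [
    {"source": "university catalog", "caption": ""},
    {
        "source": "photograph collection",
        "caption": "Singing group",
        "names": "Jane Doe; John Roe",
        "date": "1974",
        "location": "campus chapel",
    },
]

for hit in search(records, "jane doe"):
    print(hit["source"])
```

Only the record from the photograph collection comes back, because the catalog copy has nothing for the query to match.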

Sunday, March 9, 2014

Getting Ready for Digitization


           In my last blog post I talked about the initial processing of a large photograph collection to the file level. The goal, as I noted, was to provide the university with some intellectual control over the collections in their archive. That really is the point of processing, that and helping to preserve the physical material. After the project concluded, several collections were earmarked for digitization by the library if and when money became available. Well, money has become available, so we are starting to prepare the photograph collection for digitization.

             As I mentioned, there are at least 10,000 photographs processed to the file level, not the item level. That doesn’t count negatives or slides. I emphasize that because digitization requires metadata (information) about each photograph digitized. That means we must process every file to the item level, and we have only a few weeks to accomplish this. Now, I had spent the better part of a summer organizing the photographs to the file level and imposing some order. My goal was to make the photographs accessible to anyone at the university who might need photographs on a particular topic like university buildings or football or faculty. I wasn't processing for digitization. As I mentioned before, the photographs had arrived at the library thrown into boxes, some in manila folders, some in the envelopes from the printing company, but most just simply tossed together. I should note that a previous attempt had been made to identify the individuals in the photographs, but this had failed and those photographs were simply thrown into boxes for another move. Since the photographs were for the most part kept by the Public Relations Department, many had been used in publicity. Others had been published in the yearbook. The yearbooks and university newspaper are already digitized, so many of the photographs are already online. Of course, unless you search for every photograph there is really no way to know what is already online. Money is limited, so that wouldn’t work. What to do? Well, you compromise and do the best you can with what you have; at least that is what we are doing.


Some of the more organized boxes prior to file level processing

              Since there’s not money to digitize everything, we had to make decisions. Older photographs where the image could be identified were chosen because of their importance to the early history of the university. Even if the photograph was online in yearbooks or the campus newspaper, these early photographs will still be digitized. The rationale is that a digital image from the original photograph would be clearer than one from a yearbook picture. Attempts are being made not to digitize duplicate pictures, but this has proved difficult because the same picture may be in multiple files. Portraits of significant university presidents, for example, can be found in various files. One person working on a collection might catch duplication, but with multiple people helping it is impossible. I’m not sure how to avoid duplication given time and money constraints. This is one area we are still addressing, especially in terms of portraits. Even if you avoid duplicates, how many different portraits of one particular person do you need? If it was a faculty member there might be a portrait for every year they taught, and that may have been years.
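One partial aid, once the metadata exists, is to let the spreadsheet point out likely duplicates. The sketch below assumes each entry records a subject, a year, and the file it came from (hypothetical field names, with made-up sample entries). It only flags candidates that share a subject and year across different files; someone still has to look at the prints to see whether they are really the same picture.

```python
from collections import defaultdict


def possible_duplicates(entries):
    """Group entries by (subject, year) and return groups that span more than one file."""
    groups = defaultdict(list)
    for entry in entries:
        # Normalize case and spacing so small typing differences do not hide a match.
        key = (" ".join(entry["subject"].lower().split()), entry.get("year"))
        groups[key].append(entry)
    return [
        group for group in groups.values()
        if len({e["file"] for e in group}) > 1
    ]


# Made-up sample entries for illustration only.
entries = [
    {"item": "0001", "file": "Presidents", "subject": "President A", "year": 1925},
    {"item": "0412", "file": "Early Buildings", "subject": "president  a", "year": 1925},
    {"item": "0413", "file": "Faculty", "subject": "Professor B", "year": 1974},
]

for group in possible_duplicates(entries):
    print([e["item"] for e in group])
```

This does nothing for the harder question of how many different portraits of one person are enough, but it narrows down where to look.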

             At first we started with everyone taking a box. Each individual decided which photographs were to be digitized, numbered them, and provided the metadata. Progress was slow. Currently we are approaching the problem like an assembly line. One person, me, goes through each box, chooses the photographs to be digitized, numbers each item, and re-houses as necessary (most of the photographs are not in their own sleeves as they should be). The next person is in charge of entering the metadata and making the final decision on what gets digitized. Hopefully this will better address the duplication issue and allow the proper housing of the material. We’ll see how fast it goes. We have also decided to divide the collection in two, that is, not try to do it all at once. We only have money for 1500 photographs, and at last count we were near a thousand. Wish us luck.

Wednesday, March 5, 2014

Dealing with the Real World: Archival Processing Compromises



                        I’ve talked about original order and provenance before and noted that problems result when these rules are ignored. Once collections are divided, provenance can be lost. Once original order is ignored, the organizational scheme of the creator is lost. Basically, trying to recreate what was changed often takes too long and may cause more problems. You must compromise. For archivists the goal is to expedite processing to enable accessibility and gain intellectual control over the collection. Even if you could undo well-intentioned destruction of original order, budgetary constraints and time often impose limitations. The collection that I am working on is a case in point. It consists of photographs. The collection arrived at the library from the Public Relations Office and the Alumni Association, although it is not really clear who sent what. The original creator department is not known, as the records have been passed around and stored hither and yon throughout the university. So we don’t know who collected the photographs, and we don’t know the organizational structure that was used because that’s been lost over time. Some of the photographs are numbered, although exactly what the numbering means is not altogether clear. Some of the photographs arrived at the library when a building on campus was being renovated. The rest came from a retiring staff member’s office. Unfortunately she was a saver, but not particularly well organized. Boxes from her office had little to no organization, with photographs and unrelated papers mixed together.

An example of one of the smaller boxes of miscellaneous material

                        I spent the better part of a year trying to impose some order on the records and to find any underlying order that might still exist. The first step was to accept that the original order could not always be determined or restored. The next was for the library to make some decisions about what they would preserve and what they could not. As we talked about before, not everything should or can be saved. Our problems were exacerbated when the archives had to be moved again because the room where they were housed was needed as a classroom. Time was short. Did I mention that money was also limited, particularly for archival storage material? The best we could do was a few archival sleeves, but we did have archival folders and acid-free boxes. It was a start towards preservation and organization, but definitely a compromise.

                        The first step in dealing with a collection like this is to do an appraisal and come up with a processing plan. That required looking into each box and trying not to become too overwhelmed. No one had gone through the photographs to weed out those that were not particularly good. Most had no identification. Some had been used in previous publications or appear to have been. Some were professional photographs. Some were simply candids, some good and some bad. Did I mention that there are over 10,000 photographs, not counting negatives and slides? As part of the plan that was developed, the library staff made some decisions about what to keep and what to discard. For example, yearbook photographs were already digitized so they were not kept, nor were poor photographs (out of focus, head chopped off, poor lighting, lots of pictures of unidentified Homecoming bonfires, that sort of thing). What organization could be determined was kept. For example, there was an entire box labeled “Social Clubs.” This became one of the sub-series under a series called “Students.” Examples of the series developed include Buildings, Athletics, and People. People had sub-series of Faculty, Board of Trustees, and so forth. Each series has files: early buildings or current buildings under the series “Buildings,” and football, baseball, and so on under the series “Athletics.” These series seemed to have been part of the underlying order as best we could determine.
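For anyone who thinks better in outlines than in paragraphs, the arrangement can be written down as a simple nested structure. This sketch uses only the series, sub-series, and file names mentioned above; the real finding aid has more of each, and the layout of the data is just one assumed way to record it.

```python
# Series -> sub-series -> files. A series with no sub-series uses the key None.
ARRANGEMENT = {
    "Buildings": {None: ["Early buildings", "Current buildings"]},
    "Athletics": {None: ["Football", "Baseball"]},
    "People": {"Faculty": [], "Board of Trustees": []},
    "Students": {"Social Clubs": []},
}


def file_paths(arrangement):
    """Flatten the hierarchy into 'Series / Sub-series / File' strings."""
    paths = []
    for series, subseries_map in arrangement.items():
        for subseries, files in subseries_map.items():
            prefix = [series] if subseries is None else [series, subseries]
            if not files:
                # No files listed yet, so record the sub-series itself.
                paths.append(" / ".join(prefix))
            for name in files:
                paths.append(" / ".join(prefix + [name]))
    return paths


for path in file_paths(ARRANGEMENT):
    print(path)
```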

At least the outside box was labeled even if the photographs weren't
               The best way to approach a mess like this is one box at a time. At least that’s what seemed to work for us. I think that is the approach to use when sorting through papers at home too. It’s easy to get overwhelmed. We had an emeritus professor help us with some identifications. He wasn’t emeritus enough to be able to identify everyone, but it helped. The next step is digitization, and that means item-level numbering and metadata (information) and, I hope, best-practice collection care. Oh help! Must remember: one box at a time.