5C | DIGITIZATION

TONI M. KISER

THE DIGITIZATION OF museum archives and objects draws on almost all aspects of registration and collections management. From the physical handling and preparation of materials to the catalog records, metadata storage, and the preservation files captured, the whole process encompasses a wide swath of the role of the registrar. Digitization will have a different face in different institutions. Access and outreach are often the main goals for digitization efforts, but certainly aspects of preservation, documentation, convenience, and condition play a role as well. Standards for digitization vary from one institution to another as do the methods of capture and the types of materials digitized. As with many aspects of technology, by the time an explanation is written and published, standards have changed or been upgraded, or the media itself has become outmoded. This chapter therefore provides general guidance on the digitization process and points to resources for further guidance rather than offering how-to instructions.

THE DIGITIZATION PLAN

The key part of any project, digitization or otherwise, is a plan. Plans for digitization projects can be enormous in scale such as that of a collaborative effort by Duke, North Carolina State, UNC-Chapel Hill, and North Carolina Central university libraries, which digitized more than 360,000 documents related to the Long Civil Rights Movement in North Carolina,1 or the American Museum of Natural History project that has scanned thousands of pages of scientists’ and collectors’ field notes.2 Or they can be as small in scale as a single photograph, done at the urgent behest of a curator or director. Most of us are going to find ourselves somewhere in the middle.

Goals

Setting the goals of the plan is key. Goals need to take into account the many users of the museum, its collection, the services provided, and the needs to be met. Administrators’ goals, curators’ goals, and those of education and collections departments will differ in some, if not many, ways. Goals often fall into two categories within digitization plans: those that are more quantitative and those that are more qualitative or strategic. Defining the goals and the related outcomes sets the tone for the plan overall. It is also important to keep in mind that the goals of today may not be the goals of tomorrow. How digitized images will be used in the future is unpredictable. The key is to keep the methodology open and restrictions low so that future use is not impeded. The metrics by which these goals will be evaluated should also be included in the plan. Upticks in website traffic or in research requests may serve as metrics, as may numbers of captured objects and rates of digitization.

In 2010 the Smithsonian Institution published “Creating a Digital Smithsonian: Digitization Strategic Plan,”3 outlining a pan-institutional set of strategic goals with clear objectives. Using this type of format, one with which many administrators and boards are comfortable, aids in making clear and thoughtful decisions about digitization plans. From the overarching strategic goals, the quantitative goals can (usually) be extracted.

Common Goals of Digitization:

  • Broaden access to collections through publication on the internet and intranet;
  • Lessen handling of objects and archives by having digital surrogates easily available;
  • Preserve obsolete media;
  • Meet the legal requirements related to the Native American Graves Protection and Repatriation Act (NAGPRA), estates, or state or local laws;
  • Strengthen visitor engagement; and
  • Speed research initiatives.

The key to keep in mind for the goals of the digitization plan is, of course, the mission and vision of the institution. The digitization plan may have a mission and vision unto itself, but aligning them with those of the institution helps secure buy-in from boards, administrators, staff, and funders. In some instances, museums may have the benefit of a collections survey or assessment from which to draw digitization goals. These assessments are helpful in putting together a plan but not always necessary. Registrars, collection managers, archivists, and curators often know which collections or objects are at risk, are frequently requested, or are a boon for researchers. This institutional knowledge will help shape and define the goals of a digitization plan.

Selection

Institutional knowledge of collection use or a full survey or assessment will lead the charge in the selection of objects, archival documents, media, and so on to be scanned. The Northeast Document Conservation Center (NEDCC) notes that, “at base, selection for digitization and preservation derives from the mission of the institution, and every institution should have a selection process in place to evaluate materials within that context and determine when digital conversion is most appropriate.”4 Developing selection criteria will help to further the selection process. Every institution will have criteria special to its own mission, collections, and digitization priorities. There are, however, a few general questions all digitization selection criteria should cover:

  • Should the object or document be digitized? Does the collection have intrinsic value that will be worth the cost and effort of digitization? Is there enough audience demand?
  • May the object or document be digitized? Are all the needed permissions and rights in place for digitization?
  • Can the object or document be digitized? Is its physical state sound enough to withstand the handling involved in digitization? Is the proper metadata in place? Does the museum have the capacity to care for the digital files that will be created?

Within these criteria, each institution will have to set the standards for what makes something a good candidate for digitization. A collection of oral histories recorded on cassette tapes may not make the cut for one museum but may for another if they are at the heart of that museum’s mission. Key to the selection is to be open, inclusive, and accountable.5 By allowing curators, educators, registrars, collection managers, users, and even administrators and board members a voice in the selection process an inclusive environment will be created. However, a small selection panel or team will likely be needed to keep the process moving and to make final decisions.

Setting Standards of Digitization

An integral part of a digitization plan is the standards by which various media will be digitized. The National Archives, the Smithsonian Institution, and the Northeast Document Conservation Center along with many others have standards for capture as well as file types that are a great basis for any institution to use as a guide or model. The main components to keep in mind include the resolution of the capture (dpi for scans, bit rates for audiovisual material, ISO, etc.), the type of file to be saved (JPEG, TIFF, WAV), and how those files will be stored for both digital preservation and digital access (for more information on digital asset management see CHAPTER 4D, “Digital Asset Management Systems”).
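As a rough illustration of how capture resolution drives storage needs, the following Python sketch estimates the pixel dimensions and uncompressed size of a scan. The function name, the 8 x 10 inch photograph, and the 600 dpi, 24-bit color settings are assumptions chosen only for illustration; they are not recommended standards, and current specifications should be taken from the resources listed at the end of this chapter.

```python
def uncompressed_scan_size(width_in, height_in, dpi, bits_per_channel=8, channels=3):
    """Return (width_px, height_px, size_bytes) for an uncompressed scan.

    All parameters are illustrative assumptions, not a published standard.
    """
    if min(width_in, height_in, dpi, bits_per_channel, channels) <= 0:
        raise ValueError("all inputs must be positive")
    # Pixel dimensions are the physical size multiplied by the resolution.
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Total bits, rounded up to whole bytes.
    total_bits = width_px * height_px * channels * bits_per_channel
    return width_px, height_px, (total_bits + 7) // 8


# Hypothetical example: an 8 x 10 inch photograph at 600 dpi, 24-bit color.
w, h, size = uncompressed_scan_size(8, 10, 600)
print(w, h, round(size / 1_000_000, 1))  # 4800 6000 86.4 (megabytes)
```

Multiplying the per-item estimate by the number of items, and by the number of file versions kept, gives a first approximation of the storage a project may require. Compressed formats will be smaller than this uncompressed figure.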

Capture standards and file types are part of the changing landscape of technology, and published specifications are often outdated as soon as they are printed. The resources listed at the end of this chapter are places to begin current research on standards and file types. Capture standards will also need to include the use of ruler bars, color standards, and other aids to create true-to-life digital surrogates. As the use of three-dimensional (3-D) scanners and printers increases in the museum field, standards for the capture of 3-D objects in addition to, or in place of, photography continue to emerge. Time-based media works, contemporary artworks, and other audiovisual material age rapidly, and the upgrading and changing of file types will also need to be addressed (see CHAPTERS 3E, “Documenting Contemporary Art,” and 3G, “Managing Digital Art,” for more information).

Preservation master files (or archival master files), production master files, and access files also need to be addressed as part of the standards in the digitization plan. Preservation master files will be used to create the production files and the access files most users will need. Preservation masters are created at the highest capture specifications. Typically, this serves several purposes, including satisfying longer-term preservation needs and providing a surrogate if the original is destroyed or a file is corrupted.6 Preservation master files hold metadata about the images and items captured. The Online Computer Library Center (OCLC) created the “Final Report on Preservation Metadata for Digital Master Files” in 1998, and it remains a good standard for the metadata to capture.7 Preservation masters are meant to be held for the long term and maintained in a secure storage environment.

Requests from researchers, curators, or publishers for high-resolution files will be fulfilled using the production master files. These files will be used to create any derivative files needed and may be edited in various ways (e.g., stitching together a large map or a panoramic photograph). Access copies are lower resolution but still of good quality and often are used in the online publication of digitized material. The digitization plan will need to outline each of these file types, the standards needed, the long-term preservation plan, and the who, how, and when of access to each.

File naming is another key element of setting standards for the digitization plan. From 3-D objects to moving images, each material type will need a standard way of naming each file. This aids in the organization of digital files, the retrieval of those files, and the association of those files with the object or image captured. Accession numbers or catalog numbers, being the most common unique identifiers for museum collections, will often be incorporated into the file name. However, in large rapid image capture digitization projects a nondescriptive identifier, such as a numerical string, may be used as the file name. The National Archives offers advice on establishing a file naming scheme.8
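To make the idea of a file-naming convention concrete, here is a minimal Python sketch of one possible scheme. Everything in it is hypothetical: the pattern (identifier, three-digit view number, role code, extension), the role codes, and the function names are invented for illustration and are not taken from the National Archives guidance or any other standard.

```python
import re

# Hypothetical role codes for the three file tiers described above.
ROLE_CODES = {"preservation": "pm", "production": "prod", "access": "acc"}


def build_file_name(accession_number, view, role, extension):
    """Build a name such as 2019-12-34_001_pm.tif from an accession number."""
    # Replace periods, slashes, spaces, and other punctuation with hyphens
    # so the identifier is safe to use on most file systems.
    safe_id = re.sub(r"[^A-Za-z0-9]+", "-", accession_number.strip()).strip("-")
    if not safe_id:
        raise ValueError("accession number has no usable characters")
    ext = extension.lower().lstrip(".")
    return f"{safe_id}_{view:03d}_{ROLE_CODES[role]}.{ext}"


def sequential_file_name(counter, extension):
    """Nondescriptive alternative for rapid capture: a zero-padded number."""
    return f"{counter:08d}.{extension.lower().lstrip('.')}"


print(build_file_name("2019.12.34", 1, "preservation", "TIF"))  # 2019-12-34_001_pm.tif
print(sequential_file_name(42, "tif"))                          # 00000042.tif
```

One design caution with a scheme like this: replacing punctuation with hyphens can make two different accession numbers produce the same file name, so an institution adopting something similar would want to check its own numbering system for collisions. With sequential names, a separate record linking each number to its object is essential.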

Timeline

“How long is all this scanning going to take?” is not an easy question to answer. The digitization plan itself will set out overall goals and timelines but may be broken up into phases or discrete projects. The digitization plan provides the overview from which smaller digitization projects and phases will be developed. A good thing to remember is that any digitization plan, much like a collection management policy, needs to be regularly reviewed because both standards and priorities change.

However, much of “How long is this going to take?!” will depend on three major factors: funding, staff time, and tools and equipment. The digitization plan will need to account for all of these factors in its timelines. Someone once told me, “There are three things you can have: good, fast, and cheap. But you can only have two of the three.” These factors come together in much the same way. If you want it cheap and fast, then it will not be very good. If you want it good and cheap, then it will not be very fast. If you want it good and fast, then it will not be cheap. The digitization plan will need to account for each of these aspects and note that various media types will require different levels of funding, staff time, and equipment.

Funding

Determining the cost of digitization may seem like grasping at straws at times. Pinning down equipment, obsolescence, staff time, and storage of the captured images can all quickly feel overwhelming. The Digitization Cost Calculator from the Council on Library and Information Resources (CLIR) can be helpful in guiding how much a particular project will cost.9 Grants, directed gifts, general operating funds, or a combination of all will need to be dedicated to digitization projects. Stanford University Libraries states that “digitization costs for a particular project will vary greatly, depending on many factors such as the volume of content to be digitized as well as the nature of the content and the intended use.… Content format and fragility may affect handling requirements as well as the type of digitization equipment that can be applied, resulting in significant differences in terms of throughput from one project to the next.”10

Tools and Equipment

Flatbed scanners, overhead scanners, cameras, lights, tape decks, computers, software… the list can seem endless. Each project will have a set of equipment and tools needed for the material type being digitized. Archival materials such as documents, photographs, or maps may all be done on a flatbed scanner, but books, scrapbooks, and other bound material will require an overhead scanner. Three-dimensional objects are increasingly being scanned as well, sometimes using a flatbed scanner (e.g., for herbarium sheets11) but more often photographed with digital SLR cameras, studio lights, and backdrops. Microfilm, cassette tapes, VHS, and other audiovisual materials require specialized equipment to capture high-resolution images or recordings. Most daily-use desktop scanners or analog-to-digital media converters are not able to provide the high resolutions needed to capture objects in a digitization project.

Each type of equipment will have its own software and hardware requirements. Other tools such as book supports, color bars, rulers, mannequins, and supports for 3-D objects must be acquired for the digitization project. Establishing a digital lab where tools and equipment specific to digitization are set up is ideal. The digital lab can be treated as an extension of the storage area to meet proper heating, ventilation, and air-conditioning (HVAC), security, fire suppression, and other established guidelines for collections care. The design of a digital lab will vary by institutional space allocated, the materials being digitized, and the equipment needed. The “Library of Congress Digital Scholars Lab Pilot Project Report” provides recommendations for digital lab design.12

The digitization plan must account for the inevitable obsolescence of equipment, particularly computers and software. Although discrete projects within an overall plan may not take decades, large-scale and long-term digitization plans need to account for those years when equipment and other tools will become outdated. This will closely tie into funding and overall cost projections for digitization.

Capacity of Staff

It is unfortunate that staffing, one of the most crucial elements of a digitization plan, is often overlooked. Often, there is no funding or thought given to providing staff just for digitization; rather, institutions will simply add it to the duties of registrars, collection managers, curators, and others in the collections or curatorial departments. Because these staff members are often already on project overload, additional duties related to digitization can be overwhelming, particularly for smaller staffs and museums. Collection preparation, data preparation, physically retrieving and manipulating objects, tracking movements, tracking workflows, and then the return of objects, scans, and data are all time-consuming, meticulous tasks. Registrars and collection managers will need to be involved in all these steps, even if they are not directly tasked with the capturing of images. Dedicated staff for capturing images is ideal but may not be practical for all institutions. Training is key and must be based on the type of material to be imaged. Proper handling and support, as well as the safety of the objects and documents, are paramount. Registrars and collection managers are vital to providing the rules and guidance for this part of the process.

The rate of capture determines much of the trifecta of funding, equipment, and staff. Whether to capture in house or to outsource the capture process is a major consideration. Rates of capture vary from object type to object type, as do the speeds at which equipment can capture a particular item (e.g., imaging works on paper on a flatbed scanner is faster than using a camera to image frogs preserved in alcohol). The digitization plan should factor in several aspects when deciding to capture in house or to outsource. Considerations include the size of the particular project, how complicated the objects are to capture, equipment needed, and experience within the organization. The Sustainable Heritage Network outlines the opportunities and challenges for each and also offers additional resources when considering whether to keep projects in house or outsource them.13
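The relationship between rate of capture and timeline can be sketched with simple arithmetic. The Python example below is only an illustration: the function name, the capture rates, the four hours of capture time per day, and the allowance for recaptures are all invented figures, and real rates should come from a timed pilot with the institution's own materials and equipment.

```python
import math


def estimate_working_days(item_count, items_per_hour, capture_hours_per_day,
                          rework_fraction=0.0):
    """Estimate the working days needed to capture a group of items.

    rework_fraction is an assumed share of items that must be recaptured
    after quality checks (0.1 means 10 percent).
    """
    if items_per_hour <= 0 or capture_hours_per_day <= 0:
        raise ValueError("rates and hours must be positive")
    total_captures = item_count * (1 + rework_fraction)
    hours = total_captures / items_per_hour
    # Round up: a partly used day is still a day on the schedule.
    return math.ceil(hours / capture_hours_per_day)


# Hypothetical comparison: 5,000 works on paper on a flatbed scanner
# versus 5,000 fluid-preserved specimens photographed with a camera.
print(estimate_working_days(5000, 40, 4))  # 32
print(estimate_working_days(5000, 8, 4))   # 157
```

The same count of items produces very different schedules depending on the material type, which is why rates of capture feed directly into funding and staffing projections. The estimate covers capture time only; preparation, data entry, and the return of objects to storage would need to be added.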

COLLECTION PREPARATION

Whether in house or outsourced, the objects and documents to be digitized will all have to be properly prepared for capture. In the case of paper materials this can involve the removal of fasteners, the flattening of curled photographs, or full conservation treatments to stabilize a book or manuscript. As part of the physical preparation process notes to scanners can be made that include handling instructions or draw attention to conditions that may alter the normal scanning process. In some cases, digitization projects will allow collections staff an opportunity to improve and upgrade the storage and housing of collections.14 3-D objects may also need conservation or mounts created to support them while being photographed. Similar preparations for natural history materials, audiovisual, microfilm, and other formats will need to be made.

DATA PREPARATION

Hand in hand with the physical preparation will be the evaluation and preparation of the data associated with the objects. Most museums will leverage the use of data from their collection management system to analyze the data they have about their collections. Cataloging may not have been consistent, or the use of particular terms or fields may have shifted as cataloging standards within an institution changed. The digitization plan should establish the data standards for each material type. How the data will be used, displayed, and searched should be considered as part of the standards. In some cases that data may get filled in after capture, before capture, or as part of the capture process.15 Again, material type, time, staffing and method of capture will be factors in determining how the process of data preparation is handled.
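One way to analyze exported catalog data before capture is to count missing values and look for inconsistent terms. The Python sketch below assumes records have been exported from a collection management system as simple dictionaries; the field names (accession_number, object_name, rights_status) and the function name are hypothetical and would need to be replaced with the fields an institution actually uses.

```python
from collections import Counter

# Hypothetical list of fields the plan's data standard treats as required.
REQUIRED_FIELDS = ("accession_number", "object_name", "rights_status")


def audit_records(records):
    """Return (missing_counts, inconsistent_names) for a list of record dicts."""
    missing = Counter()
    name_variants = {}
    for record in records:
        for field in REQUIRED_FIELDS:
            # Treat absent, None, and blank values as missing.
            if not str(record.get(field) or "").strip():
                missing[field] += 1
        name = str(record.get("object_name") or "").strip()
        if name:
            # Group spellings that differ only by letter case.
            name_variants.setdefault(name.casefold(), set()).add(name)
    inconsistent = {key: sorted(forms)
                    for key, forms in name_variants.items() if len(forms) > 1}
    return dict(missing), inconsistent


records = [
    {"accession_number": "2019.12.34", "object_name": "Photograph", "rights_status": "cleared"},
    {"accession_number": "2019.12.35", "object_name": "photograph", "rights_status": ""},
    {"accession_number": "", "object_name": "Map"},
]
print(audit_records(records))
```

A report like this only flags candidates for review. Deciding which term is correct, or whether a blank field can be filled in before, during, or after capture, remains a cataloging decision.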

THE DIGITIZATION WORKFLOW

The physical preparation and data preparation of the selected collections will have to be established in the overall workflow of the digitization plan. Depending on the project the workflow may vary, but there are a few key steps to keep in mind when it comes to digitization. First is the selection of collections or objects based on the criteria set in the plan. Second is the physical assessment of the objects to ensure that they can be safely digitized, with consideration given to any conservation treatments needed. Third is the consideration of any legal and ethical issues with digitization. Fourth is preparing the data that will accompany the objects through the digitization process.

The workflow will need to account for both the digital surrogate created and the object itself, once digitized. The return of physical objects to proper storage locations is one element. The data created during or after the capture process needs to be properly entered into the database. The images created need to be named according to the established file-naming conventions, and the preservation master files saved. Then, any postprocessing that is necessary, such as adding metadata to images, cropping, or color correcting, must take place.

Quality control (QC) or quality assurance (QA) should take place in all the steps of the workflow but will be especially crucial once the capture process begins. File names, the structure and order of the folders in which files are saved, and the quality of the images will all need to be checked regularly. Checks will happen more often at the beginning as new projects get started, but they can be done at a regular interval or as spot checks as projects progress.16
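Parts of this checking can be automated. The following Python sketch shows two such checks under stated assumptions: the file-name pattern is a hypothetical convention invented for this example, and the sampling fraction is arbitrary. Image quality itself still has to be judged by a person looking at the files.

```python
import random
import re

# Hypothetical convention: identifier_viewnumber_role.extension
NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+_\d{3}_(pm|prod|acc)\.(tif|jpg|wav)")


def nonconforming_names(file_names):
    """Return the file names that do not follow the naming convention."""
    return [name for name in file_names if not NAME_PATTERN.fullmatch(name)]


def spot_check_sample(file_names, fraction, seed=None):
    """Pick a random share of files for a person to inspect visually."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    names = sorted(file_names)
    if not names:
        return []
    sample_size = max(1, round(len(names) * fraction))
    # A fixed seed makes the sample repeatable, which helps with record keeping.
    return random.Random(seed).sample(names, sample_size)


batch = ["2019-12-34_001_pm.tif", "2019-12-35_001_pm.tif", "IMG_0042.JPG"]
print(nonconforming_names(batch))  # ['IMG_0042.JPG']
print(spot_check_sample(batch, 0.5, seed=1))
```

A project might check a large fraction of files at the start and lower the fraction as the workflow settles, which matches the pattern of frequent early checks followed by periodic spot checks.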

The nature of a particular project will dictate the workflow once digital files are deemed of sufficient quality. Incorporating the data into the collections management system (CMS) or a digital asset management system (DAMS), publication to the web, or sharing with other institutions or internal clients are all possible next steps. The digitization plan should account for the storage of preservation master files and for access to production masters and access files, and the workflow should follow those guidelines.

EVALUATION OF GOALS

The process of digitization often feeds on itself; making collections available through digital access may increase demand for more materials to be made accessible. The influx of more standardized data, images, and physical reprocessing (or first-time processing) of collections may increase expectations from administrators, researchers, curators, and registrars and collection managers. The goals set forth in the digitization plan need to be evaluated using both the quantitative and qualitative metrics which were established. Were the goals met? Were there impediments to reaching the goals? Over time the goals should be reevaluated to ensure that they still align with the mission and vision of the museum and the plan itself. The plan may start as one that is about access but may over time evolve into supporting scholarly research or licensing opportunities. The goals within the plan will need to be reevaluated with new metrics established as digitization projects move forward.

Risks to Collections

The risks to objects must be a factor when considering digitization, and the plan should address those risks and the benefits of taking the risks. The physical stability of objects is addressed previously, but there may be instances in which because the physical state is poor, digitization is deemed the best alternative because it creates a digital surrogate. In these instances, potential loss must be taken into account. Although meticulous tracking can be in place, there is always a risk that objects could be lost or damaged during the digitization process. This can be as simple as a photograph getting placed in the wrong folder or as serious as lost shipments if outsourcing material. Objects and documents may also be damaged by inadequate supports or staff not properly trained. The workflow, standards for care and handling, and a consistent approach to providing material for digitization should mitigate these factors.

Additional risks come from the nature of the equipment used to capture the object. Light from scanners, camera lights, or other sources needs to be taken into account. Although these light sources are more intense than ambient light, the duration of exposure is short, so the cumulative effect on materials handled in accordance with typical digitization standards should not be severe. Lights often create heat, however, which is another risk factor to consider. Equipment that has been running for long hours can heat the digital lab or the glass of a scanner bed. If an object is particularly sensitive to heat, this should be noted in the workflow so that care can be taken to capture an image of it before equipment or room temperatures have risen. Preparing Collections for Digitization by Anna E. Bulow and Jess Ahmon provides more information on risks during digitization and how to mitigate them.17

CONCLUSION

Is digitization ever done? Probably not, unfortunately. The commitment to digitization almost always lasts beyond the scope of the initial plan. Museums need to account not only for new materials coming in but also for the migration of data from one format to another or from one medium to another as standards and equipment change. The Northeast Document Conservation Center states that

Digital copies play an important preservation role as surrogates protecting fragile and valuable originals from handling while presenting their content to a vastly increased audience. A digital version may someday be the only record of an original object that deteriorates or is destroyed. But digitization is not preservation—it is simply a means of copying original materials. In creating a digital copy, the institution creates a new resource that will itself require preservation.18

GENERAL RESOURCES

Sustainable Heritage Network. Available at: http://www.sustainableheritagenetwork.org/system/files/atoms/file/2.1.06_digitizationprojectdecision-making_startingadigitizationproject_k….pdf

The Northeast Document Conservation Center. Available at: https://www.nedcc.org/free-resources/preservation-leaflets/overview.

Library of Congress. Available at: http://www.digitalpreservation.gov/.

Smithsonian Digitization Program Office. Available at: https://www.si.edu/newsdesk/digitization-program-office

Resources for Capture and File Types

The Society of American Archivists. Available at: https://www2.archivists.org/standards/external/123.

The Smithsonian Institution. Available at: https://siarchives.si.edu/what-we-do/digital-curation/digitizing-collections.

Library of Congress. Available at: http://www.digitalpreservation.gov/.

Association for Library Collections and Technical Services. Available at: http://www.ala.org/alcts/resources/preserv/minimum-digitization-capture-recommendations.

Lyrasis. Available at: https://www.lyrasis.org/Leadership/Documents/Advancing-3D-Digitization.pdf.

NOTES

1. Available at: https://blogs.library.duke.edu/bitstreams/2014/08/08/large-scale-digitization-lessons-ccc-project/.

2. Available at: https://www.biodiversitylibrary.org/creator/197368#/titles.

3. Available at: https://www.si.edu/content/pdf/about/2010_SI_Digitization_Plan.pdf.

4. Available at: https://www.nedcc.org/free-resources/preservation-leaflets/6.-reformatting/6.6-preservation-and-selection-for-digitization.

5. Anna E. Bulow and Jess Ahmon, Preparing Collections for Digitization (London: Facet Publishing, 2011), 59.

6. Available at: https://www.carli.illinois.edu/sites/files/digital_collections/documentation/guidelines_for_images.pdf.

7. Available at: https://www.oclc.org/research/activities/digpresmetadata/report.html.

8. Available at: https://www.archives.gov/files/preservation/technical/guidelines.pdf.

9. Available at: https://www.diglib.org/an-update-to-the-digitization-cost-calculator/.

10. Available at: https://library.stanford.edu/research/digitization-services/services/pricing.

11. Available at: https://digitalarch.org/blog/2016/2/18/plant-imaging-on-a-flatbed-scanner-creating-a-digital-herbarium.

12. Available at: http://digitalpreservation.gov/meetings/dcs16/DChudnov-MGallinger_LCLabReport.pdf.

13. Available at: https://www.sustainableheritagenetwork.org/system/files/atoms/file/1.20_OutsourcingvsInHouse.pdf.

14. Bulow and Ahmon, Preparing Collections for Digitization, 122.

15. Available at: http://www.sustainableheritagenetwork.org/system/files/atoms/file/2.1.06_digitizationprojectdecision-making_startingadigitizationproject_k….pdf.

16. Available at: http://www.sustainableheritagenetwork.org/system/files/atoms/file/2.1.06_digitizationprojectdecision-making_startingadigitizationproject_k….pdf.

17. Bulow and Ahmon, Preparing Collections for Digitization.

18. Available at: https://www.nedcc.org/free-resources/preservation-leaflets/6.-reformatting/6.6-preservation-and-selection-for-digitization.