4C | Computer Systems and Data Management

SUZANNE QUIGLEY (UPDATED AND EXPANDED BY CRISTINA LINCLAU)

COMPUTERS HAVE BECOME essential to the activities of preserving, interpreting, and accessing collections that are central to any museum’s mission. A collections management system (CMS) is any database or other software that supports museum workflows including tracking object locations, preparing exhibitions and loans, managing images and their rights, facilitating conservation, and constructing exhibition histories and provenance. Other museum departments use this centralized record system and the CMS serves as the foundation for public access to institutional collections online. To aid registrars and collection managers—who are often CMS system administrators—in planning for and getting the most out of the museum’s CMS, this section provides a comprehensive account of CMS selection, implementation, and maintenance, and briefly outlines structured data management.

COLLECTIONS MANAGEMENT SYSTEMS

System Requirements

A system consists of hardware, software, and the network on which they are used within the museum. Although these elements work in concert to automate many tasks that take hours to do manually, they are run by humans and must be carefully managed. Each museum must determine who will choose, manage, and maintain the CMS before it begins the process of implementation.

Choosing a system requires a thorough analysis of:

The information to be stored;
Who will enter the data;
How, why, and by whom the data will be used;
Present computer capabilities;
Existing data structures; and
Anticipated future growth of institutional needs.

Based on this analysis, the choice is among proprietary museum data management systems, professionally designed custom systems, and in-house designed or “open source” systems. Hardware should be selected only after the foregoing analysis and selection of software have been made to ensure that the hardware is appropriate for the type of data being stored, the database size, the operating system, and network capabilities. Conversely, existing hardware may limit the type of data management software that can be used.

Database Structure

A database is a collection of information in electronic format. The advantage of electronic collections data is that a computer can sort data more quickly, in many more ways, and with fewer errors than a person using a manual system can. A CMS refers to the database and the application software interface with which the user interacts.

The key to a useful electronic database is the method of data organization, that is, the database structure. Although there are a number of different types of database systems, the three most common are flat file, relational, and object-oriented. Flat file databases keep information in a single large table, whereas relational databases keep data in separate tables, each related to the others by means of a common field. The advantage of a relational database is that it creates discrete, easily parsed groupings of information that help with computation speed when searching, as well as with maintaining the integrity of data with controlled fields. The common fields that join tables to one another are referred to as identifiers or keys. Identifiers can be exposed to the general user as common reference points, such as an accession number, but more often, they are internal, ordinal numbering systems separate from common references. This allows for flexibility among the human-readable reference numbers—accession numbers can convey meaning to users (e.g., with textual prefixes such as D for deaccessioned objects; a loan number can represent the year in which the loan was initiated). An object-oriented database replaces the identifier-linked table structure of flat file and relational databases with a series of nested classes of objects that pass down attributes to subclasses. These flexible models are consistent with the languages that dominate object-oriented programming such as C++, JavaScript, and Python. Object-oriented databases use the same type of fields within the database and the application, removing the need for software to translate between the database fields and the output programming, which permits the system to run more smoothly overall. In practice, the dominance of relational database systems has given rise to hybrids—object-relational databases—which combine the extensible functionality of inheritance within a tabular system.
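The idea of classes passing attributes down to subclasses can be illustrated with a minimal Python sketch. The class names, fields, and sample values below are hypothetical and chosen only for illustration; they are not drawn from any particular CMS.

```python
from dataclasses import dataclass


@dataclass
class CollectionObject:
    """Base class: attributes that every object record shares."""
    accession_number: str
    title: str

    def label(self) -> str:
        # Defined once here, available to every subclass.
        return f"{self.accession_number}: {self.title}"


@dataclass
class Painting(CollectionObject):
    """Subclass: inherits the base attributes and adds its own."""
    medium: str = "unknown"


@dataclass
class Specimen(CollectionObject):
    """Another subclass with a different extra attribute."""
    taxon: str = "unidentified"


# Hypothetical sample records.
painting = Painting("1998.12.3", "Harbor at Dusk", medium="oil on canvas")
specimen = Specimen("2004.7.1", "Pressed fern", taxon="Polypodiopsida")

print(painting.label())
print(specimen.label())
```

Both subclasses reuse the accession number, title, and label logic from the parent class without repeating them, which is the inheritance behavior that object-relational systems bring into a tabular structure.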

Flat file databases are rarely used for today’s complex information management needs, though they are frequently used for simpler tasks. Relational and object-relational databases have many advantages over flat files. For example, a collection might contain a number of objects given by a single donor. Although there will be a separate record for each object in the collection table, there need be only one record for the donor in the donor table. The identifier associated with that donor links this single record to the record of every object in the collection table that the donor gave. If information about the donor changes, it needs to be altered in only one location in the database, thus saving work and ensuring accuracy. A cardinal rule of good database structure is that no piece of information should be entered more than once. The exception is the identifiers that link related tables together, which must be present in both tables to enable the link.
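The donor example can be sketched with Python’s built-in sqlite3 module. The table names, column names, and sample values are assumptions made for illustration; a real CMS will use its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the link
conn.executescript("""
CREATE TABLE donors (
    donor_id INTEGER PRIMARY KEY,   -- internal identifier (key)
    name     TEXT NOT NULL,
    address  TEXT
);
CREATE TABLE objects (
    object_id        INTEGER PRIMARY KEY,
    accession_number TEXT NOT NULL UNIQUE,  -- human-readable reference
    title            TEXT,
    donor_id         INTEGER REFERENCES donors(donor_id)
);
""")

# One donor record, two object records that point to it (hypothetical data).
conn.execute("INSERT INTO donors VALUES (1, 'A. Example', '12 Old Road')")
conn.executemany(
    "INSERT INTO objects (accession_number, title, donor_id) VALUES (?, ?, ?)",
    [("2001.5.1", "Teapot", 1), ("2001.5.2", "Sugar bowl", 1)],
)

# The donor moves: the address is changed in exactly one place.
conn.execute("UPDATE donors SET address = '48 New Street' WHERE donor_id = 1")

# Every object given by that donor now shows the new address.
rows = conn.execute("""
    SELECT o.accession_number, d.name, d.address
    FROM objects AS o
    JOIN donors AS d ON o.donor_id = d.donor_id
    ORDER BY o.accession_number
""").fetchall()
print(rows)
```

The only value repeated across the two tables is donor_id, the identifier that makes the link possible.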

Although the terminology and precise workings vary among different software products, most computerized database systems have four components: tables, queries, forms, and reports.

Tables hold stored data. Each table comprises records, which consist of sets of data about a particular type of item, such as objects or donors. Individual pieces of information that make up a record are placed in fields, which are the equivalent of the blanks that must be filled in on a paper form, or cells in a spreadsheet. Each field has a dedicated data type: some fields contain only text data, some only numerical data, and others only dates. Some fields display calculations from other fields. The CMS may require that certain fields be entered (such as accession numbers), but it may leave other fields to the discretion of the data-entry operator.
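A small sketch, again using sqlite3, shows typed fields, a required field, and a calculated value. The schema is hypothetical. Note that SQLite has no dedicated date type, so the date is stored here as ISO 8601 text, and the calculated field is modeled as a view rather than a stored column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (
    accession_number TEXT NOT NULL,  -- required text field
    title            TEXT,           -- optional text field
    height_cm        REAL,           -- numeric fields
    width_cm         REAL,
    accession_date   TEXT            -- date kept as ISO 8601 text
);
-- A calculated field, derived from two stored fields.
CREATE VIEW object_sizes AS
    SELECT accession_number, height_cm * width_cm AS area_cm2
    FROM objects;
""")

# A complete record (hypothetical data).
conn.execute(
    "INSERT INTO objects VALUES (?, ?, ?, ?, ?)",
    ("1987.4.2", "Sampler", 40.0, 30.0, "1987-06-15"),
)

# A record without the required accession number is rejected.
try:
    conn.execute("INSERT INTO objects (title) VALUES ('Untitled sketch')")
    missing_number_rejected = False
except sqlite3.IntegrityError:
    missing_number_rejected = True

area = conn.execute("SELECT area_cm2 FROM object_sizes").fetchone()[0]
print(missing_number_rejected, area)
```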

Queries, also known as searches, extract information from a database. Queries may be simple, such as a search for a record pertaining to an object with a specific accession number, or complex, such as a request for a list of all objects given by a particular donor in a given year whose value was greater than a certain amount.
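Both kinds of query can be expressed in SQL, the query language used by most relational systems. The sketch below assumes a simplified, hypothetical schema and invented sample data; the donor name, year, and value threshold are parameters supplied at search time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE donors (donor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE objects (
    accession_number TEXT PRIMARY KEY,
    donor_id  INTEGER REFERENCES donors(donor_id),
    gift_year INTEGER,
    value     REAL
);
INSERT INTO donors VALUES (1, 'A. Example'), (2, 'B. Sample');
INSERT INTO objects VALUES
    ('2001.5.1', 1, 2001, 500.0),
    ('2001.5.2', 1, 2001, 2500.0),
    ('2003.1.1', 1, 2003, 9000.0),
    ('2001.8.1', 2, 2001, 4000.0);
""")

# Simple query: one record, found by its accession number.
simple = conn.execute(
    "SELECT accession_number, gift_year, value FROM objects "
    "WHERE accession_number = ?",
    ("2001.5.1",),
).fetchone()

# Complex query: objects from one donor, in one year, above a given value.
complex_result = conn.execute(
    """
    SELECT o.accession_number
    FROM objects AS o
    JOIN donors AS d ON o.donor_id = d.donor_id
    WHERE d.name = ? AND o.gift_year = ? AND o.value > ?
    ORDER BY o.accession_number
    """,
    ("A. Example", 2001, 1000),
).fetchall()

print(simple)
print(complex_result)
```

Passing the search terms as parameters (the question marks) rather than pasting them into the SQL text keeps user input from being interpreted as part of the query.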

Forms generally are used for one of two purposes. The first is data entry: designed to resemble paper records, forms visually organize, simplify, and speed the data-entry process. Such forms can be designed to reference tables in the database to ensure that information placed in certain fields is valid and that it is correctly spelled and formatted. The second is retrieval: forms are frequently used to specify the searching or sorting criteria used by predefined queries selected from a menu.
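The validation a data-entry form performs can be sketched as a function that checks each entry against a reference list and a format rule. The list of object types and the year.lot.item accession number pattern are assumptions for illustration; each museum defines its own vocabulary and numbering.

```python
import re

# Hypothetical reference table of approved terms.
OBJECT_TYPES = {"painting", "print", "textile", "ceramic"}

# Assumed accession number format: year.lot or year.lot.item.
ACCESSION_PATTERN = re.compile(r"^\d{4}\.\d+(\.\d+)?$")


def validate_entry(entry: dict) -> list:
    """Return a list of problems found in a form entry (empty if valid)."""
    errors = []

    number = entry.get("accession_number", "").strip()
    if not ACCESSION_PATTERN.match(number):
        errors.append("accession_number must look like 2001.5.1")

    object_type = entry.get("object_type", "").strip().lower()
    if object_type not in OBJECT_TYPES:
        errors.append(f"object_type '{object_type}' is not in the approved list")

    return errors


good = validate_entry({"accession_number": "2001.5.1", "object_type": "Textile"})
bad = validate_entry({"accession_number": "2001-5", "object_type": "texile"})
print(good)
print(bad)
```

The first entry passes because the type is matched without regard to capitalization; the second is stopped by both the misformatted number and the misspelled term before it reaches the database.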

Reports are the means by which the results of a query are displayed or printed. Reports may be part of the software itself, may be designed by the user, or both, depending on the capabilities of the particular software. Queries, if properly structured, can use preset reports to display or print results. If a report is not used, a query usually results in a simple list that may be displayed, printed, or exported for editing.
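As a rough illustration of the difference between a raw query result and a report, the sketch below turns a simple list of rows into a comma-separated export with column headings. The function name and sample rows are hypothetical.

```python
import csv
import io


def build_report(rows, headers):
    """Format query results as CSV text with a heading row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


# A query result as it might come back from the database: a plain list.
query_result = [
    ("2001.5.1", "Teapot", "Gallery 3"),
    ("2001.5.2", "Sugar bowl", "Storage B"),
]

report = build_report(query_result, ["Accession number", "Title", "Location"])
print(report)
```

The exported text can be opened in a spreadsheet or word processor for further editing.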

Functionality

It is important for a CMS to be flexible. The system should allow the museum to begin with the data it already has and to expand as needs and opportunities arise. The sum of the information about any object comes from a variety of sources, such as registrars, curators, and conservators. Each should be responsible for the timeliness and accuracy of their contributions and should be able to enter information without transmitting it to a third party. The system should incorporate methods to ensure that only authorized persons are able to enter, edit, or view various pieces of information about an object. In all but the smallest museums, this generally means a networked system of computers, with the software and data residing on a central computer called a server.
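One common way to restrict who may view or edit particular pieces of information is a table of permissions by role. The roles, field names, and assignments below are invented for illustration; commercial systems generally provide their own security configuration.

```python
# Hypothetical permissions: which fields each role may view or edit.
PERMISSIONS = {
    "registrar": {
        "view": {"location", "valuation", "condition"},
        "edit": {"location", "valuation"},
    },
    "conservator": {
        "view": {"location", "condition"},
        "edit": {"condition"},
    },
    "researcher": {
        "view": {"location"},
        "edit": set(),
    },
}


def can(role: str, action: str, field: str) -> bool:
    """Return True if the role is allowed to perform the action on the field."""
    return field in PERMISSIONS.get(role, {}).get(action, set())


print(can("conservator", "edit", "condition"))   # allowed
print(can("conservator", "view", "valuation"))   # not allowed
print(can("visitor", "view", "location"))        # unknown role, not allowed
```

Anything not explicitly granted is refused, so a new role or field is closed by default until someone decides who should have access.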

Hardware Compatibility

Many proprietary CMS and commercial database systems run locally on Windows, Macintosh, or Linux-based operating systems. Individual workstations are commonly networked in one of two ways. One method is to have the CMS software installed on each computer and have those machines access the data by means of middleware that facilitates the delivery of content between the server and the workstations. Open database connectivity (ODBC) and Java database connectivity (JDBC) are common examples of such middleware.

Another method for enacting system connectivity is to use a virtual machine (VM), which means the CMS software is not installed on every workstation but is on a single server that users access by logging on to a remote application server (RAS). The RAS behaves like an application on the local computer, but appears as an embedded desktop, with the CMS software installed there. It can be an adjustment for users to learn how to work in a second desktop active as a simultaneous, nested environment on their workstation; care needs to be taken to export and save files on the user’s local machine so that they can be retrieved when the remote machine is not active on the user’s workstation. However, the advantages far outweigh the disadvantages of this learning curve. Not only can the remote machine provide more flexible access because it does not rely on local CMS software installation but, for the same reason, it is also much more efficient for system administrators to upgrade, maintain, and troubleshoot a single machine rather than deal with each workstation individually.

In addition to these options, browser-based environments are being developed for commercial collection management software. Browser-based environments enable database records to be accessed from any computer, replacing the RAS connection with a common browser application such as Internet Explorer, Chrome, or Firefox. These systems have multiple access points (local installations on individual workstations, VMs, or browser access) that do not dictate where the database is located. Some companies offer hosting agreements with licenses that permit access for museum staff to servers that are under the upkeep of the vendor, who then takes on many of the administrative duties that otherwise occupy dedicated staff time, such as overseeing system backups and upgrades.

Ideally, computers should be chosen only after the CMS has been selected. If this is not possible, the software must be selected so that it does not exceed the limitations of the hardware. It is important to select software that can be expanded in its capabilities as future hardware upgrades occur. Consulting a technology professional to ensure compatibility of the hardware and software is desirable, regardless of whether the selected software is a stand-alone system or a networked system. One of the advantages of browser-based commercial products is that they are relatively hardware agnostic, which considerably simplifies these considerations. Compatibility of hardware and software is especially critical if the software selected runs on a network with multiple computers accessing it; in this case it is important to consult a technology professional who has experience not only with the type of server and operating system selected but also with the networking requirements of the system. The consultant will be able to advise on infrastructure upgrades needed to implement the system and will be responsible for setting up the server running the software. In many cases the vendor of a proprietary CMS will install the software on the server and networked computers. Prior to selecting the software, it is necessary that the system administrator, the technology professional, and the vendor(s) discuss the requirements of the system and the infrastructure of the institution to determine whether the software is compatible with the existing framework, such as network wiring, electrical outlets, internet speeds, and physical space requirements. If not, the institution must determine whether it is possible to upgrade its infrastructure; if an upgrade is not feasible, a different system must be selected.

The computer and server hardware must be evaluated on the basis of several criteria. The hardware should be produced by a reputable manufacturer that stands behind its warranty and should be purchased from an established vendor. A number of suppliers claim that their products are equal to those of major manufacturers; this may be true, but unless the museum is a large organization (or part of one) with an information management department that can handle alternate arrangements should the producer suddenly go out of business, a computer produced by an established manufacturer and serviced by an established vendor is worth the difference in cost.

In general, the rule is to buy the fastest, most powerful computer affordable. Speed and power are especially important if electronic documents, images, video and audio files, or other multimedia will be used as part of the CMS. The computer must have an adequate amount of random access memory (RAM), sufficient hard disk or other storage space, and hardware for internal and external backups. The computer on which the data are stored—whether a server or a stand-alone desktop computer—should have a redundancy feature called a redundant array of inexpensive disks (RAID). A RAID consists of multiple hard drives that mirror the data. In the event one hard drive fails, the computer can retrieve data from one of the other disks in the RAID. The RAID is not a substitute for external backups, however, and should be used in conjunction with backup hardware. There are a number of different RAID configurations, and a technology professional can advise on which configuration is appropriate.

If the system is a stand-alone installation, the computer must have enough expansion slots to accommodate equipment additions. Connecting the computer to a network, scanner, other electronic image-capturing device, or a soundboard requires the installation of a card to handle the external connections. Some external storage hardware (such as CD-ROM, DVD-RW, tape drives, or external hard drives) could also be necessary, depending on whether the institution opts for cloud storage.

SOFTWARE SELECTION AND DEVELOPMENT

Software Selection

Before selecting software for a database system, the institution must decide whether it will purchase non-museum-specific commercial database software and contract with a computer programmer for configuration, employ a staff member to create or customize software, or buy a CMS from a commercial vendor. Current staffing, immediate and future budgets, long-term needs, and the stability of the museum must be considered when determining if a commercial CMS or a customized in-house database product is the right solution. Each solution has certain advantages and disadvantages, but most museums today purchase their CMS from a vendor and work with the vendor to tailor the product to the institution’s specific needs.

Commercial Databases

Commercial database software, such as SQL Server, Access, and FileMaker Pro, is economical, powerful, and flexible, and it can be configured for collections management purposes. Although this software is designed to be customized by the user for particular applications, a reasonable knowledge of the software is required to do so. Use of commercial databases can be an economical way to begin a CMS. However, if it is not carefully thought through and managed, this option can become the most expensive one and the most difficult to scale up and develop over the lifetime of the system.

One of the primary advantages of using an in-house designed system is the relatively small initial investment in equipment and software. If building a new system from scratch, the project can start out small, using the information already contained in paper records, and then expand as needs and opportunities arise. This approach is also flexible, allowing the system to be designed to fit a particular museum’s needs. However, there are major disadvantages to this approach. If the work is done by someone whose primary task is not collections management, that person’s time will be diverted from other duties. During the early stages of development, the amount of time required can be significant. A poorly designed system can waste time and money and in the end be worse than no CMS at all. If the person who designed it leaves the museum, the database may be impossible to maintain, especially if the software in question is dated or modified to such a degree that advice and information become difficult to obtain from outside sources.

An institution with a truly customized database system may be at a major disadvantage if it decides to share its information on the internet or directly with other museums or consortiums. Creating a system using commercial database software requires recognition of and adherence to established standards and practices, as well as a broad understanding of information management practices at other institutions.

A museum no longer needs a computer guru on its staff to produce an in-house system, but if there are no dedicated staff on hand, it is recommended that the museum contract and work with a software developer to design the software. If not contracting with a developer, it is necessary to provide the person who will establish the system with training in database design, testing, and documentation. Initially, the system should be kept simple. Only when one aspect is working well should another be added; it is not wise to try to incorporate every desirable feature or function all at once. The new system should be integrated into office routines slowly, preferably by starting with the processing of new transactions. Only after the new system is functioning satisfactorily and the staff are comfortable in its use is it appropriate to ingest the entire collection.

Commercial Collections Management Systems

A commercial CMS, fully adapted for cataloging and collections management use, offers a museum many advantages. The system often comes to the museum as a complete package and can be used after system installation and initial training are finished. Technical support is often available by telephone, e-mail, or on a website; training is usually available from the vendor; and upgrades come without museum staff spending time on software development. Established vendors use client feedback to focus development and make certain that needs are met. The museum can form user groups with other clients and share information, problems, and solutions.

A major drawback of the commercial CMS can be the amount of money that must be spent for initial implementation. The commitment to a yearly maintenance agreement—which usually is based on the number of user licenses (i.e., the number of users or workstations that can access the networked system simultaneously) a museum has purchased—is important. There may be concerns about the stability of the company that sold the program and questions about the continuance of software maintenance should the company cease to exist. There are free, open source options available that have the potential to significantly reduce startup and maintenance costs, but their drawback is that an open source CMS is not tailored to the institution’s needs by a vendor, so the institution must use staff or hire a consultant to make modifications. Nevertheless, robust online communities have sprung up around open source tools that make this low-cost option more attractive.

A museum must plan the configuration and implementation of a commercial system carefully. Because development is not done in house, it is possible to overlook the careful review of data as they are collected and used by various departments. A single staff member should serve as liaison between the software company and other museum users. A second staff member should learn the program administration thoroughly and be ready to step in if needed. The program administrator is often the registrar or a member of the registration department—whoever takes the position should know collections management processes and collection information needs. Ideally, the system administrator should be a staff member dedicated solely to information and database management, help-desk functions, metadata standard implementation and oversight, report design, and communication with vendors and IT professionals.

Most commercial CMS products have room for growth. Additional users can be enabled and new features may be added at the request of the museum or included as part of an upgrade. As custom changes may be expensive to implement, it is important to anticipate future needs and select a system that already has most features present at time of installation. The popularity of commercial systems indicates that they meet an extremely important need in the museum community. With the fast pace of developing technologies, it is logical that museums rely on specialists who can use new developments to improve functionality quickly. As such, proprietary CMS products have become the programs of choice in most museums.

Determining What Type of System to Select

Before software or hardware selections can be made, a comprehensive list of system requirements must be prepared. Aligning institutional requirements with available options requires some research and outreach. The Canadian Heritage Information Network (CHIN) has produced a set of resources to help museums assemble their requirements, including its Collections Management Systems Software Criteria Checklist, which outlines features commonly included in a commercial CMS (last updated in 2012), as well as a collection of current vendor profiles.¹ Above all, it is helpful to interview similar institutions about which features they find especially useful.

Request for Proposal

Whether a museum has decided to purchase a commercial CMS or to hire a programmer to develop an in-house system, a request for proposal (RFP) is the appropriate document to send to a small group of selected vendors. Look for vendors who have experience working in the institution’s discipline because educating the vendor about workflow will be much easier if the vendor has worked with similar institutions.

The RFP must provide the vendor with an exact, comprehensive, and clearly written outline of the product the museum needs. It must include a cover letter describing the museum and the project, a description of the existing computer environment, a functional requirement section, and instructions for submitting a proposal to the museum.

The bulk of the RFP will be the functional requirements section, which lists the desired characteristics of the CMS. It is advisable to divide this section into parts. For example, a section on system requirements will contain information about the network in place and whether the museum intends to continue to use it (if it does, the vendor’s system must be compatible with the network). The museum may request that the system have an application programming interface (API) for website, kiosk, or app development, be compatible with different operating systems, or accommodate different user groups with various levels of security. Questions in the RFP should be phrased so that they can be answered by the vendor with yes, no, or modify.

Further subdivisions of the functional requirements section should address specific issues for each of the modules the museum expects to use (e.g., object tracking, exhibition registration, accessioning, multimedia).

The other significant section of the RFP contains directions for the vendors’ responses:

The deadline for the museum to receive the proposal (usually four weeks from the date of the issuance of the RFP).
The format for the proposal (because consistency of format is important for museum staff to be able to compare proposals directly to each other).
Clear instructions as to what must be included, including license fees, annual maintenance costs for a specified number of users, training costs, accessibility of support staff, extra costs for data transfer, and the anticipated project timeline.

It is crucial that all vendors be treated equally. Due dates must be met and the proposal must be complete for the vendor to be eligible for consideration.

Note that a vendor is legally liable for answering the RFP truthfully because it serves as the basis for a future contract.

Evaluation

Evaluation of proposals and selection of a vendor is a surprisingly labor-intensive task, and it is important to let the vendors know of your selection as quickly as possible.

Upon receipt of each proposal, check to see if all the items requested in the RFP are present. As soon as the deadline has passed, organize the proposals so that they may be easily compared. Establish a rating system that everyone agrees upon and give each person on the selection committee a set amount of time to scrutinize, compare, and rate the responses. Narrow the selection to no more than three proposals and observe the selected systems in use. If possible, ask the vendor to preload some of the museum’s existing data for a demonstration. The vendor can send a representative to present an on-site demonstration, coordinate a demonstration at a professional conference, or offer a demonstration online, using a remote desktop connection or live webcast.
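One possible rating system is a weighted score, in which the committee agrees in advance on criteria and on how much each one counts. The criteria, weights, vendor names, and ratings below are entirely hypothetical; the sketch only shows the arithmetic and the narrowing to three proposals.

```python
# Hypothetical criteria and weights agreed on by the committee (sum to 1.0).
WEIGHTS = {"functionality": 0.4, "cost": 0.3, "support": 0.2, "ease_of_use": 0.1}


def weighted_score(ratings: dict) -> float:
    """Combine 1-5 ratings into a single score using the agreed weights."""
    return sum(WEIGHTS[criterion] * ratings[criterion] for criterion in WEIGHTS)


def shortlist(proposals: dict, limit: int = 3) -> list:
    """Return the names of the highest-scoring proposals, best first."""
    ranked = sorted(
        proposals,
        key=lambda name: weighted_score(proposals[name]),
        reverse=True,
    )
    return ranked[:limit]


# Invented ratings on a 1-5 scale, averaged across committee members.
proposals = {
    "Vendor A": {"functionality": 4, "cost": 3, "support": 5, "ease_of_use": 4},
    "Vendor B": {"functionality": 5, "cost": 2, "support": 3, "ease_of_use": 3},
    "Vendor C": {"functionality": 3, "cost": 5, "support": 4, "ease_of_use": 5},
    "Vendor D": {"functionality": 2, "cost": 4, "support": 2, "ease_of_use": 3},
}

finalists = shortlist(proposals)
print(finalists)
```

A score of this kind is an aid to discussion rather than a decision in itself; the demonstrations and reference checks still need to confirm what the numbers suggest.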

After the demonstrations, contact colleagues at similar institutions who have already installed each of the systems under consideration. Ask them for a frank and honest evaluation of what they like and do not like about the systems. Ask about the vendors’ responsiveness and any unanticipated costs they encountered during or after installation. If at all possible, the internal selection committee should visit other museum sites where the systems under consideration are installed and running.

The committee should then be able to reach consensus on the product.

Contract Negotiation

Once the vendor is selected, it is time to buy. Even if the system is for a single user, it is advisable to have a contract with the vendor outlining exactly what is included. The proposal provided by the vendor in response to the RFP is the cornerstone of the agreement and should be referenced in the contract. The contract need not be elaborate. Many vendors will provide a sample contract that both parties are free to amend, as long as they both agree on the amendments. The contract should set dates for acceptance of the system, a schedule of payment to the vendor, return policies, training and support agreements, data conversion, and requested modifications to the system. It may also state that the source code will be placed in escrow in case the company should dissolve at a later date, but there is usually an extra cost associated with this.

IMPLEMENTING A COLLECTIONS MANAGEMENT SYSTEM

Implementation

The who, what, where, when, and how of a data management system come together in a concrete fashion during the implementation process. The adequacies of the chosen system, hardware, software, and site must be tested during the initial implementation process. The objective of this phase is to identify and correct defects so that the ultimate goals of efficiency and productivity can be achieved.

Site Preparation

Site preparation takes into account the safety and efficiency of all aspects of the system—hardware, software, and the people using the equipment. Sites can take many forms, each of which requires a different type of preparation. Sites can be centralized or dispersed, independent or connected, and permanent or mobile.

Centralized versus Dispersed

Centralized sites avoid the need for extensive networking and may allow for the design of a specialized computer room. Such a room should have requirements for temperature, relative humidity, and dust filtration similar to the collections storage areas and should provide maximum efficiency of computer use: technical support easily available, comfortable and safe working conditions, and strong security.

Dispersed sites enable customized or specialized workstations that fit the needs of particular museum users. For example, registrars can enter acquisition information, loan inventories, and so on, from their workspace without having to take materials into a central area with a higher level of traffic flow. Likewise, conservators can set up a workstation to use when doing condition and treatment reports, or secure workstations can be set up for visiting researchers in areas that will not be in the way of other museum functions.

Independent versus Connected

Whether sites are centralized or dispersed, connectivity may or may not be desirable. Connected sites permit easy communication, standardization, and control, but these networked sites almost always require a system administrator, increased security, higher costs, and modifications to architectural infrastructure. Wiring a museum for networking can be a problem; the construction of many of the old buildings that house museums is so solid that passing cables through walls, ceilings, or floors may not be possible. In these cases, wireless networks may be the solution. Thinking ahead and wiring during times of remodeling or new construction will save money and headaches in the future.

Permanent versus Mobile

The mobility of a computer site also is important to consider. Permanent workstations usually provide greater computing power, full-sized keyboards, and larger monitors. The ability to move computers, however, is advantageous in many situations. Computers can be taken into collection areas for direct data entry during inventory, condition checks, or environmental monitoring. Having the means to take digitized records or images to meetings, libraries, and off-site locations is convenient and can solve potential access and security problems. Mobile computing, however, has its drawbacks. Theft becomes a greater issue when using small, portable computing equipment and, if networking is required, using a wireless system becomes the only practical solution.

Whatever the configuration of computing sites, the computers and network equipment should be assembled and tested prior to installation. If possible, the data servers should be kept in a secure area where they are accessible to authorized personnel but protected from vandalism and theft. Backup tapes, hard drives, and other portable data storage units should be stored in a safe and secure place, preferably off-site.

Like other machinery, computer hardware should not be exposed to extreme changes in temperature and humidity and should be protected from particulates (dust, crumbs, etc.), liquids (leaky pipes, spilled coffee, etc.), and other debris. Adequate and stable energy sources and sufficient access to them should be provided to avoid data loss and undue stress on the computer hardware. Like other electrical equipment, computer hardware should be placed in a low-static area away from magnetic fields. Wires and cables should be installed so that people do not walk or trip on them, equipment does not roll over them, and animals do not chew them. Cables and the backs of computers should still be easily accessible. System disks (if any), warranties, and other legal documentation should be stored in a place that is secure but accessible when needed.

Many work-related injuries are caused by inadequate equipment and poorly designed work areas. Chairs and desktops should be ergonomically designed and organized to minimize injuries from typing at a keyboard, sitting, and viewing a monitor. Light sources should be as low as possible and focused away from computer monitors to help reduce eyestrain.

The goal of an efficient workspace is data integrity. Ideally, the workspace will have a large enough space to organize necessary paperwork and support documentation without jeopardizing the quality of data entry or the safety of equipment and objects. If accessioned objects are to be handled at the computer workstation for cataloging, condition and treatment reports, and so on, the work space must be arranged so that the risks of knocking objects over, placing documents or other objects on top of them, or losing pieces altogether are eliminated.

Schedule

The schedule for implementing a CMS can be divided into two stages: installation and execution. The actual time needed to achieve a functioning system will depend on the hardware and software planning and testing that was completed prior to implementation. The implementation process begins when a museum undertakes a needs assessment and concludes that change is needed and can be achieved. Whether a museum chooses an in-house solution or purchases a commercial CMS, a similar implementation process follows: analysis of the institution’s needs, assessment of available resources, evaluation and selection of a system, followed by purchase, installation, and testing of the system.

A certain amount of time will be lost when switching between old and new systems. At a minimum, conversion to the new system will involve training existing employees to use it. If files or data are imported from another system they must be converted for use in the new system, which may be time-consuming. It is necessary to examine the information structure of the new system and clean up or reorganize existing data to correspond to it, especially if data entry in the past has been inconsistent or incomplete. Further interruption in use of the file-management system may involve hiring and training new employees to transfer data manually (initial entry from manual card files, cleanup and tweaking of existing data, etc.) and training existing employees on future applications.

Good system design results from reverse engineering. Starting with the needed results will naturally lead to appropriate software and, in turn, hardware. Some museums start a new system without knowing what they will or will not be able to do with it, which usually results in further downtime, general frustration, and low morale. How these factors affect individual institutions depends on the size, type of system, and type and frequency of use of that system. Ideally, scheduling should be structured enough to be completed at a time that is appropriate for the museum yet flexible enough to allow for unpreventable or unexpected setbacks.

Testing

Testing should be done in stages. Initial testing is best done by those designing or installing the system. Secondary testing is essential and should be conducted by those who are the day-to-day users of the system. Small, workable amounts of actual data should be used for tests. All desired operations of the system should be tested, as should the vendor’s claims about capacity and performance. Although this may not prevent software errors that result from using larger amounts of data, more files, or different configurations of data, it is both an initial, short-term way of testing a system before becoming fully engulfed in its implementation and a means of familiarizing staff with the software interface.

Data Management

Data are both resources and assets. The data preserved about an object give that object its meaning and value. A digitized system opens up new avenues of access to the information recorded about an object, but the functionality of any information management system rests on data entered into it. Poorly managed information in a manual system will continue to be poorly managed information when moved into a database. The expected benefits of digital recordkeeping—especially the sophisticated searching, instant retrieval of related information, and statistical reporting—will not be met if data are inconsistently entered and controlled.

In the process of choosing a system, a museum should review its requirements and the intended uses of the system. The collections database manager should undertake an inventory of existing documentation systems and standards to determine which departments or individuals collect and record data and how they do so. This not only will identify data sources within the museum, it also will help determine what data should be used in a new software system and who will provide or enter it. This will clarify how the museum anticipates a new database system will streamline existing workflow and increase capabilities by systematizing new relationships between people and processes. The institution must establish a shared understanding of what it expects of a new system before it is installed.

When the database consists of information entered by staff across multiple departments, each department must be responsible for the accuracy and consistency of the information that it enters and understand the origins of data that it is not directly responsible for. All users need a clear idea of precisely where in the system they enter their data and how their contributions relate to other information and other users. If there are many sources for the data it is a good idea to assign oversight responsibilities to one person—often a registrar, a collections information manager, or the database administrator—who can review new entries and edits for accuracy and consistency.

Data entry and upkeep are expensive processes, and the resulting database provides the facts and context that validate the value of the object(s) described therein. Much like caring for objects, the process of caring for data can be complex. Unlike many physical objects, electronic data often can be re-created and rebuilt, but the resulting cost in staff time can have an impact on other activities in the museum.

Although it is useful to access all the available information about an object in a CMS, there may be resource constraints that dictate how robust electronic records can be and how much data entry and maintenance is realistic for a given institution. At the time of the aforementioned inventory of existing information systems, the staff overseeing the database should work with curators, conservators, and registrars to decide what kinds of information will remain in analog systems with pointers from the CMS. Previous object wall texts, catalog entries, confidential donor information, bibliographic references, and detailed provenance information are examples of data that may be identified as nonessential for the database, but it is important that each external resource be acknowledged in the object record (e.g., “See Curatorial File for Provenance”).

It is important to define the minimum level of completeness a record must have to meet institutional requirements for physical, intellectual, and administrative control of the object that the record represents. These three aspects of collections information management (physical, intellectual, and administrative control) broadly reference the museum’s custodianship of its collection by tracking activities pertaining to location and preservation, interpretation and provenance, and institutional processes, respectively.

All information that goes into a CMS should be dictated by the museum’s needs and not by the computer system. It is not necessary to enter data just because there is a field that will accommodate it. If possible, the system administrator may wish to hide unused fields to prevent users from entering information into them. Doing so will help ensure consistent locations of specific information. A realistic approach to the staff time and resources available checks the urge to catalog every field. It is also the collection information manager’s responsibility to assure that the superficial layout of the program interface does not dictate the cataloging practices of the museum. Instead, cataloging should align with the table schema that supports the interface. If possible, the system administrator ought to customize the interface and reporting so that the underlying data structure is sound. This way, future design changes, software updates, or migrations to a new CMS will not be compromised by cataloging workarounds made solely to keep information front and center in a software vendor’s default design settings. This concern is a moot point in home-grown CMS design, but it is worth emphasizing that the temptation for cataloging staff and researchers to see the information they consider important needs to be balanced with the integrity of the schema.

Review, compare, and select documentation and data standards that will ensure that users enter correct and complete object information. In certain institutions it may not be possible to find a perfect match, so modifying or combining data standards to fit the museum’s needs will be necessary. It is usually possible to set required fields so that new records cannot be saved if they do not conform to the adopted documentation standard. Examples of such information include accession number and object location. If the date, medium, or maker information is not available at the time of record entry, it is important to eliminate future confusion by noting whether that information was omitted for administrative reasons (and ought to be obtained and entered at a later date) or if the label copy date should read “n.d.” (no date), for example, implying that the matter is settled. Using a field value that consistently identifies pending label copy data has the advantage of simplifying searches on incomplete records.
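The distinction between required fields, pending information, and settled values such as “n.d.” can be sketched in a few lines of Python. This is a minimal illustration only: the field names, the `[pending]` marker, and both function names are assumptions made for the example, not features of any particular CMS.

```python
# Hypothetical marker values; a museum would define its own in its data dictionary.
PENDING = "[pending]"  # omitted for administrative reasons, to be supplied later
NO_DATE = "n.d."       # settled: the label copy should read "no date"

REQUIRED_FIELDS = ("accession_number", "location")  # the examples named in the text
LABEL_FIELDS = ("date", "medium", "maker")          # label copy fields that may be pending


def validate_record(record):
    """Return a list of problems that should prevent the record from being saved."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not str(record.get(name, "")).strip():
            problems.append(f"missing required field: {name}")
    for name in LABEL_FIELDS:
        if not str(record.get(name, "")).strip():
            # A blank is ambiguous, so ask for an explicit pending or settled value.
            problems.append(f"{name} is blank: enter a value or {PENDING}")
    return problems


def pending_fields(record):
    """List the label copy fields still marked as pending, to simplify follow-up searches."""
    return [name for name in LABEL_FIELDS if record.get(name) == PENDING]
```

Because every incomplete field carries the same marker, a single query for that marker retrieves all records awaiting label copy data, whereas blank fields would leave it unclear whether the question had ever been considered.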

Data Standards

Data standards focus on how information is structured and entered into a collections management or cataloging system (manual or automated) and how that system maintains the information and provides a framework through which the information may be retrieved and manipulated. Data standards are concerned with three elements—structure, content, and value.

Data structure standards provide guidelines for the structure of a documentation system—what constitutes a record, what fields or categories of information are considered essential information, what fields are optional, and how those fields relate to one another. Data structure standards determine how much and what kind of documentation will meet the organization’s criteria for security, accountability, and access to the object. For example, the CIDOC Conceptual Reference Model² (CRM) from the ICOM International Committee for Documentation’s Documentation Standards Working Group and the Europeana Data Model³ (EDM) are formal, semantic ontologies for structuring cultural data. Categories for the Description of Works of Art⁴ (CDWA) is a comprehensive set of categories for describing works of art and related images and a format for electronic exchange of information, maintained by the Getty Vocabularies.

Data content standards provide guidelines for defining each individual data element or field and what information should be entered into it. Data content standards clearly describe the content of a defined field or data element and provide guidelines for controlling the syntax, style, grammar, and abbreviations used within each field. These standards are usually adopted into internally developed cataloging rules within institutional data dictionaries and procedural manuals that outline applications of external sources such as Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images⁵ (CCO) from the Visual Resources Association (VRA).

Data value standards determine the vocabulary to be used for individual fields or data elements and the authority lists that will build consistency and enable interoperability. Terminology standards that are used consistently enable an electronic system to provide indexes that find like objects quickly and connect them to other objects in interesting and sometimes unexpected ways. Consistent use of data value standards protects the museum’s investment in its data and provides many more points of access to an object than can be provided in a manual system. These standards may be externally or internally developed authority files, lexica, thesauri, and controlled vocabularies. For example, the Getty’s Art and Architecture Thesaurus⁶ (AAT), the Union List of Artist Names⁷ (ULAN), the Thesaurus of Geographic Names⁸ (TGN), Cultural Objects Name Authority⁹ (CONA), and Nomenclature for Museum Cataloging¹⁰ encourage institutions to use common terminology and reference resources.

The benefits of applying data standards within a CMS include:

Maximized investment in the data in a system;
Enhanced accountability for a collection;
Easy access to records and, thereby, to objects;
Fast retrieval of cross-referenced information;
Improved quality and accuracy of the individual record;
Data that adapt more easily to new technological and documentation developments;
Data that can be exported more efficiently into a new system; and
Simplified interoperability (the exchange of information with other programs within an institution or with other institutions).

Even when data standards have been established and are in use, it is necessary to review and revise them. There invariably will be object information that cannot easily fit into any organized system. Although the aim of using standards is to be consistent, it is also necessary to be flexible enough to accommodate unique situations. At the very least, cataloging staff should support all exceptional applications of structured (controlled dropdown) data fields with explanatory unstructured (memo-style text entry) remarks.

Terminology Control

While manual systems might use catalog cards that provide a limited number of access points for information retrieval—for instance, by accession number, classification, or maker—a CMS can provide access to not only any field in the schema but also to any combination of fields. How dependable this access is depends entirely on the precision and consistency of the language used. Establishing the consistency upon which optimal retrieval functions hinge is called terminology control.

Terminology control is necessary because natural language has a number of different words that mean the same thing (synonyms) and identical words that mean different things (homonyms). If synonyms have been entered in a field, it will be impossible to retrieve all the similar objects without knowing and searching for each term individually. To control synonyms in a CMS it is necessary to choose a single preferred term and use it in place of all nonpreferred terms. Some advanced systems have internal synonym rings that can retrieve all items cataloged with the words within the ring, regardless of the term entered in the query, but this requires all synonyms to be manually entered into the dictionary. If homonyms have been entered in a single field, a search on that field will retrieve irrelevant records. To control homonyms it is necessary to distinguish one homonym term from another—for instance, barrel (container) and barrel (firearm component)—otherwise, the system may not be able to differentiate one from the other in a search.
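The effect of a synonym ring and of homonym qualifiers on retrieval can be shown with a short Python sketch. The ring contents, record structure, and function names below are hypothetical and chosen only for illustration; a real CMS would manage these through its own thesaurus or authority module.

```python
# Hypothetical synonym ring: a preferred term mapped to its nonpreferred variants.
SYNONYM_RINGS = {
    "sofa": {"couch", "davenport"},
}


def build_lookup(rings):
    """Map every term in every ring (lowercased) to its preferred term."""
    lookup = {}
    for preferred, variants in rings.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


LOOKUP = build_lookup(SYNONYM_RINGS)


def preferred_term(term):
    """Return the preferred term for a synonym, or the term itself if it is not in a ring."""
    cleaned = term.strip()
    return LOOKUP.get(cleaned.lower(), cleaned)


def search(records, query, field_name="object_name"):
    """Retrieve records whose term resolves to the same preferred term as the query."""
    target = preferred_term(query).lower()
    return [r["id"] for r in records if preferred_term(r[field_name]).lower() == target]


records = [
    {"id": 1, "object_name": "couch"},
    {"id": 2, "object_name": "Sofa"},
    {"id": 3, "object_name": "barrel (container)"},
    {"id": 4, "object_name": "barrel (firearm component)"},
]
```

Here a search on any member of the ring retrieves records 1 and 2, while the parenthetical qualifiers keep the two senses of "barrel" apart: a search on "barrel (container)" retrieves record 3 only.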

To develop a consistent vocabulary, it is necessary to use some form of terminology control such as an authority list or a thesaurus for each data element or field in a CMS for which it is determined that there are terminology control requirements. An authority list is a controlled list of terms considered to be acceptable for entry in a field. The simplest authority list may provide only a set of preferred terms, developed as a simple document or spreadsheet and shared between departments. A more complex authority list may provide nonpreferred terms with cross-references to the preferred terms.

Expanding search utility even further, a thesaurus provides terms that relate to the preferred term in a broader or narrower sense. A thesaurus is a highly structured authority that defines the terms that most accurately describe an object or concept for indexing and retrieval purposes. Like any authority list, a thesaurus distinguishes instances of nonpreferred terms, synonyms, and homonyms, but it will also include variant spellings and narrower, broader, and related terms, creating a hierarchy that enables users to retrieve records that contain only their search term, or a more dynamic search that collocates records that have been cataloged with any broader, narrower, or related terms as well.
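The difference between a search on a single term and a search that also collocates narrower terms can be sketched as follows. The miniature hierarchy and the function names are invented for the example and are not drawn from any published thesaurus.

```python
# Hypothetical mini-thesaurus: each term lists its immediately narrower terms.
NARROWER = {
    "furniture": ["seating furniture", "tables"],
    "seating furniture": ["chairs", "sofas"],
    "chairs": ["armchairs"],
}


def expand(term, narrower=NARROWER):
    """Return the term together with every term beneath it in the hierarchy."""
    found = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in found:
            continue  # guard against a term being reachable twice
        found.append(current)
        stack.extend(narrower.get(current, []))
    return found


def search(records, term, include_narrower=False):
    """Match the exact term only, or the term plus all of its narrower terms."""
    wanted = set(expand(term)) if include_narrower else {term}
    return [r["id"] for r in records if r["object_name"] in wanted]


records = [
    {"id": 1, "object_name": "armchairs"},
    {"id": 2, "object_name": "seating furniture"},
    {"id": 3, "object_name": "tables"},
]
```

A plain search on "seating furniture" returns record 2 alone; the expanded search also returns record 1, because "armchairs" sits below "chairs" and therefore below "seating furniture" in this sample hierarchy.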

It is not necessary to control the vocabulary for every data element in a CMS record. The focus should be on those fields that will be essential for indexing and retrieval purposes. Fields that may require terminology controls include:

Classification,
Object name,
Subject heading,
Location name,
Medium,
Technique,
Condition,
Geographic place names,
Period or style terms,
Acquisition terms,
Deaccession terms,
Department names,
Artist or maker names, and
Artist or maker roles.

When considering what terminology controls to use in a database, it is paramount to first consider the use of established resources that are available for the entire field before developing in-house resources. On a practical level developing internal authorities can be a time-consuming and labor-intensive process. An externally developed authority may help a museum avoid repeating work that already has been done and accepted by the museum community in accordance with international standards. Adopting external standards still requires adaptation and effort because it takes coordination across departments to familiarize staff with these resources and to collaboratively navigate the descriptive vocabularies of value standards to identify terms that are most relevant to the collection. However, when an institution chooses external standards, it increases the relevance of its collection by joining a larger network and contributes to the development of consistent terminology controls throughout the museum community.

Linked Data

External standards ensure structural cohesion and terminology controls for museum collection cataloging. The established field names, styles, and values not only provide a linguistic roadmap for data entry decisions, but they also lay the foundation for broader exposure and more meaningful search in and among museum collections. In this age of networked information, external standards can be used to embed machine-readable, shared identifiers for use as entry points to collections data on the World Wide Web. The incentive for choosing external standards is not only to align institutional practice with the wider museum community but also to actively build that community by creating extensible linked data across institutions. This, in turn, has the potential to join the world’s cultural collections in a seamlessly searchable virtual repository, complete with inferred relationships between records that result from mechanized self-describing data, leveraged on semantic modeling (although this is beyond the scope of this chapter, there are recommendations for further reading in the Resources section).

The first step to practical application of linked data is to include the uniform resource identifiers (URIs) associated with chosen terms within the catalog records for every standardized entry. Begin with individual standardized fields, so that one data entry project aims to identify the artist constituent records or medium object attributes within the collections database. For example, Frida Kahlo is identified in the Getty’s Union List of Artist Names by the ID 500030701, and in Wikidata by the ID Q5588; each ID forms the final segment of the term’s URI. Similarly, copper (the metal) has the ID 300011020 in the Getty’s Art and Architecture Thesaurus while copper (the color) has the ID 300311190. These individual building blocks are small and the full benefits are still brewing in the international cultural stewardship community. Given the many developments underway and how much work there is to do, it is an exciting time to be a collections information manager, advocating for the broad implications of fully implementing external standards through linked open data strategies. These efforts will ensure the future relevance of institutional collections within the digital humanities and among the museum audiences of tomorrow.
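One way to picture this first step is a catalog record that stores each standardized term alongside its identifiers. In the Python sketch below, the record structure, the accession number, and the function names are hypothetical, and the URI patterns are assumptions that should be confirmed against each vocabulary publisher’s documentation before use.

```python
# Assumed URI patterns for each vocabulary; verify against the publisher's documentation.
URI_PATTERNS = {
    "ulan": "http://vocab.getty.edu/ulan/{id}",
    "aat": "http://vocab.getty.edu/aat/{id}",
    "wikidata": "http://www.wikidata.org/entity/{id}",
}


def term_uri(vocabulary, identifier):
    """Build a full URI from a vocabulary key and a term ID."""
    return URI_PATTERNS[vocabulary].format(id=identifier)


# A hypothetical object record: each controlled term keeps its label and its URIs.
record = {
    "accession_number": "2017.5",
    "maker": {
        "label": "Frida Kahlo",
        "uris": [term_uri("ulan", "500030701"), term_uri("wikidata", "Q5588")],
    },
    "medium": {
        "label": "copper (metal)",
        "uris": [term_uri("aat", "300011020")],
    },
}


def same_entity(term_a, term_b):
    """Two terms refer to the same entity if they share at least one URI."""
    return bool(set(term_a["uris"]) & set(term_b["uris"]))
```

Because the comparison relies on shared identifiers rather than on labels, a record elsewhere that spells the maker’s name differently would still match, provided it carries one of the same URIs.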

Descriptive Standards

Many projects have developed terminology and linked data standards. Some examples include:

The Art and Architecture Thesaurus (AAT) is a list of art history and architectural terminology developed as a controlled indexing language for the use of libraries, archives, and museums in cataloging book and periodical collections, image collections, and museum objects, particularly art-related objects. AAT terminology has been validated by users in the scholarly community and includes index terms that may be used to control a variety of fields in cataloging and CMSs, including object names, object genres, attributes, style and period terms, people roles, materials, and techniques. The AAT is an ongoing project that continually develops terminology and is especially responsive to user feedback. It is available in both electronic and print editions and is sponsored by the Getty Research Institute of the J. Paul Getty Trust.
Nomenclature for Museum Cataloging¹¹ is a standardized classification and controlled vocabulary for human-made object names. It is particularly useful for collections of historical objects. Nomenclature is indexed alphabetically and hierarchically. Object names are based on the original function of the object, with hierarchical divisions that include structures, building furnishings, personal artifacts, tools and equipment for materials, etc. For example, the category of building furnishings has subdivisions that include bedding, floor coverings, furniture, and so forth. Object terms are inverted with the noun first, followed by a comma and the modifier (e.g., chair, dining; chair, easy; and chair, folding). In an alphabetical listing of object names all chairs and like objects will display in the same place on the list. Preferred terms are displayed in capital letters and nonpreferred terms direct the user to the preferred terms. Nomenclature is available in a print edition and is sponsored by the American Association for State and Local History (AASLH).
The Thesaurus of Geographic Names (TGN) is a list of hierarchically arranged geographic terms. Place names are arranged in terms of broader and narrower localities, and the thesaurus features preferred and nonpreferred terms. TGN is maintained by the Getty Vocabularies.
The Union List of Artist Names (ULAN) is neither an authority list nor a thesaurus but was developed to serve as a terminology resource for museums, libraries, and archives. ULAN features a cluster format that displays and links all the variant spellings and versions of an artist’s name as well as basic biographic data that includes life dates, roles, and nationality information. It is up to the user to choose the name that is preferred for cataloging and indexing, although ULAN provides some helpful information concerning the sources of the names and the use preferences of the sources from which the names have been drawn. ULAN is maintained by the Getty Vocabularies.
Iconclass¹² is an iconographic classification system that provides subject and content terminology for art and historical images. It features a series of decimal codes within a hierarchic structure. It has ten main divisions that feature the primary subject headings of abstract, nonrepresentational art; religion and magic; nature; human being, man in general; society, civilization, culture; abstract ideas and concepts; history; bible; literature; and classical mythology and ancient history. Within these divisions are additional numbers and letters that, when combined in a string, represent important elements in an image using a descending hierarchy of concepts, beginning with a major concept such as history and ending with a specific concept, such as a precise event in history.
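The practical value of the inverted object terms described in the Nomenclature entry above can be seen by sorting a short list both ways. The sketch below is a deliberately naive illustration: the sample terms and the `invert_term` function are assumptions for the example, and the function simply treats the last word of a term as its noun.

```python
def invert_term(term):
    """Move the final word (assumed to be the noun) to the front: 'dining chair' -> 'chair, dining'."""
    *modifiers, noun = term.split()
    if not modifiers:
        return noun
    return f"{noun}, {' '.join(modifiers)}"


natural_order_terms = ["dining chair", "floor lamp", "easy chair", "folding chair"]

# Sorting natural-language terms scatters the chairs among other objects.
natural_sorted = sorted(natural_order_terms)

# Sorting inverted terms collocates all chairs in one place on the list.
inverted_sorted = sorted(invert_term(t) for t in natural_order_terms)
```

In the natural-language sort, "floor lamp" falls between "easy chair" and "folding chair"; in the inverted sort, the three chairs appear together, followed by "lamp, floor".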

Iconclass is not the only system based on subject content; other projects include the Garnier System (Thesaurus Iconographique, Système Descriptif des Représentations)¹³ used widely in France; the Yale Center for British Art Project; and the Glass System (Subject Index for the Visual Arts) developed for the Victoria and Albert Museum.

The aforementioned projects do not form a comprehensive list of available resources. As terminologies shift and consortiums are formed to develop new dictionaries, it is useful to talk to other professionals in similar institutions to determine which thesauri they are using. If there are no existing thesauri, it can be helpful to request the lexicons of similar museums to build on their existing work.

Data Dictionary

A data dictionary provides documentation on the development and application of institutional data content standards. Foremost, it contains a scope note of each field, identifies the type of entry (free memo text, numeric, date, controlled lists, etc.), delineates the wider standard it adheres to, identifies access rights and the department or staff responsible for entering and maintaining the field, and provides data entry guidelines or defines each value of controlled lists within the field, if applicable (see discussion herein for a more extensive list). At an administrative level, the data dictionary chronicles decisions made in administering the field, which allows a museum to examine the nature of the object information it preserves and creates an explanatory record of past decisions.

Although crucial as a record of institutional memory, data dictionaries are also living documents in museums in which more than one department creates and maintains database records. When provided with a data dictionary, each department has documentation that details the scope of its own and other departments’ responsibilities for entering data concerning an object. The data dictionary is the basis for an instructional cataloging manual, which is a visual explanation of the step-by-step procedures of records creation and management. Together, these provide a reference to guide users throughout an institution in entering data consistently.

While a data dictionary usually will be developed in house, one also may be provided by the vendor of an automated CMS. Users should review a vendor-provided data dictionary carefully, determine whether they agree with the definitions for the data elements that the vendor proposes, and revise them, if necessary, to suit their own particular needs. Concentrate on those things that assist in effective use of the documentation. A selection from the following information is useful when creating a data dictionary entry, although not all of the following information is required:

Module name—In a relational database there may be more than one place in which information is entered. Identify the module (objects, locations, exhibitions, etc.) where the user will find this particular data element.
Field name—The name of the category, field, or data element.
Table/Column Name (in relational databases)—The name of the column and table the column resides in on the back end.
Field code—The code the system uses internally.
Standards—The various data standards to which the field corresponds. Such standards may include terminology dictionaries, such as the Art and Architecture Thesaurus or Nomenclature and metadata protocols.
Attributes—Identify the attributes of the field, note whether it is a fixed or variable length, alphabetical or numeric, repeatable, etc. Note if there is a controlled vocabulary that will verify this field or special keystroke commands or controls associated with this field.
Access—Identify which departments or users may access and enter or change data within this field.
Description—Define the field, stating its purpose and the character of the information that will be entered into the field.
Data-entry rules—Provide a list of conventions for data entry in this field. Note any exceptions to the rules. For example, note if dates are entered month or day first. If an authority or controlled terminology is in use for this field provide directions for using the authority and definitions of controlled terms, or direct the user to the appropriate information.
Examples—Provide examples of data entered in the field.
Indexed—Note whether the field is indexed.
Public Access—Indicate whether the field is pushed to public repositories (online collection, aggregate repositories) or apps (mobile, kiosk).
Cross-reference points—Which other modules the field is related to and where this data can be drawn into other records (e.g., object valuations available for loan insurance records, institutional or individual addresses available for loan and shipping records, etc.) or into other databases (e.g., library catalogs or digital asset management systems [DAMS]).
Administrative remarks—The dates and reasons for decisions that expanded or changed the field, especially in the case of high-level authority lists such as department or classification.
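For museums that keep the data dictionary in a structured form rather than a word-processing document, the elements listed above might be modeled along the following lines. This Python sketch is only one possible arrangement; the class, its attribute names, and the helper function are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class DataDictionaryEntry:
    """One data dictionary entry; only the field name is mandatory."""
    field_name: str
    module_name: str = ""
    table_column: str = ""
    field_code: str = ""
    standards: dict = field(default_factory=dict)   # e.g., {"standard name": "element"}
    attributes: str = ""
    access: str = ""
    description: str = ""
    data_entry_rules: list = field(default_factory=list)
    examples: list = field(default_factory=list)
    indexed: bool = False
    public_access: bool = False
    cross_references: list = field(default_factory=list)
    administrative_remarks: list = field(default_factory=list)


def undocumented_parts(entry):
    """List the parts of an entry that are still empty (booleans are never counted as empty)."""
    return [f.name for f in fields(entry) if getattr(entry, f.name) in ("", [], {}, None)]


# A hypothetical, partly completed entry for a medium field.
medium_entry = DataDictionaryEntry(
    field_name="Medium",
    module_name="Objects",
    standards={"AAT": "materials terms"},
    indexed=True,
)
```

Since not every element is required, a helper such as `undocumented_parts` is best treated as a prompt for review meetings rather than as a strict completeness check.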

A sample data dictionary entry for an accession number field might include:

Field name—Accession number

Field code—ACCNUM

Table/Column Name—Objects.ObjectNumber

Standards:

CDWA—21.2.3. (repository number); 21.2.3.1. (number type: “accession number”)
CIDOC CRM—E42 (identifier)
ISAD code—3.1.1 (identifier)
EAD tag—<unitid> (identification of the unit)

Attributes—alphanumeric (unique value)
Access—write access for registration and administration groups, input for curatorial group for new records only, read access for all.
Description—The accession number is the unique identifier assigned to objects as they are received into the museum. For detailed accessioning and numbering procedures, refer to the Acquisition Procedures manual.

Create individual records for each object (an object refers to an item that is cataloged and displayed individually). For example, a portfolio containing seven prints would have seven or more objects, including the cover, colophon, and any additional materials. A cup and saucer are considered one object in two parts, and denoted with parts letters (a-b). In this example, each piece of the portfolio is cataloged individually, and the cup and saucer have only one record entered into the CMS.

Examples:

2017.5
2017.1a-c
2017.2.1
2017.713
2017.7.1.1a-b
LIB.87.2.2
ARC.2016.011.6.1a-b

Data-entry rules:

Always use decimal points between the numeric values of the number.
Use hyphens between parts letters (e.g., a-c).
For objects with special collection distinctions (LIB, ARC), insert a decimal point between the letters and the numbers.
Do not use punctuation marks between numbers and parts letters (e.g., 1a-b, not 1.a-b).
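Data-entry rules of this kind can often be expressed as a pattern that a system checks at the point of entry. The Python sketch below encodes the sample rules above as a regular expression; the function name is hypothetical, and the pattern rests on two assumptions not stated in the rules: that a number has at least two numeric values, and that a single parts letter without a range is acceptable.

```python
import re

# Special collection prefixes named in the sample entry.
SPECIAL_PREFIXES = ("LIB", "ARC")

_prefix = "|".join(SPECIAL_PREFIXES)
ACCESSION_PATTERN = re.compile(
    rf"(?:(?:{_prefix})\.)?"    # optional collection prefix followed by a decimal point
    r"\d+(?:\.\d+)+"            # numeric values separated by decimal points
    r"(?:[a-z](?:-[a-z])?)?"    # optional parts letters, hyphenated if a range (a-c)
)


def is_valid_accession_number(value):
    """Return True if the whole value conforms to the sample data-entry rules."""
    return ACCESSION_PATTERN.fullmatch(value) is not None


sample_numbers = [
    "2017.5", "2017.1a-c", "2017.2.1", "2017.713",
    "2017.7.1.1a-b", "LIB.87.2.2", "ARC.2016.011.6.1a-b",
]
```

All seven examples from the sample entry satisfy the pattern, while entries such as "2017-5", "2017.1.a-b", or "LIB87.2.2" would be rejected. A museum adapting this approach would adjust the pattern to its own numbering procedures.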

Useful data dictionaries are not created in a vacuum. Input from the curatorial, conservation, and registration departments is crucial to defining the scope of fields, both to ensure that each department’s controls are accommodated by the system and to establish each staff member as a stakeholder of collections data with an awareness of their own responsibilities for the integrity of the records. Furthermore, no matter how thorough the documentation is, there are always problems or details that will be exceptional. Meetings with users provide the opportunity to discover elements of information that each field entry in a data dictionary can and cannot accommodate. It is ideal to schedule regular meetings with a dedicated group of representatives from across the museum who are comfortable liaising on behalf of their departmental colleagues to raise issues, express preferences, and approve changes to collection information strategy.

Meet regularly with users while the data dictionary entry is being generated.
Listen to suggestions, and implement them.
Use meetings as an opportunity to build consensus because data dictionary conventions that are imposed on users will be ignored.
Always provide a reason for a convention; users are more likely to use a data dictionary that makes sense.

Finally, data dictionary entries constantly evolve and are adapted to new uses and situations. It is essential that they be kept up to date; otherwise, users will not be able to depend on them as reliable and definitive sources of information.

Training

Museums may hire information specialists with museum registration or curatorial backgrounds to manage the information needs of the institution. Information specialists should have a background in database management and understand the importance of data consistency and adherence to standards. Such personnel must bridge the protocols developed internally with the standards created by the broader museum community. Because of this, it is important to include information management staff in any discussion about cataloging, data standards, registration methods, and information access. They should share their expertise while adapting standards to existing needs.

Information specialists are often responsible for training staff to enter data into the system. Training responsibilities can be divided into two categories—initial and ongoing. Initial training is generally for those who are new to a specific software program and must be shown the basics. Training must be repeated whenever a major change in hardware or software takes place, such as the installation of a network or a new CMS, or when new staff members are hired. Ongoing training includes keeping the museum staff informed of software upgrades and new software and may be as simple as reminding staff of the importance of good housekeeping for computer files or the importance of making regular backups.

Staff training can be accomplished through different methods, including taking advantage of outside training opportunities or contracting with private companies, CMS vendors, or individuals for in-house training. Outside training requires staff to travel to another location, so the museum must determine whether sending the entire staff is a good investment or whether it is more efficient to send one member who can return and train the rest. If it is decided to keep training in house, the collections information manager might establish a user group that meets periodically, send internal e-mail newsletters that offer tips, and hold weekly office hours where staff can come to troubleshoot issues or receive project-specific training. Collegial sharing of information through e-mail, user forums, and conferences greatly enhances the staff’s ability to remain informed about technology advances.

Training staff to perform data entry is an exacting task in a museum because of the need to establish an accurate database. Training data-entry personnel for this task has been effectively accomplished by many museums through the use of data standards and carefully designed manuals to explain the process. The fewer decisions the data-entry person has to make, the cleaner the database.

Data Entry

Data entry into a new system can be one of the most expensive and time-consuming aspects in implementing a CMS. Furthermore, data entry is never completely done unless the collection is static. Not only does retroactive data entry often have to be done, but responsibility for ongoing data entry also must be assigned. If there are data in machine-readable form (such as an electronic spreadsheet), then there is a good chance they can be mapped by the vendor or system administrator into a new data structure. It is a good idea to plan for this conversion rather than to reenter all the data from another system manually.
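Mapping machine-readable data into a new structure usually amounts to renaming columns and tidying values. The sketch below illustrates the idea in Python for a spreadsheet exported as CSV text. The legacy column headers, the target field names, and the function name are all hypothetical; a real mapping would be agreed on with the vendor or system administrator.

```python
import csv
import io

# Hypothetical mapping from legacy spreadsheet headers to new CMS field names.
FIELD_MAP = {
    "Acc No": "accession_number",
    "Object Name": "title",
    "Artist": "maker",
}


def map_legacy_rows(csv_text, field_map):
    """Rename mapped columns and trim whitespace; unmapped columns are dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    mapped = []
    for row in reader:
        mapped.append(
            {new: (row.get(old) or "").strip() for old, new in field_map.items()}
        )
    return mapped


# Illustrative legacy export with one record and one unmapped column ("Notes").
legacy_export = "Acc No,Object Name,Artist,Notes\n2017.5, Cup and saucer ,Unknown,checked\n"
records = map_legacy_rows(legacy_export, FIELD_MAP)
```

Columns left out of the mapping are silently dropped in this sketch, so the mapping itself should be reviewed before any legacy data are discarded.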

When there are no data in machine-readable form, manual entry into a system is necessary. The museum may have personnel enter data from cards or catalog sheets, which is a time-consuming task because source material is often inconsistent and because there is a tendency to attempt to record all information possible for each object. Depending on the size of the collection, it is generally better to identify ten or twelve key fields of data to enter for each object to create a minimal identifying record for it. Initial records can be expanded later, as time permits. Data entry by hand can be extremely slow. Even a collection of modest size would require hiring a full-time person to do nothing but data entry.

Planning Data Entry

A commitment to record all collections management activities on the computer as soon as the database structure is up and running is important, and a plan must be made for the systematic data entry of backlogged records. Priorities for data entry may be determined by what is most important for insurance tracking and reporting or by what objects are currently on exhibit. It is logical to enter backlogged records chronologically, working from the most recent to oldest. Other approaches may be based on object type, donor or source, or objects currently on loan. Whatever the priority, a goal-oriented plan should be drawn up to complete data entry systematically and make the database useful immediately.

Establish a system for keeping track of progress: count or estimate the number of paper records, and measure daily progress against this total. Objects without accompanying records should be given numbers, listed as found-in-collection, and have minimal working fields entered for each object. Ideally, a staff member will be responsible for evaluating data entry as it is completed; this person may be the information manager, a curator, or the registrar. In some cases, it is a multiperson effort, with a few select staff members entering data into a few fields for which they are responsible. For example, the registrar may enter an object’s accession number, location, and provenance information. The curator will write a detailed description, assign a date, and identify the maker. The information manager will then check these fields for proper data entry conventions. One person, such as the collections information manager, should have final authority to approve a record as correct and complete.
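Measuring progress against the total number of paper records is simple arithmetic, and it can be kept in a short script or spreadsheet formula. The following Python sketch is one possible way to do it; the function name and the figures in the example are invented for illustration.

```python
import math


def entry_progress(total_records, entered_records, days_worked):
    """Summarize data-entry progress against a counted or estimated total."""
    if total_records <= 0 or days_worked <= 0:
        raise ValueError("total_records and days_worked must be positive")
    remaining = max(total_records - entered_records, 0)
    daily_rate = entered_records / days_worked
    return {
        "percent_complete": round(100 * entered_records / total_records, 1),
        "remaining": remaining,
        # Projected working days left at the average rate so far;
        # None when nothing has been entered yet.
        "days_remaining": math.ceil(remaining / daily_rate) if daily_rate > 0 else None,
    }


# Illustrative figures only: 12,000 paper records, 3,000 entered in 40 working days.
progress = entry_progress(12000, 3000, 40)
```

With these invented figures, the project is 25 percent complete, and at 75 records per day the remaining 9,000 records would take a further 120 working days.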

Data Clean-up

Raw data can be entered as they exist and cleaned up later, or they can be corrected before data entry. Conducting clean-up during data entry, especially when undertaken by multiple individuals, is problematic because it necessitates employing highly skilled data-entry personnel who have subject-specific knowledge, understand the need for consistency, and are dedicated to the task full-time. Cleaning up data before entry will almost double the time required to get a database up and running. If the existing data are entirely in paper form, they must first be transcribed to a central repository with standardized formatting, and then corrected for content, spelling, and punctuation. This procedure yields good results and allows the use of minimally skilled people for data entry, but it is time-consuming.

When transferring data from one system to another, exporting the data from the old system into an electronic spreadsheet can greatly assist in data clean-up. The spreadsheet’s find-and-replace, sorting, and filtering capabilities allow the data clean-up staff to find inconsistencies and missing information at a glance. The spreadsheet’s most straightforward columns can be mapped to the new data structure, greatly facilitating the import of information into the new system. Depending on the quality of information in the old system, this may take a few weeks to several months. It may be useful to clean up the data in batches, by object type or department, and import it as each batch is completed. Keep a duplicate master copy available to allow staff to access the information during the data clean-up process, even if they are restricted from making changes.
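The same kind of checks that staff perform by sorting and filtering a spreadsheet can also be scripted once the data have been exported. The Python sketch below shows two such checks, assuming each exported row has been read into a dictionary: one lists blank required fields, and the other groups values that differ only in capitalization or stray spaces. The field names and sample rows are hypothetical.

```python
from collections import defaultdict


def find_missing(rows, required_fields):
    """Return (row_index, field) pairs where a required field is blank."""
    return [
        (index, field)
        for index, row in enumerate(rows)
        for field in required_fields
        if not (row.get(field) or "").strip()
    ]


def find_variants(rows, field):
    """Group values that differ only by case or surrounding whitespace."""
    groups = defaultdict(set)
    for row in rows:
        value = row.get(field) or ""
        if value.strip():
            groups[value.strip().lower()].add(value)
    return {key: sorted(values) for key, values in groups.items() if len(values) > 1}


# Illustrative rows exported from a legacy system.
rows = [
    {"accession_number": "2017.5", "medium": "Oil on canvas"},
    {"accession_number": "2017.6", "medium": "oil on canvas "},
    {"accession_number": "", "medium": "Bronze"},
]
missing = find_missing(rows, ["accession_number", "medium"])
variants = find_variants(rows, "medium")
```

Reports like these only flag candidates for correction; deciding which variant is the preferred term remains a matter for the data dictionary.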

Most museums cannot afford the time required to correct data before data entry and, therefore, must rely on cleaning it up after it is entered. Reports can be generated for registration or curatorial editing, with subsequent corrections made individually or globally, depending on the type and amount of information being corrected.

Proofreading

Humans make mistakes, and any computer database will reflect the mistakes made by the person who entered the data. If minimally skilled people enter data, the chances are great that many mistakes in spelling, punctuation, capitalization, and so on will occur. These mistakes may also be derived from the original paper records. It is important for the person overseeing the project to identify the source of the errors and take corrective action. Representative samples of records, selected either randomly or in conjunction with another project, should be proofread on a regular basis. Proofreading and corrections should be the responsibility of only one person (e.g., the person in charge of the project) and record approval should be indicated by the name of the proofreader and the date of approval.
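Drawing a representative sample at random can be done with a few lines of code rather than by hand. The following Python sketch assumes the record identifiers can be exported as a list; the function name and the optional seed parameter (which makes a sample repeatable for audit purposes) are illustrative choices.

```python
import random


def proofreading_sample(record_ids, sample_size, seed=None):
    """Pick records to proofread at random, without repeats."""
    rng = random.Random(seed)
    ids = list(record_ids)
    # Never ask for more records than exist.
    return rng.sample(ids, min(sample_size, len(ids)))


# Illustrative identifiers; a real list would come from a CMS report.
all_ids = ["2017.1", "2017.2", "2017.3", "2017.4", "2017.5", "2017.6"]
to_proofread = proofreading_sample(all_ids, 3, seed=2017)
```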

SYSTEM MANAGEMENT AND INTEGRATION

System Management

Management is the backbone of a computerized file system. System management, supervised by the system administrator, involves the maintenance and protection of the system hardware, software, and data, and their integration in museum operations. Although the system administrator is the designated leader, all individuals using the system should take part in system management. Researchers and other individuals with access to the system should be trained in how to report problems, and staff members should have an efficient means for communicating their needs to the system administrator.

System management in museum settings may include scheduling with an in-house system manager, online technical support from an outside vendor, data-logging notebooks or bug sheets that are periodically reviewed, and e-mails and telephone calls to outside technical support advisors. An institution with many computer applications may designate a system manager and a program administrator; the registrar is often one or both of these.

Maintenance

Maintenance relies on thorough and efficient communication between system users and system managers. Although scheduled maintenance tasks—such as backing up and spot-checking data—should be conducted regularly, meetings with system users to determine the continued usefulness of the system as a whole are also a part of general maintenance. If the system no longer functions at its maximum level of usefulness, upgrades and improvements should be considered. Many institutions face the simple problem of staff members who accumulate data and save it on their desktop computers; maintenance involves centralizing information so that all users can report what they have completed, what they are working on, and what projects involving the system may be ready to start. The system or program administrator should maintain a master schedule of projects and progress.

Security

A computer system, and the information it contains, represents a major institutional investment. Just as employees are informed of the general museum security plan, users of the computer system must be educated regarding electronic security. There are three main aspects to computer security, which are similar to the security requirements for a museum collection:

The system must be protected from physical loss or damage by human actions or environmental causes.
Information must be protected from unauthorized changes or deletion, accidental or intentional.
Information that must remain confidential for legal, ethical, or security reasons must be safeguarded. Confidential information includes records regarding donors, valuations, storage locations, museum security systems and procedures, and employee records and personnel files.

An institution with more than two employees, or one that is part of a wider electronic network, needs password protection for its computer system. For relatively simple systems, single password entry may be sufficient, but do not rely on screen-saver password programs because they are easy to defeat. If a system has multiple users, is on a network, or relies heavily on computerized collections management data, a hierarchical system of passwords should be in place. Password protection can cover an individual’s files or the official museum records. Levels of access and authority can range from read-only access to higher-level permissions to edit, add, or delete information. Confidential fields in a database may be restricted to holders of specific passwords. Take care to use best practices in password security, such that passwords for the highest levels of access and security are complex (e.g., contain a mix of letters, numbers, and symbols) and are changed periodically.
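A hierarchy of access levels can be pictured as an ordered scale in which each level includes the permissions of the levels below it. The Python sketch below models that idea only; the level names, the confidential fields, and the functions are hypothetical, and a real CMS would enforce permissions through its own user administration rather than through code like this.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    READ_ONLY = 1
    EDIT = 2
    ADMIN = 3


# Hypothetical minimum level required to see each confidential field.
CONFIDENTIAL_FIELDS = {
    "valuation": AccessLevel.ADMIN,
    "storage_location": AccessLevel.EDIT,
}


def visible_fields(record, level):
    """Return only the fields that a user at this level is allowed to see."""
    return {
        field: value
        for field, value in record.items()
        if level >= CONFIDENTIAL_FIELDS.get(field, AccessLevel.READ_ONLY)
    }


def can_edit(level):
    """Editing requires EDIT level or higher."""
    return level >= AccessLevel.EDIT
```

For example, a read-only user viewing a record with a title, a valuation, and a storage location would see only the title.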

Physical security includes not only restricted access but also favorable environmental conditions. Computers should be cleaned and maintained on a regular basis. Much like collection objects, computers and computer media require a relatively high level of environmental control, and they are especially sensitive to heat. Keep computers and media at an even temperature and away from windows and internal heat sources. Avoid power surges by using a good suppressor and an outlet dedicated to computer equipment. At the very least, the data servers should be connected to an uninterruptible power supply (UPS). Other electric appliances and equipment, especially heavy power users, should be plugged into other circuits. Keep backup media in a separate, protected location, preferably off-site. Computer media must be protected from static charges and magnetic fields such as those in telephones, office magnets, and other electronic equipment.

Backups

A regular backup schedule is the key to a computer disaster recovery plan. Backup can be used as a verb (to back up) or a noun (to refer to a copy of computer files). As a precaution against information loss, backing up files is so important that it must be considered when choosing computer software and hardware. When establishing a backup schedule, the key question is how much work the museum can afford to lose. The backup frequency will depend on how often computer files are updated and how fast information is added. A full backup (a copy of everything on the computer system) can be made periodically (e.g., weekly). Incremental backups (copies of all files that have been added or amended since the previous backup) can be scheduled more frequently.
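The difference between a full and an incremental backup comes down to which files are selected for copying. The Python sketch below illustrates the incremental case by copying only files modified after a given time. It is a simplified illustration, not a substitute for dedicated backup software: the function name and parameters are hypothetical, and it does not handle deleted files, open files, or verification of the copies.

```python
import shutil
from pathlib import Path


def incremental_backup(source_dir, backup_dir, last_backup_time):
    """Copy files modified after last_backup_time (seconds since the epoch).

    Returns the sorted relative paths of the files that were copied.
    """
    source, destination = Path(source_dir), Path(backup_dir)
    copied = []
    for path in source.rglob("*"):
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            target = destination / path.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(str(path.relative_to(source)))
    return sorted(copied)
```

Passing a last_backup_time of 0 selects every file, which corresponds to a full backup in this sketch.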

Active or cautious computer users may wish to maintain backups of their own files, but generally a system manager is responsible for the backup program. The process of backing up computer files on a networked system is best done during slack times such as at night or on weekends because the activity degrades computer performance and open files might not be included in the backup. The backup system can be fully automated so that a system manager need not be physically present during the process. Computer backup media exist in a number of different formats. Each type of backup medium has advantages and disadvantages based on its reliability and cost per byte of data stored. For short-term file backups the longevity of the medium is not a critical factor, but the ability to reuse it reliably in the long term is important and should be researched as part of the cost analysis conducted when selecting a system.

For long-term backup, live storage is the most reliable method, in which the data are continuously rewritten in multiple electronic or physical locations, ensuring that information is not degraded over time or lost due to the failure of a single piece of hardware. Live storage methods include cloud computing, in which the data are distributed across a network of servers in geographically diverse locations, and RAID (redundant array of independent disks) configurations, in which the data are duplicated across multiple hard drives within a single server.

System Administrator

Information system managers or system administrators may interact with the museum in a variety of ways and may also be the collections information managers. Whether an outside contractor or an in-house staff member, full-time or part-time, the manager plays a pivotal role in the success of the system and must be able to train staff, handle unforeseen problems, and manage the overall system. If the system administrator is a full-time museum staff member, this will most likely be the person responsible for troubleshooting, adding and deleting users on the network, and coordinating with other museum personnel for disaster preparedness and emergency backups needed during evenings, weekends, and holidays.

When selecting a system manager, museum administrators need to consider both short- and long-term costs. If a staff member is adding systems administration to an already overtaxed schedule, the short-term costs will be low because no new employees will be added, but the long-term costs can be quite high as the employee attempts to find time to get the system in running order and enable other employees to work. Contract system administrators may be cost-effective because they can be used on an as-needed basis, but scheduling these individuals may be problematic.

Ideally, a system administrator will keep a database of all of the hardware and software for each workstation within the museum. This ensures protection against problems with software licensing, crashes, theft, configuration problems, and unrelated software that tends to fill up hard drives. The system administrator may be responsible for determining whether an upgrade to the computer or network systems is needed and then coordinating the upgrade within the museum.

Managing a computer system involves not only administering hardware and software but also checking and ensuring data integrity. A systems manager should coordinate with the collections information manager to determine the percentage of a museum’s records that are within the CMS, the accuracy of the information, and how up-to-date the digital records are. These data are important for insurance purposes, management, planning, and research. The system administrator may have museum staff members spot-check the data relevant to their particular area of expertise. System administrators should work with museum staff to build redundancy checks into databases. Most important of all, a comprehensive data backup plan should be maintained and duplicate data sets kept off-site.

System Manual

A system manual can be a web page or an electronic document within the museum’s intranet, or a binder containing information on changes to the system or improvements in documentation. The system manual may include:

Information on the specific hardware and software in use (including identification and serial numbers needed for access by technical assistance departments or vendors);
The name(s) and contact information for the system manager and other technical assistance personnel;
Policies governing systems use; and
Instructions on how to use the systems.

An effective system manual provides employees with step-by-step guidance on how to use the system, how to enter the information, and how to get information out when it is needed. A system manual should not be a static document; as technology and computer needs change, the manual should reflect those changes. Many museums do not use system manuals because they are seen as bothersome to produce, and some museums that have manuals do not keep them updated because of the staff time required. However, system manuals can save staff time by answering questions people have and by serving as a tutorial on systems use. Writing out instructions for what one already knows how to do may seem like an exercise in redundancy, but documenting procedures so that they can be followed by users will save an institution time and money in the long run. System manuals can help ensure that additions or changes to the system (such as new software) are compatible with existing hardware and software.

Integration into Museum Operations

The ideal CMS involves all museum records relating to collections. Museum objects (whether accessions or loans) should be continuously tracked once they enter an institution’s jurisdiction. Once basic information is entered, the system should be able to produce necessary receipts, loan forms, gift agreements, and so on, directly from the database. Barcoding objects may simplify object tracking while eliminating typographical errors, and the inclusion of digital images greatly enhances identification within a CMS.

In addition to tracking objects and being a repository for object data, the CMS should make data available through direct access, through integration with other internal systems by means of an intermediary database or application programming interface (API), or through reports that can be exported to multiple file formats.

Online Access

Since the 1970s, museums have been sharing their collections information online. Initially, they shared information with other museums and research institutions, but with the increase in public interest in collection holdings and the relative ease with which such information could be shared on the internet, museums began to make more information available to the public.

Providing public access to collections information serves multiple purposes. It helps fulfill the museum’s mission to educate the public and demonstrates that the objects held in public trust are used for public benefit. Sharing information with diverse groups of the public and scholars can encourage the institution to be careful with its data entry and provide the opportunity to bring oversights to the attention of museum staff. As a result, many museums are more aware of the importance of consistency in their descriptive practices, which supports basic collection stewardship.

Virtually anyone can view digital information online, so the registrar may see an increase in requests for loans and curators may see an increase in requests for additional information about collection objects. Management may become convinced that additional support for collections documentation is needed and development may look for additional funding sources for inventory and documentation. Many funding agencies (e.g., the Institute of Museum and Library Services) and private donors will fund accessibility projects that require photographing and documenting objects before their information can be placed online, and grant opportunities for collections care often have an online-access requirement.

TYPES OF ONLINE ACCESS

Museums employ a variety of methods for sharing collection information online. Today online access is synonymous with displaying collection items on the World Wide Web, either on the museum’s website, on a consortium website, or through a third-party application or web page.

Curated Online Exhibitions

Some institutions make few collection objects available for viewing on their websites but offer a curated selection of important objects with lengthy descriptions and histories. These highlights from the collection function much like gallery exhibitions do, either informed by a specific story or self-guided by the online visitor. Supporting materials, such as video, audio, or links to related texts, may be linked to the exhibition.

The curated method of sharing collections online has the advantage of providing interpretive materials to the public, serves as a way for curators to extend exhibitions to people who cannot visit the museum, and ensures that the efforts they expend to create exhibitions live on far past the usual life span of the traditional museum presentation. The downside to the curated online display is that it requires additional effort by staff.

Database-Driven Collections Online

Some museums display information from the CMS directly in a specific online collections area of their museum’s website. These database-driven pages can be extensive, displaying information about every object that has been approved for public viewing. The information displayed varies among institutions. Most museums display the accession number, title, creation date, maker, maker birth and death dates, material or medium, credit line, and an image. Some museums publish links to related materials, detailed provenance information, descriptive texts, subjects and categories, artist biographies, videos, and high-resolution images for downloading or viewing on a web browser.

Because these web pages are database-driven, any changes made in the CMS will appear automatically on the website. This may be immediate if the networking structure between the system and the web display allows it, or if administrative permissions allow immediate changes. More commonly, changes made to database records will be uploaded to the server managing the website on a nightly or weekly basis or when undertaken by an administrator. Any changes to the information should be reviewed by either the collection information manager or an editorial staff member, who should have final authority to approve the information for dissemination on the website.
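A scheduled upload of this kind typically selects only records that have been approved and only the fields intended for the public. The Python sketch below illustrates that filtering step under stated assumptions: the field names, the approved_for_web flag, and the JSON output format are hypothetical, and the actual mechanism depends on the CMS and the website in use.

```python
import json

# Hypothetical list of fields cleared for public display.
PUBLIC_FIELDS = [
    "accession_number",
    "title",
    "creation_date",
    "maker",
    "medium",
    "credit_line",
]


def export_public_records(records):
    """Return approved records as JSON text, limited to public fields."""
    public = [
        {field: record.get(field, "") for field in PUBLIC_FIELDS}
        for record in records
        if record.get("approved_for_web") is True
    ]
    return json.dumps(public, indent=2)
```

Because confidential fields such as valuations are never listed in PUBLIC_FIELDS, they are left out of the export even for approved records.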

Database-driven online collections may require a CMS that has a module for web-based display. Museums without such systems may hire programmers to build scripts to bridge their collections database to a website. In either case, a web designer will be needed to design the graphical layout of the web pages in which the information will be displayed. If database-driven online collection access is important to the institution (as it is to an increasing number of museums), the web-output capabilities of the system should be evaluated along with other functionalities under consideration at the time of selection. It can be helpful to view the online collections of other museums using the system for that purpose, bearing in mind that the graphical interface is often separate from the functionality of the system itself. Speaking with the staff responsible for bridging the internal system with the website can help clarify issues with the software. Another concern is that, unlike curated exhibitions, database-driven displays present materials without many explicitly defined contextual relationships. Many scholars and members of the public will explore the material and make their own connections, and mediated and unmediated displays of online collections information, used together, can be mutually supporting and enhance public education, sometimes in unexpected ways.

CONCLUSION

As demonstrated through the processes described, museum information management is an intricate process that requires dedicated effort and cooperation among staff across departments. This information is meant as a basic guide to getting up and running with a digitized platform starting from scratch, but even this information only begins to describe what is possible. Further considerations and creative solutions in everything from report design to network security will optimize whatever system an institution chooses to implement. Remember that colleagues in other institutions are often the best resources for guidance in issues big and small and that involvement in a community of like-minded professionals is a priceless addition to the returns on these valuable systems.

RESOURCES

Professional Organizations

The Museum Computer Network¹⁴ (MCN) promotes the development and use of computer technology in the museum community and sponsors an annual conference with workshops.
The Canadian Heritage Information Network¹⁵ (CHIN) offers a variety of services to Canadian museums, including an automated CMS, advice on documentation standards and new technology, and data dictionary standards. CHIN maintains three national databases for Canadian collections that cover humanities and natural sciences objects and archaeological sites. CHIN is primarily concerned with promoting and supporting the development of documentation standards and computerization in the Canadian museum community and to that end has published a number of guides and articles on its website.
The Collections Trust¹⁶ (previously the Museum Documentation Association) promotes the development of documentation standards in the United Kingdom. The Collections Trust publishes SPECTRUM, which outlines the procedures required to provide documentation for museum objects and collections management activities and describes the information needed to support those procedures.