Summary and Reflections

Summary

The word information commonly refers to physical stuff such as bits, books, and other physical media, any physical thing perceived as signifying something: documents, in a broad sense. It is easy to think of information as stuff, but the example of a passport reveals how deeply embedded in social activity that stuff can be.

The growing importance of information derives from the progressive division of labor, which enabled our transition from hunters and gatherers to an increasingly complex society. We depend more and more on cooperation, which means, in practice, dependence on information. This increased dependence is not neutral, because it is used purposefully to advance the many agendas of everyone involved, acting alone, in groups, or through organizations. Strictly speaking, all associations, all societies, depend on collaboration and communication. What is meant by an information society is that the way we live has become increasingly characterized by the use of documents in many forms. Specialized, technical uses of the word “information” that are unrelated to human knowing are outside our present interests.

All living creatures depend for survival on their ability to sense, to make sense, and to react appropriately. Accordingly, communication, providing some expression for others to respond to, is crucial for any collaboration. Humans are different in their exceptional ability in the use of language, the making of images, the display of objects, and the use of tools. Since prehistoric times, four kinds of information technology have become increasingly important: writing, printing, telecommunication, and copying, each fueled by successive engineering advances, including steam, electricity, photography, and, now, digital computing.

The rising tide of documents brought initiatives to organize them, the challenge of knowing what to trust, and imaginative metaphorical language to describe both problems and opportunities.

Ordinarily, documents are graphic records, usually text, created to express some meaning. However, almost anything can be made to serve as a document, such as a leek to express Welsh identity. On a semiotic view, meaning is constructed in the mind of the viewer, so any object might be perceived as signifying something and, in that sense, could be considered a document. So if we hold to the idea of documents being evidence, a wide variety of objects and actions could be regarded as being “documents” in this extended sense. Anything regarded as a document must, in addition to having a physical form, be perceived as signifying something and depend on shared understandings (“cultural codes”). Data sets are a type of document, but the infrastructure making digital data sets accessible for use over time is much less developed than for printed material. The requirements are in principle the same. Scholarly practices, infrastructure, and the field known variously as bibliography, documentation, or information science need modernizing accordingly.

Individuals use documents in varied ways: to learn, to verify, to communicate, to record, to enjoy, and to monitor. Increasingly, our interaction with others is through messages and other documents. How we use them and what we understand from them are integral parts of our culture. We each live in small but complex worlds, and our writing, reading, and understanding all occur within cultural contexts. Even facts need to be understood in context.

The problem of discovering documents we need and of obtaining a copy when needed is handled by making descriptions and forming collections, or marking and parking. Lists are virtual collections. Search depends on assigning descriptions to documents and then matching queries to the descriptions, but describing and querying can be difficult because we draw on the language of the past and on assumptions about the future.

Naming the topics of documents varies by notation (words or codes), vocabulary control (standardized terminology), combinations for complex topics (e.g., venetian blind, blind Venetian), and fineness (how detailed). Describing is a language activity and, since language evolves, descriptions are obsolescent. Because language is cultural, descriptions of sensitive topics may well be contested.

Document descriptions (“metadata”) cover technical, administrative, and topical aspects and help us understand a document’s character and whether or not it is of interest. Descriptions are created by associating descriptive fragments, such as subject headings, with each document. Inverting this relationship, associating documents with subject headings, creates indexes, thereby supporting a second purpose: discovery of documents of any given character. Problems arise from the differences between the many different languages, both natural and artificial (codes and classifications) in use. As a result, we need links that guide us from familiar terms in a familiar language to unfamiliar terms in unfamiliar languages. In some cases, there are dual naming systems, such as place and space in geography, calendar and event in time, and formula and narrative explanation in mathematics, where the two modes can be usefully combined. An important simplifying technique is division into fundamentally different types of concepts (facets) such as who, what, when, and where. Terms of each type can be usefully linked across different languages, but these conceptually different elements are always combined together in real contexts, and there are many opportunities for taking advantage of these complex relationships.
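The inversion of descriptions into an index can be illustrated with a minimal sketch. The document identifiers and subject headings below are invented for illustration, and the function name `invert` is our own; real cataloging systems are far more elaborate.

```python
from collections import defaultdict

# Hypothetical descriptions: each document is associated with subject headings.
descriptions = {
    "doc1": ["passports", "identity"],
    "doc2": ["identity", "documents"],
    "doc3": ["documents"],
}


def invert(descriptions):
    """Turn document -> headings into heading -> documents (an index)."""
    index = defaultdict(list)
    for doc_id, headings in descriptions.items():
        for heading in headings:
            index[heading].append(doc_id)
    # Sort each entry so the index does not depend on input order.
    return {heading: sorted(docs) for heading, docs in index.items()}


index = invert(descriptions)
```

Looking up `index["identity"]` then answers the second question, which documents have this character, where `descriptions["doc1"]` answers the first, what character this document has.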

All selection machinery—for search, discovery, filtering, and retrieval systems—can be viewed as being composed of combinations of just two primitive types: objects (data) and operations on them. There are just two kinds of operations: transforming (deriving modified versions or representations of objects) and arranging or rearranging objects (combining, dividing, sorting, ranking). These two operations, marking and parking, can be described as semantic and syntactic, respectively. More familiar terms would be description and arrangement. Selection systems can be seen as a sequence of one or the other of these two types of operation: the derivation of a modified set of objects or the creation of a different arrangement of them.
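As a rough sketch of this view, a tiny selection system can be written as a sequence of the two operations. The function names `mark` and `park` and the sample titles are assumptions made for illustration only, not part of any actual system.

```python
def mark(objects, transform):
    """Transforming: derive a modified representation of each object."""
    return [transform(obj) for obj in objects]


def park(objects, key):
    """Arranging: place the objects in a different order."""
    return sorted(objects, key=key)


# Hypothetical objects to select from.
titles = ["Printing", "writing", "Copying", "telecommunication"]

# First derive lowercase representations (marking),
# then arrange them alphabetically (parking).
normalized = mark(titles, str.lower)
arranged = park(normalized, key=lambda title: title)

# Dividing the arranged list is itself an arrangement operation.
selected = arranged[:2]
```

Each step either derives new representations or rearranges what is already there; nothing else is needed to express this simple selection.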

The traditional criterion in the evaluation of selection systems is relevance, a very central concept in the field. The idea is that all and only relevant items should be selected, but this simple wish is deeply problematic in several ways. Relevant could be those items wanted or needed by the inquirer, or those that will please or be most useful. However, want, need, please, and useful are not the same, and assessments will be highly subjective. Since a search is presumed to be by someone inadequately informed, assessment is likely to be unreliable. Relevance is highly situational, depending on what the inquirer already knows, and unstable because the inquirer is, or should be, actively learning. Further, the goals of all and only relevant are in conflict, because in practice one can seek to emphasize all (recall) only at the expense of only (precision) and vice versa. Relevance is problematic because documents are not merely physical. Objects are considered documents because they are regarded as evidence of some kind, and it is this subjective aspect that undermines objective, quantitative measurement.
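The conflict between all and only can be shown with a small worked sketch. It assumes, for the sake of arithmetic, that the set of relevant items is known and fixed, which is precisely the assumption questioned above; the document identifiers are invented.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: what fraction of the retrieved items are relevant ("only").
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what fraction of the relevant items were retrieved ("all").
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments, treated here as given.
relevant = {"d1", "d2", "d3", "d4"}

narrow = ["d1", "d2"]  # a cautious search
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]  # a sweeping search

narrow_scores = precision_recall(narrow, relevant)
broad_scores = precision_recall(broad, relevant)
```

In this example the narrow search retrieves only relevant items but misses half of them, while the broad search finds all of them at the cost of retrieving as many unwanted items as wanted ones.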

After this summary, we can add some reflections.

The Past and the Future

In the first chapter, my passport was used to introduce the role of technology, and in chapter 2, we noted how, after prehistoric times, humans moved beyond speech, dance, gesture, and drawing with new lines of technical development: writing, printing, telecommunications, and copying. New tools (steam, electricity, photography, and, now, digital computing) enabled an explosion of communications, records, and documents of many kinds, leading to the rise of an additional line of technical development for finding and selecting the few that are wanted at any point in time from the ever-growing flood. Much of information technology can be seen as a sustained effort to diminish the effects of separation in space and time. We can extrapolate the past and present into the future using the same components and assuming continuing improvements in technology:

Writing, a means for recording speech, is moving steadily toward the recording of everything.
Printing, the multiplication of texts, is evolving into the reproducing of anything.
Telecommunications, in effect the transportation of documents, becomes, with sustained improvement, effectively pervasive simultaneous interaction.
Document copying, because it depends technically on the use of image analysis and enhancement, leads to more than just the making of additional copies. The logical development of document copying is document analysis and representation, including visualization and the analysis of data sets.
Finding and selecting moves steadily toward connecting and relating every record with every other record in an all-embracing web.

All of the above depends on infrastructure, including legal regimes underlying commerce and intellectual property, standardized terminology in metadata, markets, subsidies, and restrictions relating to decency, privacy, security, and other cultural values. So the opportunities for mental engagement with (physical) documents are heavily framed by social forces, with both commercial and governmental organizations strongly motivated to monitor and record what we do.

With the general adoption of digital technology, the kind of technology combination seen in the coupling of photography and printing to create photolithography extends across all varieties of technical development, leading to an environment in which different genres can be woven together into a new and richer tapestry. Projecting these technologies forward, then, leads to a society characterized by ubiquitous recording, pervasive reproduction, simultaneous interaction regardless of geographical distance, more powerful analysis of records, and an absence of privacy. Increasingly, there is a shift from individuals deriving benefits from the use of documents to documentary regimes seeking to influence, control, and benefit from individuals.

Coping: Orality, Literacy, and Documentality

If we accept that these or similar projections are valid, then, looking back from an imagined future into the historical past, what do they imply about how we cope with these developments?

The first case, in which writing extends toward the recording of anything, is of interest because much has been made of the fixity of writing and how it differs from oral discussion. At a time when orality was dominant, and rhetoric, the art of discourse, was central to education, Socrates famously observed in Plato’s Phaedrus that writing was inferior to discussion because writing is inanimate. Writing cannot explain itself or answer questions or correct itself as circumstances change. However, the fixity of writing has also been seen as momentous in providing continuity and consistency across time and space and, thereby, enabling larger and more standardized forms of social organization.

Much has been made of the transition from an oral to a literate culture and how, for example, with the ability to record what we need to remember, mental memory techniques (mnemonics) are used less. This is a simplification. First, the emphasis on orality disregards the important communicative roles of dance, music, and ritual. Second, the effect was additive. Literacy was added to and affected orality, just as digital techniques are affecting writing and speech.

There is more to documents than literacy, because the records that affect us are decreasingly read or acted upon by humans, at least not directly. Commerce and transportation, for example, now depend on communication using printed bar codes. We see them and we know what they are, but we are not able by ourselves to read or interpret them. In the emerging digital environment of bar codes, sensors, and databases, the documents that shape our lives are decreasingly readable by humans. They are decreasingly visible to the human eye.

Although people do and must increasingly use documents, in the last resort they fall back on asking for guidance from friends they trust, suggesting that this is the more basic, primal action. Examples of censorship and resistance to writings can also be seen in this frame if we view, for example, Nazi book burnings as part of the Nazi desire to protect and strengthen culture, as they understood it, from the advance of modernist civilization.

Documents are increasingly machine readable for many different reasons. Electronic, machine-readable records are not humanly legible. Some kind of special rendering or visualization is necessary even for plain text. Machines are programmed to operate on them. In fact we delegate the reading of digital documents to digital technologies. We “read” them vicariously. Mostly, machines operate on them and use our instructions to derive new records from them on a vast scale that we cannot ordinarily follow. This is no longer “literacy” in any meaningful sense, but a new phase of communicating and commemorating, and some new term is needed. We might reasonably refer to a transition from a literate society to a document society, and, if we do, we should remember that the process is additive. Our document society also includes literacy and orality (and dance and drawing and other performances).

What Kind of a Field?

What kind of a field is the study of information? It should by now be clear that discourse in this field is full of figurative and conjectural language: world brain, external memory, relevance, work (as an imputed set of ideas), content, meme, community knowledge, information society, and so on. Only a living creature can know, but it is convenient to refer to documents as recorded knowledge and to machinery or an institution as knowing. This imaginative language has a useful role and is typical of changing fields, but there is also a need for it to be complemented by careful, rigorous analysis if we are to have a clear understanding of information and society.

The study of information is also conjectural. Common examples of conjecture are the use of relevance, the conjectured suitability of a document for some cognitive purpose, and work, when used abstractly for a body of intellectual or artistic achievement distinct from the physical expressions and manifestations of that achievement.

Since all manifestations of information are invariably physical, and all information systems and services are humanly made, information science is an example of what Herb Simon called the sciences of the artificial. At the same time, information, when considered in relation to society, is essentially cultural. The desire to be more scientific, meaning more formal and more quantitative, is often pursued by excluding cultural aspects that resist formal definitions, precise measurement, and logical operations. Formal approaches to “information” are well developed and very useful for many practical purposes. Nevertheless, the restrictive foundation ensures a limited scope. In contrast, we have preferred a more realistic approach by insisting that the study of information be rooted in the process of informing, of becoming informed, of human knowing. Both approaches are valid. They are, however, different.

There is a tension between formal systems of great practical use and the knowledge that these helpful devices depend on making simplifying assumptions that do not in fact fully reflect reality. Such compromise is also true of other fields that deal with human behavior. Economics is an example: the virtuoso methods of microeconomic analysis are very powerful, but they assume a degree of rationality not characteristic of human behavior. Similar tension can be found in linguistics and other social and humanities fields. In a way, this is reassuring because it makes information science emerge as comparable to other well-developed fields of study.

It will be clear from the passport example with which we began and from all that has followed that only an approach that combines the physical, the mental, and the social aspects can be adequate for the challenge of examining the complex relationships of information and society.