In what ways might libraries use blockchain technology in the future? There are many possible applications, ranging from supporting scholarly communication to credentialing, community-based collections, and health information management.
MacKenzie Smith, University Librarian and Vice Provost of Digital Scholarship, University of California at Davis
Blockchain technology has significant implications for the future support of scholarship and scholarly communication—both in terms of the operations and the research activities supported by libraries.
The obvious applications of blockchain technology for libraries include scholarly resources, particularly metadata and digital objects. The fundamental concepts of provenance and authenticity in special collections and archives, including records management, allow the authoritative tracking of ownership and other properties of the collection. Blockchain technology could support broad access to provenance and authenticity metadata about library collections, offering a superior solution to the current fragile, labor-intensive record-keeping workflows. For example, recording sale transactions on a blockchain throughout the lifetime of an item leaves no question about its history and provenance, assuming that the transaction is either native to the blockchain or, if originating off the blockchain, recorded correctly. Recording changes made to an item on a blockchain (e.g., reformatting a digital asset for preservation or amending a database) could make its authenticity simple to verify.
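The idea of recording each sale or change so that an item's history can be checked later can be illustrated with a hash-chained log, in which every entry stores the hash of the entry before it. The following is a minimal single-machine sketch, not a blockchain: it has no network, consensus, or signatures, and the function names, field names, and sample events are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body):
    """Return the SHA-256 hex digest of an entry's canonical JSON form."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_event(chain, item_id, event, actor, date):
    """Append a provenance event linked to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"item_id": item_id, "event": event, "actor": actor,
            "date": date, "prev_hash": prev}
    chain.append({**body, "hash": entry_hash(body)})


def verify_chain(chain):
    """Return True only if every entry is unmodified and correctly linked."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


chain = []
append_event(chain, "MS-0042", "acquired at auction", "Example Library", "1998-05-14")
append_event(chain, "MS-0042", "reformatted for preservation", "Preservation Unit", "2016-11-02")
print(verify_chain(chain))           # True
chain[0]["actor"] = "Someone Else"   # rewrite history after the fact
print(verify_chain(chain))           # False
```

Changing any earlier entry invalidates its stored hash, so the alteration is detectable. As the paragraph above notes, this does nothing to guarantee that an event originating off the ledger was recorded correctly in the first place.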
A related application of blockchain is in research data curation. Current digital asset management systems have various customized methods of tracking digital asset sources and integrity, such as digital hash values to track unintended changes to digital objects. Blockchain distributed ledgers might be ideal for tracking digital objects on a large scale, as well as tracking locations, owners, stewards, and other metadata that should be reliable and traceable over time.
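The hash-based integrity tracking mentioned above is often called a fixity check: a digest is recorded when an object is ingested and recomputed later to detect changes. A minimal sketch follows, assuming SHA-256 as the digest algorithm and using hypothetical function names and sample data; a production system would read files in chunks rather than hold them in memory.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def check_fixity(data: bytes, recorded_digest: str) -> bool:
    """Return True if the data still matches the digest recorded at ingest."""
    return sha256_digest(data) == recorded_digest


original = b"site,date,temp_c\nA,2018-08-01,21.4\n"
recorded = sha256_digest(original)             # stored at ingest time

print(check_fixity(original, recorded))        # True
print(check_fixity(original + b" ", recorded)) # False: one extra byte
```

A distributed ledger would not change this calculation. It would offer a shared place to keep the recorded digests so that no single steward can quietly replace them.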
Similarly, applying blockchain technology to metadata about information resources, from news items to research results, might improve public trust in that information by providing new ways to evaluate sources and changes over time. For example, a website such as Climate Feedback, which is now annotation-based, could use blockchain to sign notations or criticisms by scientists using a ledger-based comment system; the signed notations and criticisms could then easily be inspected by readers to establish the credibility of the annotators.1
Blockchain platforms could support new distributed, large-scale metadata systems, obviating the need for centralized systems like OCLC’s WorldCat, Crossref, or ORCID.2 While such systems are technically possible, the advantages of reforming large-scale metadata systems are unclear, given the inefficiency of blockchain technology and the fact that improved trust and greater decentralization are not top priorities for library and research-related metadata systems. Other new technologies, such as linked data, offer equally interesting alternatives for improving efficiency and increasing decentralization.
Finally, blockchain-based financial systems could be used to purchase scholarly resources. Research libraries buy scholarly resources from all over the world, in every currency, and currency fluctuations can wreak havoc on library budgets. Blockchain-based financial systems such as Bitcoin offer intriguing possibilities for using cryptocurrency to regain control over international financial transactions, between libraries and publishers or among libraries, potentially eliminating exchange rate uncertainty while streamlining acquisition procedures. Achieving this potential would require large-scale change by many stakeholders, but smaller-scale experiments could test the idea, for example, among library consortia that support fee-based interlibrary lending.
Today, many libraries manage research data of all kinds. Research data management involves not only storing and curating digital data, but also organizing that data for discovery, providing data governance, and supporting open scientific workflows across the research life cycle. Outside of libraries, there are many efforts underway to apply blockchain technology to research workflows for improved accountability and reproducibility, such as Orvium, artifacts.ai, and protocols.io, to name just a few.3 These offer the potential for better policy compliance-monitoring by institutions and government funding agencies, and for helping to restore public trust in research. If blockchain catches on with researchers, libraries will need to adapt their data services to the technology. Ideally, libraries would be involved in the design and deployment of blockchain platforms and applications for research, and would add their expertise in archiving, digital preservation, and metadata management.
In the area of scholarly publishing and communications, there are similar efforts underway to apply blockchain to aspects of those activities, such as version tracking, peer review, and content management. If libraries continue to acquire and manage the outcomes of those activities—books, journals, websites, databases, media, and so on—and make these available to scholars over many years, then libraries need to get involved in defining how blockchain technology will be applied to the scholarly record.
While blockchains are distributed and decentralized, the existence of these new systems could lead to even stronger centralized control of information resources. A particular issue for libraries with regard to the adoption of blockchain technology in publishing and scholarly communication is the potential of blockchain to significantly tighten intellectual property controls and DRM. For example, content distribution using smart contracts on the Ethereum blockchain platform could cripple legal tools like fair use and eliminate digital first sale by creating verifiable transaction records that use licenses and transfer-tracking to limit owner rights and eliminate the possibility of rights expiration.
An area of great interest for blockchain advocates is identity management, such as giving individuals control over their personal information, rather than allowing companies like Facebook, Amazon, and Google to own this information. Identity information stored on a blockchain is protected by public-key cryptography, so that only the holder of the matching private key can control it. For researchers who currently have to give information about themselves to a plethora of digital platforms to have an online presence (ResearchGate, Google Scholar, Microsoft Academic, etc.), this notion of regaining control over which platforms get what information is very appealing. But research libraries are all too familiar with professors and students who frequently misplace things, and a misplaced private key may mean permanently losing control of that identity.
Another consideration is the nature of research and knowledge production. As explained earlier, blockchain technology is best suited to transactions where immutability is important; data on the blockchain cannot change. While certain aspects of research and scholarly communication are transactional and immutable, like the records of experimental research, scholarship in general is neither transactional nor immutable. Scholarly research is an evolutionary discovery process that is marked by healthy disagreement and sudden paradigm shifts. Unlike finance and purchasing, science and the humanities are messy, and the blockchain is ill-suited to complex and chaotic data.
Heather McMorrow, Level Pre-College Program Director, Northeastern University; and Amy Jiang, Library Technology Coordinator, University of La Verne
Universities and governments have been the arbiters of academic accreditation and professional certification for centuries. This has been the most functional and expeditious way to operate. However, this closed system has also meant ceding authority over our identities and narrowing what is validated as authentic knowledge at a time when the knowledge economy is only expanding.4 Decentralized applications, such as blockchain, can dramatically improve the issuance and data management of academic credentials. They can also shift the societal view of knowledge and skills verification; return identity and academic sovereignty to individuals; improve accountability; impede corruption; and effectively knock down barriers to economic and social mobility. Decentralized systems place libraries in a unique position to innovate in support of these goals.
Three particular types of blockchain implementations (cryptocurrency, smart contracts, and systems of record) could play a role in education, impacting a range of services and experiences from university credentials to migration and mobility.5
One of the most commonly discussed applications of blockchain in higher education is for traditional university credentials. For example, MIT has implemented a Blockcerts system to issue digital certificates, as well as dual paper and blockchain diplomas to students for its finance degree and its master’s degree in media arts.6 This system gives students the option of receiving a digital diploma in addition to a paper diploma, and because the mobile app gives each student a unique private key, a student can prove ownership of his or her diploma. This system does not host student artifacts, but rather simply provides a unique university artifact, in this case a digital diploma, to each student. In Europe, the Open University and the University of Nicosia have also begun experimenting with Blockcerts.
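One way a ledger can vouch for many diplomas without hosting them is to publish a single Merkle root for a batch and give each graduate a short proof that their diploma belongs to it. The sketch below illustrates that general technique only; it is an assumption made for illustration, not a description of MIT's deployment, and the function names and sample diploma strings are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proofs(leaves):
    """Return (root, proofs) for a non-empty list of byte strings.

    proofs[i] is a list of (sibling_hash, sibling_is_left) pairs that
    leads from leaf i up to the root.
    """
    level = [h(leaf) for leaf in leaves]
    positions = list(range(len(leaves)))   # index of each leaf's ancestor
    proofs = [[] for _ in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])        # duplicate last node on odd levels
        for i, pos in enumerate(positions):
            sibling = pos ^ 1              # neighbour within the same pair
            proofs[i].append((level[sibling], sibling < pos))
            positions[i] = pos // 2
        level = [h(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
    return level[0], proofs


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from a leaf to the root and compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


diplomas = [b"diploma:alice:2018", b"diploma:bob:2018", b"diploma:carol:2018"]
root, proofs = merkle_root_and_proofs(diplomas)   # only `root` would be published

print(verify(diplomas[1], proofs[1], root))             # True
print(verify(b"diploma:mallory:2018", proofs[1], root)) # False
```

Under this design the issuer records one hash per graduating class, and a verifier needs only the diploma, its proof, and the published root.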
Blockchain technology can also be used to recognize lifelong learning activities and to provide credentialing for continuing professional education. Governmental and nongovernmental bodies, individual institutions, and funding organizations have explicit directives to recognize alternative paths to credentialing.7
Blockchain technology has possibilities for developing economies that are disproportionately impacted by extreme poverty, institutional corruption, climate change, natural and man-made disasters, and a lack of infrastructure.8 Even in more developed countries, many people can still encounter similar challenges, depending upon their economic status and access to resources. Blockchain technology provides ways for people to retain access to important information during disruptive migration and mobility experiences.
Looking specifically at one blockchain application in higher education, credentialing, there are several pros and cons to consider. These may vary based on the blockchain platform used (e.g., Ethereum, Quorum, Bitcoin), the type of consensus mechanism (e.g., proof of work, proof of stake), the level of permissions included in the credential data, and whether the system is used in the public or private sector. The pros discussed here include cost effectiveness, accuracy, immutability, and identity sovereignty; the cons include access and interoperability.
One benefit is that the distributed nature and advanced cryptography of blockchain make records stored on it next to impossible to tamper with. The lack of a centrally held database can eliminate the cost of accessing transcripts and professional credentials. Moreover, blockchain can create automated processes that remove the potential for human error or falsification. This could solve the challenge of exaggerated qualifications and fake credentials and contribute to sovereignty over one’s identity. It could allow those who are globally mobile by choice or by necessity (e.g., people who are unemployed, underemployed, or poverty-stricken; employees working or studying internationally; displaced persons; and refugees) to have agency over, and access to, their personal documentation and its distribution. The technology has the potential to reduce dependence on international aid and provide more accurate and authentic data collection, transparency, and accountability for organizations, governments, and regulators.
Blockchain, when used for personal identification, can also provide greater national security by ensuring the swift and accurate identification of individuals, without a central government controlling or accessing such information for nefarious purposes. This could provide a powerful counterargument to protectionist governments and growing global nationalist movements whose citizenry are fearful of accepting immigrants. It would also help ensure that children who are separated from their parents are not lost in a poorly run bureaucracy or exploited by individuals posing as parents or guardians.
However, there are some limitations to blockchain, particularly in terms of access. Access to technology, from reliable electricity to the Internet, is still economically disparate: hardware can be difficult to come by, and electricity can be spotty. In addition, the issuance and acceptance of blockchain credentials as part of an application packet for college or for a job are still limited.
Another issue relates to interoperability and standards. Universities run multiple platforms for information storage and sharing. It is not yet known how the distributed nature of blockchain will affect or cooperate with existing systems. There is also no global consortium, comparable to the W3C for the web, that establishes standards and practices or manages updates for blockchain technology.9 Most blockchain implementations have their own consortium.
Libraries need to consider how blockchain implementations would fit into the other IT systems provided to users. It is also important to coordinate blockchain efforts across libraries and institutions; one-off solutions are unlikely to succeed. One promising possibility is for libraries to use blockchain technology to credential the mastery of information literacy units. Libraries could also educate community members about how blockchain technology can help them achieve self-sovereign identity.
Timothy A. Thompson, Discovery Metadata Librarian, Yale University Library
Authority control, controlled vocabularies, and the intellectual and physical control of collections: the idea of control is part of a professional ethos that tends to privilege centralization and uniformity in the organization of information. Centralization on the level of organization and representation maps to centralization on the level of maintenance and exchange: libraries rely on gatekeepers and centralized services to manage knowledge organization systems such as the Library of Congress Subject Headings and related authority files. Ultimately, questions of control are questions of trust: centralized systems and workflows allow libraries to define boundaries within which community norms and standards can be enforced. At the same time, they are exclusionary and limit participation to a set of authorized contributors. Blockchain technology has the potential to reconfigure the relations of information exchange among libraries and to shift the locus of trust from centralized services to distributed systems built on the premise of peer-to-peer interactions.
As an illustration, consider the example of the NACO-México project, which began in 2003 as a cooperative effort among academic libraries in Mexico to contribute records to the Name Authority Cooperative Program (NACO), maintained by the Program for Cooperative Cataloging and the Library of Congress (LOC).10 Currently, an institution that wishes to contribute to the Name Authority File must do so through one of two authorized gateways, OCLC or SkyRiver, both of which charge a membership fee.11 In 2011, participants in NACO-México were compelled to move from OCLC to SkyRiver because they could no longer afford the cost of OCLC membership. Subsequently, the project dealt with discrepancies in its contribution statistics as recorded by the LOC, which has reported contribution totals that were lower than the project’s internal numbers. If NACO itself were to run on a blockchain network rather than as a centralized service, membership fees might be eliminated or replaced by marginal transaction fees, making participation affordable for a broader range of international contributors. Discrepancies in contribution statistics could be eliminated through reference to the blockchain as a permanent, time-stamped record of transactions. Individual libraries might sell or purchase data on demand from peer contributors, thereby decreasing local workloads and creating new opportunities for international collaboration.
For several years, academic libraries have been discussing and undertaking efforts to migrate from legacy metadata formats to linked open data. This transition has been a difficult one, however. Although work developing semantic ontologies such as BIBFRAME has allowed the library community to examine and reconsider some of its fundamental data models, the implementation of linked open data in academic libraries has been impeded by the absence of an underlying computational architecture that can support new models for data-sharing and production.12 A major selling point of linked open data is its support for internationalization and integration with the wider web of data. However, without an open, distributed market for the exchange of data, centralized bottlenecks will continue to undermine attempts at systemic change.
Many information professionals have been skeptical of blockchain applications for the cultural heritage domain, echoing the standard critique of blockchains in general: that blockchain applications have been peddled as a panacea when in reality these applications are little more than inefficient databases that offer limited functionality in exchange for troubling amounts of energy (in Proof-of-Work systems) or wealth (in Proof-of-Stake systems).13 Even leading Bitcoin advocates such as Jimmy Song have argued that the constraints imposed by blockchain technology make it appropriate for only a very limited set of applications (in his view, currency and the exchange of value).14 For individuals and organizations that are investigating blockchains as a technical solution, it is important from the outset to establish a framework for evaluating their applicability and appropriateness.15 On its own terms, a blockchain is simply a means to an end, one possible approach to achieving consensus among nodes in a distributed network—and doing so even in the presence of arbitrary system failure or malicious behavior (the so-called Byzantine Generals’ Problem in computer science). The question for libraries is whether participation in decentralized networks is desirable for a given use case or can help further their core mission in ways that may not have been possible before.
For example, the problem of entity resolution, also known as record linkage or data-matching, is one that has a direct impact on the work of information professionals in academic and research libraries. In library units responsible for catalog management, many workflows center on the procedure known as copy cataloging, which aims to expedite the processing of new acquisitions. Copy cataloging involves searching a shared database for records that were created by another cataloging agency and that describe the same publications acquired by one’s local institution. In the current environment, the global exchange of library catalog data is controlled in large part by OCLC. Although OCLC provides data aggregation and storage services that allow libraries to share their data, this vendor-driven paradigm entails the acceptance of a business model that, in effect, charges libraries for serving their own data back to them, albeit enhanced with different forms of added value.
Libraries have a tradition of experience with data-matching and automation, but now stand to benefit from the increasingly mainstream availability of algorithms and routines developed within the context of data science and machine learning. Sophisticated algorithms for string comparison and probabilistic record linkage have long been available, but are not widely used by libraries, with notable exceptions such as the Virtual International Authority File,16 which is itself a project of OCLC. As machine learning tools and methods have become more accessible, large-scale, real-time access to library metadata has not necessarily followed suit. The catalog of a large academic library may contain several million records. By comparison, as of August 2018, the OCLC catalog database, WorldCat, contained 427,501,671 bibliographic records in 491 languages.17 As long as central hubs or service providers maintain control over the aggregated metadata of research libraries, the large-scale computational analysis and utilization of this data will remain out of reach for most.
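To make the data-matching problem concrete, the sketch below scores candidate catalog records against a local record using weighted string similarity from Python's standard library. This is a deliberately simple stand-in, not a full probabilistic record-linkage model, and the field names, weights, and sample records are assumptions chosen for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical field weights; a real system would tune or learn these.
WEIGHTS = {"title": 0.6, "author": 0.3, "year": 0.1}


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.sub(r"[^\w\s]", " ", no_accents).split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(rec_a, rec_b, weights=WEIGHTS):
    """Weighted similarity over the fields present in both records."""
    total, used = 0.0, 0.0
    for field, weight in weights.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if not a or not b:
            continue                      # skip fields missing from either side
        total += weight * similarity(str(a), str(b))
        used += weight
    return total / used if used else 0.0


local = {"title": "Cien años de soledad",
         "author": "García Márquez, Gabriel", "year": "1967"}
candidates = [
    {"id": "r1", "title": "Cien anos de soledad.",
     "author": "Garcia Marquez, Gabriel", "year": "1967"},
    {"id": "r2", "title": "El amor en los tiempos del cólera",
     "author": "García Márquez, Gabriel", "year": "1985"},
]

best = max(candidates, key=lambda c: match_score(local, c))
print(best["id"])   # r1
```

In a copy-cataloging workflow, a score like this would typically be compared against a threshold, with borderline matches routed to a cataloger for review.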
Of course, when discussing decentralization, there is a range of new technologies that should be considered. Blockchain may or may not be the most appropriate one for a particular use case, or it may need to be used in conjunction with other technologies in order to enable decentralized exchange. Several efforts are underway to develop systems for decentralized file storage using distributed hash tables, one of the most prominent being the InterPlanetary File System (IPFS). In a way similar to the version control system Git, IPFS uses hash values to capture the state of a file at a particular point in time and then serves it on a peer-to-peer network. IPFS hashes might be referenced as links in blockchain transactions in order to decouple the storage layer from the accounting layer.18
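The decoupling of storage from accounting can be sketched with a toy content-addressed store, in which each file is retrieved by the hash of its contents and a ledger entry records only that hash. This is not the IPFS API: real IPFS uses multihash-based content identifiers and splits files into chunks, and the class, field names, and sample records below are hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: each blob's key is its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content address."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        """Fetch data and confirm it still matches its address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data


store = ContentStore()
v1 = store.put(b"<record><title>Draft title</title></record>")
v2 = store.put(b"<record><title>Final title</title></record>")

# The (hypothetical) ledger entry stays small: it carries addresses, not files.
ledger_entry = {"asset": "record-001", "content_address": v2, "previous": v1}
print(store.get(ledger_entry["content_address"]).decode())
```

Because an address is derived from the content, any change to a file produces a new address, which is what lets a ledger reference a specific version without storing it.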
Technologies and protocols for distributed systems, such as blockchains and distributed hash tables, could allow research libraries to form robust peer-to-peer networks that would enable data-sharing on both macro and micro levels. Public blockchains such as Ethereum and Bitcoin are severely limited in the amount of data that can feasibly be stored on the chain, but alternative blockchain platforms that address this limitation have recently been developed. For example, the blockchain-based database service BigchainDB offers a robust data-storage solution while ensuring Byzantine fault tolerance and providing blockchain features such as immutability and an asset-based transactional model.19 By running a “consortium blockchain” network using a system like BigchainDB, academic libraries could be empowered to move away from centralized models and begin managing their data collectively.20 Instead of paying a centralized metadata hub to distribute their catalog records, libraries could use blockchain technology to share their metadata—whether in batch or as discrete bits—on a peer-to-peer exchange. Many blockchain systems support the creation of so-called smart assets, a term used prominently by the New Economy Movement project.21 Smart assets can be modeled as nonfungible tokens that represent an object on a blockchain and allow it to be exchanged. Metadata professionals are familiar with the concept of the record or descriptive unit as a surrogate for a real object. As smart assets, metadata objects themselves could be tokenized and represented at the appropriate level of granularity, whether as linked data statements (triples) or as record-like objects. By providing an international peer-to-peer marketplace, blockchain networks could facilitate the free flow of library metadata, potentially creating new revenue streams for individual libraries and replacing some of the costly subscription services that currently predominate.
Annie Norman, State Librarian of Delaware and Director, Division of Libraries
Blockchain may help solve libraries’ data challenges. Effectively quantifying and showcasing the value of libraries, especially public libraries, has been a perpetual problem that impacts funding. Blockchain may help public libraries measure their performance and outcomes at scale. Libraries are an open book and share willingly among themselves, but the sharing is not as efficient and effective as it could be, since library services and data are segregated into silos owned by various vendors and governing bodies. Meanwhile, businesses such as Google and Amazon are innovating more aggressively than most libraries. These businesses have a single-entity advantage that enables monitoring of the entire value chain for customers and all the processes of their businesses, at scale. To leap beyond traditional data management, libraries and library vendors need to collaborate in a much more seamless and transparent way.
Currently, the Chief Officers of State Library Agencies’ “Measures That Matter” project is marshalling efforts with the Institute of Museum and Library Services (IMLS) and other stakeholders to assess and resolve public libraries’ data challenges. New technological solutions such as a distributed ledger could provide the necessary authentication and privacy capabilities for public libraries to support services such as a universal library card, a union catalog with provenance and ownership verification, data-sharing, and more.22
Public libraries typically collect and analyze data using existing library technologies. Blockchain’s distributed nature and privacy capabilities may provide libraries with the potential for faster experimentation and more effective and seamless data collection solutions. Here are some ways that Delaware libraries are thinking about using blockchain to address their data collection and analysis needs.
Could a user-facing blockchain-based system help libraries better collect and understand data about patron needs, about what patrons are trying to do, and, ultimately, how libraries might provide services to meet those needs?
Traditional library statistics, such as totals of circulation, program attendance, and reference questions, yield data about the user in the life of the library. Libraries also need new ways of looking at data from the patrons’ viewpoint and showing how patrons are using library services while still respecting patron privacy. Delaware libraries use two macro-organizers to encompass all potential services and community needs, a Dewey/Maslow framework (see figure 2). The Dewey Decimal Classification System (x axis) is the installed base for Delaware libraries’ collections, and the same taxonomy is used to align circulation with program attendance and reference questions by subject. A modified version of Maslow’s hierarchy of needs (y axis) organizes all needs that libraries support, from basic needs to transformational ones. The Dewey/Maslow framework addresses all disciplines and subjects, includes all library and Delaware partner services, and can be used in a variety of ways for planning and assessment.23
Data alignment may be easier to manage using blockchain solutions. By strategically measuring library performance across services, it would be possible to understand how libraries contribute to improving a variety of community indicators. Consequently, libraries could become even more effective and influential in helping communities to evolve and transform.
What are the cumulative benefits of reading books, attending one-hour programs, and so on, over a year, and over a lifetime?
In Delaware, a method was developed for individuals to capture the benefits that accrue from books read, programs attended, and so on. Tips, tools, and techniques are provided, and the public is encouraged to track their reading and learning over time.24 Tracking enables members of the public to enhance their learning and quantify the benefits they have received from libraries. A blockchain alternative may help ensure secure and sustainable patron access over the long term.
What proportion of the population do libraries actually serve?
In order to truly understand library performance, the population’s use of a library should be analyzed in relation to community development. The stretch target, or capacity measure, for a library is to serve the entire population (or close to it), and the library landscape should be seamless across school, public, and academic libraries. Data gathered for this measure may be easier to manage with blockchain technology, as noted above.
A holistic view of performance is needed to better understand the cause and effect of library services, which is what funders want to know. Transparency of inputs, outputs, outcomes, and community impact at scale across all libraries would foster a clearer understanding of library performance and ways to improve it. Library leaders are responsible for creating a transparent library infrastructure, and would benefit from working together to implement blockchain more seamlessly to provide evidence of the public library’s contributions and value to society.
Victoria Lemieux, Associate Professor of Archival Science, University of British Columbia
Part of the growing interest in applying blockchain technology to the health sector concerns its potential as a means of giving an individual direct control over access to her medical records and over consent to secondary use of her health data for research purposes.25
A case study undertaken as part of the University of British Columbia’s “Records in the Chain” Project focused on a blockchain prototype developed collaboratively by the Prevention of Organ Failure Centre of Excellence and the Deloitte accounting firm; a member of the Records in the Chain Project was embedded in that collaboration as an observer.26 The purpose of the blockchain prototype was to manage users’ consent to the use of their clinical data for precision health research. The prototype was built to make the enrollment of study participants more efficient, to eliminate the need for the researcher as an intermediary between study participants and clinical sites in the consent management process, and to provide study participants with greater transparency about the usage of their personal data.
The Deloitte proof of concept (PoC) used the blockchain to build a single decentralized, disintermediated system to serve as an interface between participants, researchers, and hospitals. The system allows participants to enroll and consent to the use of their personal data through a web portal, and access time-stamped audit trails of their interactions with the system. This system allows researchers to create studies and invite participants to contribute personal data to them, and also allows researchers to request patient data from other institutions within the system. The data-sharing user journey (see figure 3) shows how the system integrates and coordinates the steps and participants in the previously manual process of researchers requesting data from other institutions.
The solution used a Nuco Ethereum private blockchain, which is an extension of Ethereum.27 Ethereum is an open-source blockchain protocol suite that was originally designed as an alternative to the Bitcoin blockchain platform. The solution stack incorporated a custom front-end graphical user interface and an Amazon S3 file server, with Amazon Web Services. Smart contracts developed in Solidity controlled the workflow.28
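The prototype's contracts were written in Solidity and are not reproduced in this report. As a rough illustration of the workflow described above (participants grant or revoke consent, researchers request data, and every interaction leaves a time-stamped audit record), the following Python model may help. All class, method, and identifier names are hypothetical, and an in-memory list stands in for the ledger.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsentRegistry:
    """In-memory model of a consent workflow with an append-only audit log."""
    consents: dict = field(default_factory=dict)    # (participant, study) -> bool
    audit_log: list = field(default_factory=list)   # time-stamped event records

    def _log(self, actor, action, study, outcome):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor, "action": action,
            "study": study, "outcome": outcome,
        })

    def set_consent(self, participant, study, granted):
        """Record a participant's decision to grant or revoke consent."""
        self.consents[(participant, study)] = granted
        self._log(participant, "grant" if granted else "revoke", study, "ok")

    def request_data(self, researcher, participant, study):
        """Return True only if the participant currently consents."""
        allowed = self.consents.get((participant, study), False)
        self._log(researcher, "request:" + participant, study,
                  "allowed" if allowed else "denied")
        return allowed

    def history_for(self, participant):
        """Audit entries a participant could review about their own data."""
        return [e for e in self.audit_log
                if e["actor"] == participant
                or e["action"] == "request:" + participant]


registry = ConsentRegistry()
registry.set_consent("p-001", "study-A", True)
print(registry.request_data("dr-lee", "p-001", "study-A"))   # True
registry.set_consent("p-001", "study-A", False)
print(registry.request_data("dr-lee", "p-001", "study-A"))   # False
print(len(registry.history_for("p-001")))                    # 4
```

On a blockchain, the consent map and the log would live in contract state and transaction history rather than in one process, which is what removes the researcher as the intermediary who keeps the records.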
Blockchain-based user consent management of health data has the potential to solve inefficiencies in the health data consent management process, and this could enable more rapid advances in precision medicine to the benefit of all. It can also support more transparency for end users about the consent process and how their personal data are being used.
On the other hand, reliance on blockchain technology for this purpose also raises several critical issues. The first of these is privacy and the protection of personally identifiable information (PII). A number of jurisdictions have passed legislation regulating the processing of personal data, based on the principles of lawfulness, fairness, transparency, purpose limitation, data minimization, accuracy, storage limitation, integrity, and confidentiality.29 Beyond these laws, there are a number of principles of privacy protection and the handling of PII that should be respected, such as those established by the International Association of Privacy Professionals’ Principles of Fair Practice.30
How these laws and principles and their specific implementations can be addressed in blockchain solutions is an open question, as yet. It is unclear, for example, how PII that has been stored on a blockchain would be removed if an individual were to make the request under the “right to be forgotten” provisions of the GDPR. With data breaches on the rise globally, and many of these targeting health records, security is also a major concern. Blockchains, like all digital record-keeping systems, rely upon software and computing technology that can have vulnerabilities due to poorly written code, poor system design, backdoors, and so on, which means that, despite the use of encryption in blockchain systems, information could be at risk.31
Another area of concern is usability. While it may be true that users want more control over the custody of their health data and more transparency when it comes to consent to the use of their data, the added complexity of using an unfamiliar and emerging technology may be asking too much of end users. Some blockchain systems rely on users to remember their private keys and keep them secure, and have no key recovery service in the event that a private key is lost. There have been several reported cases of even very sophisticated users of the Bitcoin blockchain losing their private keys and, consequently, access to thousands of dollars’ worth of cryptocurrency.32 What might happen if users lose the private keys that control access to their health data?
Another issue is the legal status of smart contracts. See Blackaby.
These are but a few of the issues that may arise when blockchain technology is applied to health information management. Though these issues are challenging, there are potential solutions or ways to mitigate the risks, as discussed in the following subsection.
Steps that might be taken to mitigate the downside of using blockchains to manage health information include:
Frank Cervone, Executive Director for Information Services, School of Public Health, University of Illinois at Chicago
Within the realm of health information management, both proposed and currently operational blockchain projects provide clues for useful applications in libraries and information organizations. This should not be too surprising, given the close historical relationship between medical information management and librarianship. Within the health care field, blockchain continues to rapidly move from theoretical discussions to specific applications impacting areas ranging from pharmacology and medical device supply chains to the recruitment of patients for clinical trials, and to improving the security and interoperability of medical devices. In the realm of health care, blockchain is now a reality.
Applying blockchain technology to health information management is now possible because of the standards for medical information interoperability, which are conceptually equivalent to those in the library and information science fields. For example, Fast Healthcare Interoperability Resources (FHIR) is a developing standard that defines data formats and elements, along with providing publicly accessible application programming interfaces for the purpose of exchanging electronic health records (EHR). This standard is the health care equivalent of a completely normalized machine-readable cataloging system with predefined access interfaces. FHIR offers the potential to extend EHRs outside the constructs of traditional electronic health care systems to mobile and cloud-based applications, medical device integration, and flexible or customized health care workflows.35
Several current, real-world applications of blockchain technology in health information management are providing baseline functionality. Provider directory services is a joint project between Optum, Quest, and Humana to provide common, distributed health plan provider directories. This is a mega-directory of health care professionals within specific health care systems and a good example of an application that is trying to address the problem of managing multiple sources of truth through reconciliation between those sources.36 Validation of patient identities is a project by the government of Estonia to create a blockchain-based framework to validate patient identities.37 All citizens are issued a smartcard, which links each individual’s EHR data with his or her blockchain-based identity. In light of recent concerns about scheduling fraud at the U.S. Department of Veterans Affairs and the risk for data manipulation of implantable medical devices, such as pacemakers, this type of system has several potential benefits to guarantee that any modifications to an individual’s health care record are secure and auditable.
MedRec is a project between the MIT Media Lab and the Beth Israel Deaconess Medical Center that provides a decentralized platform for managing permissions, authorization, and data-sharing between health care systems.38 A Data Provenance Toolkit is being developed by RAIN Live Oak technology to support the creation and validation of provenance records.39 While focused on health technology, this application could be useful in a number of other settings, including library and information science.
A major benefit of blockchain for EHRs is the immutable audit trail: once a transaction is committed, it cannot be changed, which guarantees both the integrity and the provenance of the transaction data. Records are signed by the source, which allows the legitimacy of records to be verified and false records to be plausibly denied. Furthermore, security and privacy are increased because data is encrypted in the blockchain and can only be decrypted using the patient’s private decryption key. Therefore, even if the network is infiltrated by a malicious party, there is no feasible way for them to read patient data.
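The hash-linking idea behind such an audit trail can be sketched in a few lines of Python. This is a minimal illustration, not the design of any system named above: the function names and record fields are hypothetical, and the sketch leaves out consensus, digital signatures, and encryption. It shows only why a change to an earlier entry becomes detectable.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    """Hash an entry's canonical JSON form with SHA-256."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(chain: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = {"index": len(chain), "record": record, "prev_hash": prev}
    entry = dict(body, hash=entry_hash(body))
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each link to the previous entry."""
    prev = "0" * 64
    for i, entry in enumerate(chain):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["index"] != i or entry["prev_hash"] != prev:
            return False
        if entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


# Hypothetical audit events for one patient record.
audit_log = []
append_entry(audit_log, {"patient": "p-001", "event": "record created"})
append_entry(audit_log, {"patient": "p-001", "event": "allergy added"})
intact = verify_chain(audit_log)

# Altering an earlier entry breaks verification of the copy.
tampered = json.loads(json.dumps(audit_log))  # deep copy via JSON round trip
tampered[0]["record"]["event"] = "record deleted"
tamper_detected = not verify_chain(tampered)
```

In a single-machine log like this one, whoever holds the file could rewrite every hash. It is the distribution of copies across independent parties that makes the trail hard to alter in practice.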
Blockchain presents both benefits and challenges in health care. For example, blockchain technology will allow health data to be collected over the life span of an individual. The benefit to the patient is a complete medical record which accurately details that individual’s entire health history. A patient could then make this data available to any health care provider, and individual health outcomes should be improved. Furthermore, the agglomerated data could also better support research as a rich base of information to be analyzed in order to help predict future health concerns at the population level.
A critical factor in this approach is that the patient would have control over who has access to his data. The patient could then provide access at varying levels, ranging from full access to all of his health records, to just a limited subset based on the application.40 For example, a patient visiting his dentist could then restrict his health data to just those aspects that would be relevant for dental practice. Similarly, patients participating in research, such as drug trials, can limit the amount and type of their health data that is made available to researchers, as well as what personal information the researchers can subsequently share with others. This same type of disclosure scheme could be applied within bibliographic systems to allow the patron to control his personal information as appropriate to the context, such as opting out of sharing circulation or electronic journal access information from recommender systems.
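A minimal Python sketch of this kind of scoped disclosure follows. The `ConsentRegistry` class, the category names, and the recipients are all hypothetical, chosen only to illustrate the idea. A blockchain-based system would record grants and revocations as transactions rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    """Hypothetical registry of owner-controlled grants, by record category."""
    grants: dict = field(default_factory=dict)  # recipient -> set of categories

    def grant(self, recipient: str, categories: set) -> None:
        """Allow a recipient to see the given categories."""
        self.grants.setdefault(recipient, set()).update(categories)

    def revoke(self, recipient: str, categories: set) -> None:
        """Withdraw access to the given categories."""
        self.grants.get(recipient, set()).difference_update(categories)

    def visible(self, recipient: str, record: dict) -> dict:
        """Return only the parts of the record the recipient may see."""
        allowed = self.grants.get(recipient, set())
        return {k: v for k, v in record.items() if k in allowed}


# Illustrative record with made-up categories.
record = {
    "dental": "two fillings, 2017",
    "cardiology": "normal ECG, 2018",
    "medications": "none",
}

consent = ConsentRegistry()
consent.grant("dentist", {"dental", "medications"})
dentist_view = consent.visible("dentist", record)

consent.revoke("dentist", {"medications"})
dentist_view_after = consent.visible("dentist", record)
stranger_view = consent.visible("unknown-clinic", record)
```

The same pattern would apply to a patron record, with categories such as circulation history or electronic journal access in place of the medical ones.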
Currently, much work is being done on creating an environment where a common database of patient information could be built using blockchain technology. The same idea could be applied to libraries and information organizations: a common patron database would enable universal access across library systems.
Clearly, the basic architectural features of blockchain, such as the immutability of transaction logs, are of benefit to EHRs. This same functionality of blockchain applies to libraries as well, such as the ability to demonstrate provenance; however, this also poses issues where data needs to be forgotten, such as with circulation and access information. Nonetheless, current applications like health data exchanges can be examples of creative initiatives to share common data, such as using health data reporting models as a basis for a common library reporting system to gather data for advocacy and efficacy initiatives.
John Bracken, Executive Director, Digital Public Library of America; and Michael Della Bitta, Director of Technology, Digital Public Library of America
As various entities across sectors begin to explore the uses and implications of blockchain technology, libraries have a critical role to play. First, the public trust in libraries, a trust that is sadly diminishing for so many other civic institutions, positions them to advance the public good by building and advocating for new tools, services, and approaches. Second, the inherent networked and cryptographically authenticated nature of the blockchain, and its reliance on open source, plays to the library’s strength as a nurturer of community. Third, the skills of the library’s information technologists who build systems enable libraries to build and contribute to blockchain technology, placing the library in a position to guide the crafting of blockchains for the public good.
Libraries are still in the early days of distributed digital media, and thus in the first moments of being able to imagine the possibilities of blockchain technology. In this period of exploration and experimentation, one of the library’s challenges is to look beyond the blockchain’s function as a medium of data storage.
Instead, information professionals need to interrogate how the distributed network of disparate parties with common interests that powers a blockchain system can also be employed for the common good. By creating networks in which libraries and members of their communities—or civic society more broadly—can work together to record and verify knowledge, there is the potential to build new, dynamic, trusted sources for public information.
The move to digital-first media and platforms makes verifiable archives difficult because it is very easy to introduce alterations into the archived data. As more and more essential communications and publications are born-digital, it is imperative that archival practices scale to match the new problems these media and platforms introduce.
It would be easy to assume that the practices that grew out of the movement to archive digitally captured physical media are applicable to born-digital media. However, the landscape of born-digital media is changing rapidly, and with these changes come new forms of attack. There are established practices for using cryptographic signatures to ensure against the authoritarian tampering of archival materials, but there are problems when this approach happens under a single business domain.
In contrast to the beginning days of the Internet, storage and bandwidth are now extraordinarily inexpensive. The cost to store a gigabyte has fallen from the near-million-dollar range in the early 1980s to single-digit cents in this decade.41 And the costs to transmit a gigabyte of data have also declined proportionally. Along with these declining costs, the variety of channels used to store files has increased.
The fixity of digitally captured physical objects was originally conceived of as a way to protect against file corruption, and not one to counter deliberate forgery or attack by a hostile party. Since digital surrogates are commonly backstopped by a physical copy of the object, it would be easy to expose a campaign to forge digital copies of an artifact by referring back to the original. However, in the case of a born-digital object, it is hard to identify an original using intrinsic characteristics.
Given these realities, one attack on born-digital content archives would flood channels with subtly changed copies of the archival content. Since it is hard to authenticate which copy of the file is “original,” and traditional fixity measures are maintained by the same administration that houses the content, it would be impossible to determine which copy is real beyond relying on the reputation of the archival institution.
In contrast, a digital archival system that records auditable, independent observations of the source material to prove that they are copies of the source eliminates the problems of objects being manipulated during or after ingest. Blockchains are an ideal means for recording these observations because they are decentralized and cryptographically secure, which protects against tampering by a single institution and provides an audit trail. There is no need for trust in a third-party authority.
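The idea of independent observations can be sketched as follows. This is an illustration under stated assumptions, not a description of an existing system: the observer names, the item identifier, and the quorum rule are hypothetical, and a plain Python list stands in for the blockchain on which the observations would be recorded.

```python
import hashlib
from collections import Counter


def fixity(content: bytes) -> str:
    """SHA-256 digest used as the fixity value of a piece of content."""
    return hashlib.sha256(content).hexdigest()


def record_observation(ledger: list, observer: str, item_id: str, content: bytes) -> None:
    """An observer records the digest of the copy it saw at the source."""
    ledger.append({"observer": observer, "item": item_id, "digest": fixity(content)})


def attested_digest(ledger: list, item_id: str, quorum: int):
    """Return the digest attested by at least `quorum` distinct observers, else None."""
    seen = {(e["observer"], e["digest"]) for e in ledger if e["item"] == item_id}
    counts = Counter(digest for _, digest in seen)
    if not counts:
        return None
    digest, votes = counts.most_common(1)[0]
    return digest if votes >= quorum else None


ledger = []  # stand-in for the shared, append-only ledger
original = b"Council minutes, 2 July 2018: motion carried."
altered = b"Council minutes, 2 July 2018: motion failed."

for observer in ("library-a", "university-b", "archive-c"):
    record_observation(ledger, observer, "minutes-0702", original)
record_observation(ledger, "unknown-party", "minutes-0702", altered)

trusted = attested_digest(ledger, "minutes-0702", quorum=3)
original_matches = fixity(original) == trusted
altered_matches = fixity(altered) == trusted
```

A copy is accepted here only if its digest matches the one that enough distinct observers recorded, so a subtly altered copy fails the check however widely it circulates.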
Blockchains help establish trust by sidestepping organizational reputation issues through the use of overlapping data structures, decentralization, and cryptography. However, a cornerstone of this approach is that the information being independently verified must be observable by multiple disparate parties. This is possible in cases where the archival material is public or at least accessible by different types of institutions.
Unfortunately, not all archives work in this manner. Records are routinely archived and only made available for viewing at a later date. Using a blockchain solution in this situation would require that those records at least be shared with independent third parties. Distributed trust would be weakened in this scenario, since the authority to select which of these third parties are allowed to verify the archival material could be viewed as self-serving. An overarching system of selecting parties for independent verification could be established, but that creates other trust issues. Also, storing the archival media on the blockchain may be infeasible, given bandwidth and storage considerations. However, a system that stores the archival content in a separate storage medium, and merely records the fixity information on the blockchain, will scale more easily.
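A rough sketch of that last arrangement, with the content kept in separate storage and only fixity information written to the ledger, might look like this. The dictionary and list below are stand-ins for a real storage service and a real blockchain, and all names are hypothetical.

```python
import hashlib

content_store = {}   # stands in for separate (off-chain) storage
fixity_ledger = []   # stands in for the blockchain; holds digests only


def ingest(item_id: str, content: bytes) -> str:
    """Store the content off-chain and record only its digest and size."""
    digest = hashlib.sha256(content).hexdigest()
    content_store[item_id] = content
    fixity_ledger.append({"item": item_id, "sha256": digest, "size": len(content)})
    return digest


def check_fixity(item_id: str) -> bool:
    """Compare the stored content with the first digest recorded for the item."""
    recorded = next(e for e in fixity_ledger if e["item"] == item_id)
    current = hashlib.sha256(content_store[item_id]).hexdigest()
    return current == recorded["sha256"]


# A one-megabyte object produces a ledger entry of well under a kilobyte.
ingest("oral-history-17", b"A" * 1_000_000)
ok_before = check_fixity("oral-history-17")

# Simulate a silent change to the stored copy.
content_store["oral-history-17"] = b"A" * 999_999 + b"B"
ok_after = check_fixity("oral-history-17")
```

The ledger entry stays the same size however large the archival object is, which is why this arrangement scales more easily than storing the media itself on the blockchain.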
Ideally, a proof-of-concept implementation of this idea would be undertaken by a small number of institutions that have the means to build shared library technological infrastructure and could create a sponsoring organization to author standards, evangelize the new technology, and shepherd its growth among institutions that are interested in decentralized archiving. The Digital Public Library of America (DPLA), which was founded as an open, distributed network, can help to facilitate these connections and conversations.42 In fact, as information professionals explore this new topic together, the process that led to the founding of the DPLA could provide lessons. The DPLA brought together diverse stakeholders inspired by an idea, and through an intentional, national collaborative process, they iterated on that concept to build a plan of action.
M. Ryan Hess, Digital Initiatives Manager, Palo Alto City Library
The blockchain opens up the possibility of extending library-like services beyond the library walls. In leveraging blockchain’s facility for establishing trust between total strangers, libraries could deploy a community sharing platform, wherein community members can exchange resources. These resources could be anything from books, objects, and tools to services and know-how. The blockchain would serve as a ledger of transactions between individuals, with smart contracts utilized to govern loan periods and other borrowing policies. This community-based collection solution could empower individuals to participate in a peer-to-peer lending service.
The community-based collections platform could be composed of five major elements: the blockchain ledger, smart contracts, application, participants, and collections.
The blockchain ledger can be used for a number of critical services, including providing an inventory of resources that participants have made available, a record of transfers of resources between participants, and a balance sheet of participant accounts.
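To make these three services concrete, here is a small Python sketch of an append-only ledger from which an inventory, a transfer history, and a balance sheet can be derived. The `SharingLedger` class and its method names are hypothetical. The balance sheet shown (items lent versus items borrowed) is only one possible interpretation of participant accounts.

```python
from collections import defaultdict


class SharingLedger:
    """Hypothetical append-only ledger of listings and transfers."""

    def __init__(self):
        self.entries = []

    def list_resource(self, owner: str, resource_id: str) -> None:
        """Record that an owner has made a resource available."""
        self.entries.append({"type": "list", "owner": owner, "resource": resource_id})

    def transfer(self, resource_id: str, sender: str, receiver: str) -> None:
        """Record a transfer, allowed only from the current holder."""
        if self.holder(resource_id) != sender:
            raise ValueError(f"{sender} does not hold {resource_id}")
        self.entries.append(
            {"type": "transfer", "resource": resource_id, "from": sender, "to": receiver}
        )

    def holder(self, resource_id: str):
        """Replay the entries to find who currently holds a resource."""
        current = None
        for e in self.entries:
            if e["resource"] == resource_id:
                current = e["owner"] if e["type"] == "list" else e["to"]
        return current

    def inventory(self) -> dict:
        """Map each listed resource to its current holder."""
        listed = [e["resource"] for e in self.entries if e["type"] == "list"]
        return {r: self.holder(r) for r in listed}

    def balances(self) -> dict:
        """Count items each participant has lent out or borrowed."""
        owners = {e["resource"]: e["owner"] for e in self.entries if e["type"] == "list"}
        sheet = defaultdict(lambda: {"lent": 0, "borrowed": 0})
        for resource, owner in owners.items():
            holder = self.holder(resource)
            if holder != owner:
                sheet[owner]["lent"] += 1
                sheet[holder]["borrowed"] += 1
        return dict(sheet)


ledger = SharingLedger()
ledger.list_resource("ana", "drill-01")
ledger.list_resource("ben", "guitar-07")
ledger.transfer("drill-01", "ana", "ben")

current_inventory = ledger.inventory()
current_balances = ledger.balances()
```

Because the current state is always derived by replaying the entries, the ledger itself never needs to be edited, only appended to.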
A programmatic layer, or smart contract, will be required to govern policies. Policies might include setting borrowing periods, defining borrowing privileges and limitations, fines, due dates, hold limits, and so on. The smart contract functions like an integrated library system, establishing rules for using collections. But it differs in that participants would be able to create contracts with rules that meet their needs.
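The kind of rules such a contract might encode can be sketched in Python. This is not smart-contract code for any particular platform, and the policy fields and values are hypothetical. It only shows how borrowing limits, due dates, and fines reduce to small deterministic functions that participants could parameterize themselves.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class LoanPolicy:
    """Hypothetical borrowing rules chosen by the participants of a network."""
    loan_days: int
    max_loans: int
    fine_per_day: float


def can_borrow(policy: LoanPolicy, active_loans: int) -> bool:
    """A participant may borrow while under the network's loan limit."""
    return active_loans < policy.max_loans


def due_date(policy: LoanPolicy, checkout: date) -> date:
    """The due date is the checkout date plus the loan period."""
    return checkout + timedelta(days=policy.loan_days)


def fine_owed(policy: LoanPolicy, checkout: date, returned: date) -> float:
    """Charge the daily fine for each day past the due date, never below zero."""
    days_late = (returned - due_date(policy, checkout)).days
    return max(0, days_late) * policy.fine_per_day


# One network's rules for tool lending (illustrative values).
tool_policy = LoanPolicy(loan_days=7, max_loans=2, fine_per_day=0.50)

checkout = date(2019, 3, 1)
due = due_date(tool_policy, checkout)
late_fine = fine_owed(tool_policy, checkout, date(2019, 3, 11))
on_time_fine = fine_owed(tool_policy, checkout, date(2019, 3, 6))
```

A second network could reuse the same functions with a different `LoanPolicy`, which is the flexibility the paragraph above attributes to participant-defined contracts.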
Participants will require a user interface to interact with the community-based collections. There are two ways to provide this interface: a distributed application (DApp) or a standard mobile app. This app would query the blockchain ledger to enable any number of activities. It would also interact with the smart contract to restrict and guide usage of the collections via the app. Features of the app might include:
Participants can be both individuals and institutions with resources to share. Participants will form networks on the platform as defined by policies in the smart contract. For example,
Both individuals and institutions can operate within multiple networks (e.g., a student may simultaneously participate in a sharing network with his school, local library, and social circle).
Resources within the community-based collections can include anything that could be assigned a unique identifier, or what is known as a hash. Resources can include objects like tools, books, art, musical instruments, and technology. Resources might also include services and skills such as expertise, training, or advice, as long as the service in question can be tracked and authenticated. For this reason, community-based collections may require auxiliary credentialing blockchains that could vouch for the quality of the service before they can be reliably incorporated.
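One way to derive such an identifier, sketched in Python, is to hash a canonical description of the resource. The field names below are hypothetical. Note that the hash identifies the description rather than the physical object, so tying it to a real drill or guitar would still depend on the participants or on a credentialing service.

```python
import hashlib
import json


def resource_id(description: dict) -> str:
    """Derive an identifier by hashing a canonical form of the description."""
    canonical = json.dumps(description, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


drill = {"kind": "tool", "name": "cordless drill", "owner": "ana", "serial": "CD-1042"}
same_drill = {"serial": "CD-1042", "owner": "ana", "name": "cordless drill", "kind": "tool"}
other_drill = {"kind": "tool", "name": "cordless drill", "owner": "ana", "serial": "CD-1043"}

drill_id = resource_id(drill)
ids_match = drill_id == resource_id(same_drill)      # key order does not matter
ids_differ = drill_id != resource_id(other_drill)    # any field change alters the ID
```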
There are a number of ways to build the community-based collections platform. Some potential components are already available. Others are emerging.
Ethereum is a versatile blockchain technology that could provide the blockchain ledger for tracking the various transactions and provenance of items. Even more importantly, Ethereum includes smart contract functionality, allowing for programmable rules to govern accounts, networks, and collection usage.
IPFS is an emerging peer-to-peer Internet protocol that allows computers to exchange data without requiring centralized web servers to host that data. IPFS makes a truly distributed web possible, including the deployment of DApps made available by peers on the IPFS network. Like the blockchain, DApps behave more like a shared operating system, encoded into the Internet itself and available to anyone.
The front-end code of community-based collections could be made available as a distributed app via IPFS. This is not a requirement, but it makes sense in many ways. For example, using IPFS negates the need for any one organization to maintain expensive servers to keep the code online. Instead, the app would be made available by numerous peers. These peers could be libraries, but also supporters of the library in the community.
Blockchain-based identity management platforms are emerging. Examples include uPort, Hyperledger, and Civic.43 The community-based collections might be able to leverage these blockchains to define who has access to a sharing network. Such a platform might also be a good candidate for validating skills or credentials, as well as providing proof of service quality.
A working group of blockchain experts, programmers, and librarians should convene to consider the feasibility of community-based collections. The focus of this group would be to answer the outstanding questions, resolve technical issues, and determine whether or not to proceed.
Assuming the working group green-lights further work, a new project team should be assembled to develop a prototype. The first stage for this group would be to develop a project plan for building a prototype. Included in the plan would be technical and functional requirements. The plan would also need to include a budget for developing the prototype. The second stage would include obtaining a new grant to build the prototype and test it with users. The testing will ensure the security and usability of the app.
Once testing is complete, a new iteration of the app can be built using another round of grant funding. This app would then be refined until it was ready for deployment as a public beta.
The blockchain suffers from some notable limitations which pose challenges to a community-based collections platform. Two major concerns are the energy costs of maintaining the blockchain, and the slowness of validating transactions. While future innovations will likely reduce these obstacles, we must ask ourselves if such a system warrants the effort. Indeed, centralized applications might be able to provide a similar sharing service, without blockchain’s limitations.
Conceptually, centralized versions of a community-based collection platform are possible using existing, non-blockchain technology. Integrated library systems are one example in which a centralized server governs the usage of library-controlled resources. It is conceivable that additional features could be developed to incorporate community-based resources into an ILS. Centralized sharing services like Uber, Airbnb, and Bird Scooters already provide the basis for peer-to-peer sharing networks.
However, the killer app of a blockchain-based sharing network is its ubiquity. Such a DApp would essentially embed the library service into the fabric of the Internet. Like viral code, once released onto the Internet, such a platform could be utilized freely by anyone to begin sharing with anyone else. Unlike a library which must maintain funding and support to operate, the community-based collections platform would require only a network of participants to ensure its existence.
Moreover, the flexibility of smart contracts means that the platform can be deployed for endless use cases. Unlike a centralized application where an institution or company writes the rules, on a community-based collection platform, users could write rules that meet their specific needs.
Assuming that the technical issues related to the blockchain can be overcome, the community-based collection platform would be a killer app for resource-sharing. Put another way, the blockchain platform has the potential of infecting the Internet with library values.
Much work remains to realize this vision, and there are a number of key questions that must be answered before implementing a blockchain project:
Therefore, this project requires a long-term, funded enterprise that can turn vision into reality.
3. https://orvium.io/; “Building the Ledger of Record for Research,” ARTiFACTS, artifacts.ai; and “Make Your Science More Reproducible,” protocols.io, https://www.protocols.io/.
4. Ibid.
6. https://medium.com/mit-media-lab/what-we-learned-from-designing-an-academic-certificates-system-on-the-blockchain-34ba5874f196.
7. https://ec.europa.eu/ploteus/en/content/descriptors-page; http://education.ohio.gov/Topics/Quality-School-Choice/Credit-Flexibility-Plan; and https://www.nzqa.govt.nz/assets/About-us/Publications/Brochures/introducing-nzqa.pdf.
8. “Global Warming of 1.5°C: Summary for Policymakers,” Intergovernmental Panel on Climate Change, 2018, http://report.ipcc.ch/sr15/pdf/sr15_spm_final.pdf.
10. Julia Margarita Martínez Saldaña, “Informe sobre las actividades del proyecto NACO-México,” PowerPoint presentation, Sites/Cites, Texts, and Voices in Critical Librarianship: Decolonizing Libraries and Archives, Seminar on the Acquisition of Latin American Library Materials (SALALM) 63, Mexico City, Mexico, July 2, 2018.
13. https://web.archive.org/web/20180812122624/https://go-to-hellman.blogspot.com/2018/06/the-vast-potential-for-blockchain-in.html.
14. https://web.archive.org/web/20181031220327/https://medium.com/@jimmysong/why-blockchain-is-hard-60416ea4c5c.
15. Brian A. Scriber, “A Framework for Determining Blockchain Applicability,” IEEE Software 35 (July/August 2018): 70–77.
17. https://web.archive.org/web/20181029213224/https://www.oclc.org/en/worldcat/inside-worldcat.html.
18. https://web.archive.org/web/20181010171040/https://medium.com/@mycoralhealth/learn-to-securely-share-files-on-the-blockchain-with-ipfs-219ee47df54c.
20. https://web.archive.org/web/20181027023321/https://blog.ethereum.org/2015/08/07/on-public-and-private-blockchains/.
22. https://measuresthatmatter.net/summary-of-the-measures-that-matter-implementation-group-meeting-november-28-29-2018/.
26. D. Hofman, C. Shannon, B. McManus, K. Lam, S. Assadian, R. Ng, V. Lemieux, “Building Trust & Protecting Privacy: Analyzing Evidentiary Quality in a Blockchain Proof-of-Concept for Health Research Data Consent Management,” in 2018 IEEE Conference on Internet of Things, Green Computing and Communications, Cyber, Physical and Social Computing, Smart Data, Blockchain, Computer and Information Technology, Congress on Cybermatics. Presented at the iThings / GreenCom / CPSCom / SmartData / Blockchain / CIT / Cybermatics (2018): 1650–56.
28. https://medium.com/coinmonks/introduction-to-solidity-programming-and-smart-contracts-for-complete-beginners-eb46472058cf.
29. L.-D. Ibáñez, K. O’Hara, and E. Simperl, “On Blockchains and the General Data Protection Regulation,” presented at the EU Blockchain Observatory and Forum: Workshop on GDPR, Data Policy and Compliance, University of Amsterdam, 2018.
30. https://ischoolblogs.sjsu.edu/blockchains/blockchain-chock-full-of-problems-for-medical-data-privacy-by-jessica-berger-mlis-cipm/.
31. I.-C. Lin and T.-C. Liao, “A Survey of Blockchain Security Issues and Challenges,” IJ Network Security 19, no. 5 (2017): 653–59.
32. S. Eskandari, J. Clark, D. Barrera, and E. Stobert, “A First Look at the Usability of Bitcoin Key Management,” 2018, arXiv preprint, https://arxiv.org/abs/1802.04351.
34. ARMA International, “Generally Accepted Recordkeeping Principles,” https://www.arma.org/docs/bookstore/theprinciplesmaturitymodel.pdf?sfvrsn=2; ISO 15489-1:2016 – Information and Documentation – Records Management–Part I: General, ISO, Geneva.
36. https://www.forbes.com/sites/brucejapsen/2018/04/02/unitedhealths-optum-and-humana-in-blockchain-deal-to-improve-doctor-directories/#528b59d13998.