Creating and Using Shared Libraries—A First Pass

To begin understanding how shared libraries operate, we look at the minimum sequence of steps required to build and use a shared library. For the moment, we’ll ignore the convention that is normally used to name shared library files. This convention, described in Shared Library Versions and Naming Conventions, allows programs to automatically load the most up-to-date version of the libraries they require, and also allows multiple incompatible versions (so-called major versions) of a library to coexist peacefully.

In this chapter, we concern ourselves only with Executable and Linking Format (ELF) shared libraries, since ELF is the format employed for executables and shared libraries in modern versions of Linux, as well as in many other UNIX implementations.

Note

ELF supersedes the older a.out and COFF formats.

Creating a Shared Library

In order to build a shared version of the static library we created earlier, we perform the following steps:

$ gcc -g -c -fPIC -Wall mod1.c mod2.c mod3.c
$ gcc -g -shared -o libfoo.so mod1.o mod2.o mod3.o

The first of these commands creates the three object modules that are to be put into the library. (We explain the cc -fPIC option in the next section.) The cc -shared command creates a shared library containing the three object modules.

By convention, shared libraries have the prefix lib and the suffix .so (for shared object).

In our examples, we use the gcc command, rather than the equivalent cc command, to emphasize that the command-line options we are using to create shared libraries are compiler-dependent. Using a different C compiler on another UNIX implementation will probably require different options.

Note that it is possible to compile the source files and create the shared library in a single command:

$ gcc -g -fPIC -Wall mod1.c mod2.c mod3.c -shared -o libfoo.so

However, to clearly distinguish the compilation and library building steps, we’ll write the two as separate commands in the examples shown in this chapter.

Unlike with a static library, it is not possible to add individual object modules to, or remove them from, a previously built shared library. As with normal executables, the object files within a shared library no longer maintain distinct identities.

Position-Independent Code

The cc -fPIC option specifies that the compiler should generate position-independent code. This changes the way that the compiler generates code for operations such as accessing global, static, and external variables; accessing string constants; and taking the addresses of functions. These changes allow the code to be located at any virtual address at run time. This is necessary for shared libraries, since there is no way of knowing at link time where the shared library code will be located in memory. (The run-time memory location of a shared library depends on various factors, such as the amount of memory already taken up by the program that is loading the library and which other shared libraries the program has already loaded.)

On Linux/x86-32, it is possible to create a shared library using modules compiled without the -fPIC option. However, doing so loses some of the benefits of shared libraries, since pages of program text containing position-dependent memory references are not shared across processes. On some architectures, it is impossible to build shared libraries without the -fPIC option.

In order to determine whether an existing object file has been compiled with the -fPIC option, we can check for the presence of the name _GLOBAL_OFFSET_TABLE_ in the object file’s symbol table, using either of the following commands:

$ nm mod1.o | grep _GLOBAL_OFFSET_TABLE_
$ readelf -s mod1.o | grep _GLOBAL_OFFSET_TABLE_

Conversely, if either of the following equivalent commands yields any output, then the specified shared library includes at least one object module that was not compiled with -fPIC:

$ objdump --all-headers libfoo.so | grep TEXTREL
$ readelf -d libfoo.so | grep TEXTREL

The string TEXTREL indicates the presence of an object module whose text segment contains a reference that requires run-time relocation.

We say more about the nm, readelf, and objdump commands in Section 41.5.

Using a Shared Library

In order to use a shared library, two steps must occur that are not required for programs that use static libraries:

Since the executable file no longer contains copies of the object files that it requires, it must have some mechanism for identifying the shared library that it needs at run time. This is done by embedding the name of the shared library inside the executable during the link phase. (In ELF parlance, the library dependency is recorded in a DT_NEEDED tag in the executable.) The list of all of a program’s shared library dependencies is referred to as its dynamic dependency list.
At run time, there must be some mechanism for resolving the embedded library name—that is, for finding the shared library file corresponding to the name specified in the executable file—and then loading the library into memory, if it is not already present.

Embedding the name of the library inside the executable happens automatically when we link our program with a shared library:

$ gcc -g -Wall -o prog prog.c libfoo.so

If we now attempt to run our program, we receive the following error message:

$ ./prog
./prog: error in loading shared libraries: libfoo.so: cannot
open shared object file: No such file or directory

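This brings us to the second required step: dynamic linking, which is the task of resolving the embedded library name at run time. This task is performed by the dynamic linker (also called the dynamic linking loader or the run-time linker). The dynamic linker is itself a shared library, named /lib/ld-linux.so.2 on Linux/x86-32 (the pathname differs on other architectures), which is employed by every ELF executable that uses shared libraries.

The error above occurs because libfoo.so resides in the current working directory, which is not among the standard directories (such as /lib and /usr/lib) that the dynamic linker searches. One way of telling the dynamic linker that a shared library resides in a nonstandard directory is to include that directory in the LD_LIBRARY_PATH environment variable, which holds a colon-separated list of directories. Thus, we can run our program with the following command: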

$ LD_LIBRARY_PATH=. ./prog
Called mod1-x1
Called mod2-x2

The (bash, Korn, and Bourne) shell syntax used in the above command creates an environment variable definition within the process executing prog. This definition tells the dynamic linker to search for shared libraries in ., the current working directory.

Note

An empty directory specification in the LD_LIBRARY_PATH list (e.g., the middle specification in dirx::diry) is equivalent to ., the current working directory (but note that setting LD_LIBRARY_PATH to an empty string does not achieve the same result). We avoid this usage (SUSv3 discourages the corresponding usage in the PATH environment variable).

Static linking and dynamic linking contrasted

Commonly, the term linking is used to describe the use of the linker, ld, to combine one or more compiled object files into a single executable file. Sometimes, the term static linking is used to distinguish this step from dynamic linking, the run-time loading of the shared libraries used by an executable. (Static linking is sometimes also referred to as link editing, and a static linker such as ld is sometimes referred to as a link editor.) Every program—including those that use shared libraries—goes through a static-linking phase. At run time, a program that employs shared libraries additionally undergoes dynamic linking.

The Shared Library Soname

In the example presented so far, the name that was embedded in the executable and sought by the dynamic linker at run time was the actual name of the shared library file. This is referred to as the library’s real name. However, it is possible—in fact, usual—to create a shared library with a kind of alias, called a soname (the DT_SONAME tag in ELF parlance).

If a shared library has a soname, then, during static linking, the soname is embedded in the executable file instead of the real name, and subsequently used by the dynamic linker when searching for the library at run time. The purpose of the soname is to provide a level of indirection that permits an executable to use, at run time, a version of the shared library that is different from (but compatible with) the library against which it was linked.

In Shared Library Versions and Naming Conventions, we’ll look at the conventions used for the shared library real name and soname. For now, we show a simplified example to demonstrate the principles.

The first step in using a soname is to specify it when the shared library is created:

$ gcc -g -c -fPIC -Wall mod1.c mod2.c mod3.c
$ gcc -g -shared -Wl,-soname,libbar.so -o libfoo.so mod1.o mod2.o mod3.o

The -Wl,-soname,libbar.so option is an instruction to the linker to mark the shared library libfoo.so with the soname libbar.so.

If we want to determine the soname of an existing shared library, we can use either of the following commands:

$ objdump -p libfoo.so | grep SONAME
  SONAME      libbar.so
$ readelf -d libfoo.so | grep SONAME
 0x0000000e (SONAME)      Library soname: [libbar.so]

Having created a shared library with a soname, we then create the executable as usual:

$ gcc -g -Wall -o prog prog.c libfoo.so

However, this time, the linker detects that the library libfoo.so contains the soname libbar.so and embeds the latter name inside the executable.

Now when we attempt to run the program, this is what we see:

$ LD_LIBRARY_PATH=. ./prog
prog: error in loading shared libraries: libbar.so: cannot open
shared object file: No such file or directory

The problem here is that the dynamic linker can’t find anything named libbar.so. When using a soname, one further step is required: we must create a symbolic link from the soname to the real name of the library. This symbolic link must be created in one of the directories searched by the dynamic linker. Thus, we could run our program as follows:

$ ln -s libfoo.so libbar.so        # Create soname symbolic link in current directory
$ LD_LIBRARY_PATH=. ./prog
Called mod1-x1
Called mod2-x2

Figure 41-1 shows the compilation and linking steps involved in producing a shared library with an embedded soname, linking a program against that shared library, and creating the soname symbolic link needed to run the program.

Figure 41-1. Creating a shared library and linking a program against it

Figure 41-2 shows the steps that occur when the program created in Figure 41-1 is loaded into memory in preparation for execution.

Note

To find out which shared libraries a process is currently using, we can list the contents of the corresponding Linux-specific /proc/PID/maps file (see Location of Shared Memory in Virtual Memory).

Figure 41-2. Execution of a program that loads a shared library