To begin understanding how shared libraries operate, we look at the minimum sequence of steps required to build and use a shared library. For the moment, we’ll ignore the convention that is normally used to name shared library files. This convention, described in Shared Library Versions and Naming Conventions, allows programs to automatically load the most up-to-date version of the libraries they require, and also allows multiple incompatible versions (so-called major versions) of a library to coexist peacefully.
In this chapter, we concern ourselves only with Executable and Linking Format (ELF) shared libraries, since ELF is the format employed for executables and shared libraries in modern versions of Linux, as well as in many other UNIX implementations.
ELF supersedes the older a.out and COFF formats.
In order to build a shared version of the static library we created earlier, we perform the following steps:
$ gcc -g -c -fPIC -Wall mod1.c mod2.c mod3.c
$ gcc -g -shared -o libfoo.so mod1.o mod2.o mod3.o
The first of these commands creates the three object modules that are to be put into the library. (We explain the cc -fPIC option in the next section.) The cc -shared command creates a shared library containing the three object modules.
By convention, shared libraries have the prefix lib and the suffix .so (for shared object).
In our examples, we use the gcc command, rather than the equivalent cc command, to emphasize that the command-line options we are using to create shared libraries are compiler-dependent. Using a different C compiler on another UNIX implementation will probably require different options.
Note that it is possible to compile the source files and create the shared library in a single command:
$ gcc -g -fPIC -Wall mod1.c mod2.c mod3.c -shared -o libfoo.so
However, to clearly distinguish the compilation and library building steps, we’ll write the two as separate commands in the examples shown in this chapter.
Unlike with a static library, it is not possible to add individual object modules to, or remove them from, a previously built shared library. As with normal executables, the object files within a shared library no longer maintain distinct identities.
The cc -fPIC option specifies that the compiler should generate position-independent code. This changes the way that the compiler generates code for operations such as accessing global, static, and external variables; accessing string constants; and taking the addresses of functions. These changes allow the code to be located at any virtual address at run time. This is necessary for shared libraries, since there is no way of knowing at link time where the shared library code will be located in memory. (The run-time memory location of a shared library depends on various factors, such as the amount of memory already taken up by the program that is loading the library and which other shared libraries the program has already loaded.)
On Linux/x86-32, it is possible to create a shared library using modules compiled without the -fPIC option. However, doing so loses some of the benefits of shared libraries, since pages of program text containing position-dependent memory references are not shared across processes. On some architectures, it is impossible to build shared libraries without the -fPIC option.
In order to determine whether an existing object file has been compiled with the -fPIC option, we can check for the presence of the name _GLOBAL_OFFSET_TABLE_ in the object file’s symbol table, using either of the following commands:
$ nm mod1.o | grep _GLOBAL_OFFSET_TABLE_
$ readelf -s mod1.o | grep _GLOBAL_OFFSET_TABLE_
Conversely, if either of the following equivalent commands yields any output, then the specified shared library includes at least one object module that was not compiled with -fPIC:
$ objdump --all-headers libfoo.so | grep TEXTREL
$ readelf -d libfoo.so | grep TEXTREL
The string TEXTREL indicates the presence of an object module whose text segment contains a reference that requires run-time relocation.
We say more about the nm, readelf, and objdump commands in Section 41.5.
In order to use a shared library, two steps must occur that are not required for programs that use static libraries:
Since the executable file no longer contains copies of the object files that it requires, it must have some mechanism for identifying the shared library that it needs at run time. This is done by embedding the name of the shared library inside the executable during the link phase. (In ELF parlance, the library dependency is recorded in a DT_NEEDED tag in the executable.) The list of all of a program’s shared library dependencies is referred to as its dynamic dependency list.
At run time, there must be some mechanism for resolving the embedded library name—that is, for finding the shared library file corresponding to the name specified in the executable file—and then loading the library into memory, if it is not already present.
Embedding the name of the library inside the executable happens automatically when we link our program with a shared library:
$ gcc -g -Wall -o prog prog.c libfoo.so
If we now attempt to run our program, we receive the following error message:
$ ./prog
./prog: error in loading shared libraries: libfoo.so: cannot
open shared object file: No such file or directory
This brings us to the second required step: dynamic linking, which is the task of resolving the embedded library name at run time. This task is performed by the dynamic linker (also called the dynamic linking loader or the run-time linker). The dynamic linker is itself a shared library, named /lib/ld-linux.so.2, which is employed by every ELF executable that uses shared libraries.
The pathname /lib/ld-linux.so.2 is normally a symbolic link pointing to the dynamic linker executable file. This file has the name ld-version.so, where version is the glibc version installed on the system (for example, ld-2.11.so). The pathname of the dynamic linker differs on some architectures. For example, on IA-64, the dynamic linker symbolic link is named /lib/ld-linux-ia64.so.2.
The dynamic linker examines the list of shared libraries required by a program and uses a set of predefined rules in order to find the library files in the file system. Some of these rules specify a set of standard directories in which shared libraries normally reside. For example, many shared libraries reside in /lib and /usr/lib. The error message above occurs because our library resides in the current working directory, which is not part of the standard list searched by the dynamic linker.
Some architectures (e.g., zSeries, PowerPC64, and x86-64) support execution of both 32-bit and 64-bit programs. On such systems, the 32-bit libraries reside in */lib subdirectories, and the 64-bit libraries reside in */lib64 subdirectories.
One way of informing the dynamic linker that a shared library resides in a nonstandard directory is to specify that directory as part of a colon-separated list of directories in the LD_LIBRARY_PATH environment variable. (Semicolons can also be used to separate the directories, in which case the list must be quoted to prevent the shell from interpreting the semicolons.) If LD_LIBRARY_PATH is defined, then the dynamic linker searches for the shared library in the directories it lists before looking in the standard library directories. (Later, we’ll see that a production application should never rely on LD_LIBRARY_PATH, but for now, this variable provides us with a simple way of getting started with shared libraries.) Thus, we can run our program with the following command:
$ LD_LIBRARY_PATH=. ./prog
Called mod1-x1
Called mod2-x2
The (bash, Korn, and Bourne) shell syntax used in the above command creates an environment variable definition within the process executing prog. This definition tells the dynamic linker to search for shared libraries in ., the current working directory.
An empty directory specification in the LD_LIBRARY_PATH list (e.g., the middle specification in dirx::diry) is equivalent to ., the current working directory (but note that setting LD_LIBRARY_PATH to an empty string does not achieve the same result). We avoid this usage (SUSv3 discourages the corresponding usage in the PATH environment variable).
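The list rules just described can be illustrated with a small shell function. This is only a sketch of the rules as stated in the text, not the dynamic linker's actual parsing code. The function name split_lib_path is hypothetical, and treating a leading or trailing empty entry in the same way as a middle one is an assumption that the text does not confirm.

```shell
#!/bin/sh
# Print one directory per line from an LD_LIBRARY_PATH-style list.
# Illustrative only: this mimics the rules described in the text and is
# not how the dynamic linker is implemented.
split_lib_path() {
    list=$1
    # An empty string yields no directories at all (not ".").
    if [ -z "$list" ]; then
        return 0
    fi
    # Append a terminator so that the final entry is handled like the others.
    rest="$list:"
    while [ -n "$rest" ]; do
        entry=${rest%%[;:]*}    # text before the first ':' or ';'
        rest=${rest#*[;:]}      # text after that separator
        # Assumption: any empty entry stands for the current directory.
        printf '%s\n' "${entry:-.}"
    done
}

# Prints dirx, ., and diry on separate lines.
split_lib_path 'dirx::diry'
```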
Commonly, the term linking is used to describe the use of the linker, ld, to combine one or more compiled object files into a single executable file. Sometimes, the term static linking is used to distinguish this step from dynamic linking, the run-time loading of the shared libraries used by an executable. (Static linking is sometimes also referred to as link editing, and a static linker such as ld is sometimes referred to as a link editor.) Every program—including those that use shared libraries—goes through a static-linking phase. At run time, a program that employs shared libraries additionally undergoes dynamic linking.
In the example presented so far, the name that was embedded in the executable and sought by the dynamic linker at run time was the actual name of the shared library file. This is referred to as the library’s real name. However, it is possible (in fact, usual) to create a shared library with a kind of alias, called a soname (the DT_SONAME tag in ELF parlance).
If a shared library has a soname, then, during static linking, the soname is embedded in the executable file instead of the real name, and subsequently used by the dynamic linker when searching for the library at run time. The purpose of the soname is to provide a level of indirection that permits an executable to use, at run time, a version of the shared library that is different from (but compatible with) the library against which it was linked.
In Shared Library Versions and Naming Conventions, we’ll look at the conventions used for the shared library real name and soname. For now, we show a simplified example to demonstrate the principles.
The first step in using a soname is to specify it when the shared library is created:
$ gcc -g -c -fPIC -Wall mod1.c mod2.c mod3.c
$ gcc -g -shared -Wl,-soname,libbar.so -o libfoo.so mod1.o mod2.o mod3.o
The -Wl,-soname,libbar.so option is an instruction to the linker to mark the shared library libfoo.so with the soname libbar.so.
If we want to determine the soname of an existing shared library, we can use either of the following commands:
$ objdump -p libfoo.so | grep SONAME
  SONAME      libbar.so
$ readelf -d libfoo.so | grep SONAME
 0x0000000e (SONAME)      Library soname: [libbar.so]
Having created a shared library with a soname, we then create the executable as usual:
$ gcc -g -Wall -o prog prog.c libfoo.so
However, this time, the linker detects that the library libfoo.so contains the soname libbar.so and embeds the latter name inside the executable.
Now when we attempt to run the program, this is what we see:
$ LD_LIBRARY_PATH=. ./prog
prog: error in loading shared libraries: libbar.so: cannot open
shared object file: No such file or directory
The problem here is that the dynamic linker can’t find anything named libbar.so. When using a soname, one further step is required: we must create a symbolic link from the soname to the real name of the library. This symbolic link must be created in one of the directories searched by the dynamic linker. Thus, we could run our program as follows:
$ ln -s libfoo.so libbar.so        # Create soname symbolic link in current directory
$ LD_LIBRARY_PATH=. ./prog
Called mod1-x1
Called mod2-x2
Figure 41-1 shows the compilation and linking steps involved in producing a shared library with an embedded soname, linking a program against that shared library, and creating the soname symbolic link needed to run the program.
Figure 41-2 shows the steps that occur when the program created in Figure 41-1 is loaded into memory in preparation for execution.
To find out which shared libraries a process is currently using, we can list the contents of the corresponding Linux-specific /proc/PID/maps file (Location of Shared Memory in Virtual Memory).
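As a minimal sketch of this technique, the following shell function extracts the distinct shared object pathnames from a process's maps file. The function name list_mapped_libs is hypothetical, and the simple field splitting assumes that library pathnames contain no spaces. With no argument, the function reports on the shell that is running the script.

```shell
#!/bin/sh
# Print the distinct shared objects mapped by process $1 (default: this shell).
list_mapped_libs() {
    # Field 6 of each /proc/PID/maps line is the backing pathname, if any.
    awk '$6 ~ /\.so/ { print $6 }' "/proc/${1:-$$}/maps" | sort -u
}

libs=$(list_mapped_libs)
echo "$libs"
```

On a typical system, the output includes the C library and the dynamic linker itself. Reading the maps file of another user's process generally requires appropriate privileges.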