Linking with libraries: static and dynamic linking

Any application you write for Linux, whether it be in C or C++, will be linked with the C library, libc. This is so fundamental that you don't even have to tell gcc or g++ to do it because it always links libc. Other libraries that you may want to link with have to be explicitly named through the -l option.

The library code can be linked in two different ways: statically, meaning that all the library functions your application calls and their dependencies are pulled from the library archive and bound into your executable; and dynamically, meaning that references to the library files and functions in those files are generated in the code but the actual linking is done dynamically at runtime.

Static linking is useful in a few circumstances. For example, if you are building a small system which consists of only BusyBox and some script files, it is simpler to link BusyBox statically and avoid having to copy the runtime library files and linker. It will also be smaller because you only link in the code that your application uses rather than supplying the entire C library. Static linking is also useful if you need to run a program before the filesystem that holds the runtime libraries is available.

You tell gcc to link all libraries statically by adding -static to the command line:

You will notice that the size of the binary increases dramatically compared with the dynamically linked version.

Static linking pulls code from a library archive, usually named lib<name>.a. In the preceding case, it is libc.a, which is in <sysroot>/usr/lib:
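
One way to locate that file is sketched here, assuming the same toolchain prefix. The SYSROOT and LIBC_A variable names are only illustrative. With the native gcc fallback the sysroot is often empty and the distribution may keep libc.a elsewhere, so the sketch reports a miss rather than failing.

```shell
# Assumption: use the cross toolchain if it is on the PATH, otherwise the native one.
CROSS=arm-cortex_a8-linux-gnueabihf-
command -v "${CROSS}gcc" >/dev/null 2>&1 || CROSS=

# Command substitution: capture the sysroot path that the compiler prints.
SYSROOT=$("${CROSS}gcc" -print-sysroot)
echo "sysroot: ${SYSROOT:-(none configured)}"

# The static C library is expected under usr/lib in the sysroot.
LIBC_A="$SYSROOT/usr/lib/libc.a"
ls -l "$LIBC_A" || echo "libc.a not found at $LIBC_A"
```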

Note that the syntax $(arm-cortex_a8-linux-gnueabihf-gcc -print-sysroot) places the output of the program on the command line. I am using it as a generic way to refer to the files in the sysroot.

Creating a static library is as simple as creating an archive of object files using the ar command. If I had two source files named test1.c and test2.c and wanted to create a static library named libtest.a, I would do this:

Then I could link libtest into my helloworld program using:

A more common way to deploy libraries is as shared objects that are linked at runtime, which makes more efficient use of storage and system memory, since only one copy of the code needs to be loaded. It also makes it easy to update library files without having to re-link all the programs that use them.

The object code for a shared library must be position-independent so that the runtime linker is free to locate it in memory at the next free address. To do this, add the -fPIC parameter to gcc, and then link it using the -shared option:

To link an application with this library, you add -ltest, exactly as in the static case. This time, however, the code is not included in the executable; instead, there is a reference to the library that the runtime linker will have to resolve:

The runtime linker for this program is /lib/ld-linux-armhf.so.3, which must be present in the target's filesystem. The runtime linker will look for libtest.so in the default search path: /lib and /usr/lib. If you want it to look for libraries in other directories as well, you can place a colon-separated list of paths in the environment variable LD_LIBRARY_PATH:
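
For instance, on the target you might set it as follows. The two directories are hypothetical and stand in for wherever the extra libraries are installed.

```shell
# Hypothetical extra library directories; substitute the ones on your target.
export LD_LIBRARY_PATH=/opt/lib:/opt/usr/lib

# Show each directory in the list on its own line.
echo "$LD_LIBRARY_PATH" | tr ':' '\n'
```

The variable has to be exported so that programs started from the shell, and therefore the runtime linker, can see it.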

One of the benefits of shared libraries is that they can be updated independently of the programs that use them. Library updates are of two types: those that fix bugs or add new functions in a backwards-compatible way, and those that break compatibility with existing applications. GNU/Linux has a versioning scheme to handle both these cases.

Each library has a release version and an interface number. The release version is simply a string that is appended to the library name; for example, the JPEG image library, libjpeg, is currently at release 8.0.2, and so the library is named libjpeg.so.8.0.2. There is a symbolic link named libjpeg.so to libjpeg.so.8.0.2 so that, when you compile a program with -ljpeg, you link with the current version. If you install version 8.0.3, the link is updated and you will link with that one instead.

Now, suppose that version 9.0.0 comes along and that breaks backwards compatibility. The link from libjpeg.so now points to libjpeg.so.9.0.0, so that any new programs are linked with the new version, possibly throwing compile errors when the interface to libjpeg changes, which the developer can fix. Any programs on the target that are not recompiled are going to fail in some way because they are still using the old interface. This is where the soname helps. The soname encodes the interface number when the library was built and is used by the runtime linker when it loads the library. It is formatted as <library name>.so.<interface number>. For libjpeg.so.8.0.2, the soname is libjpeg.so.8:

Any program compiled with it will request libjpeg.so.8 at runtime, which will be a symbolic link on the target to libjpeg.so.8.0.2. When version 9.0.0 of libjpeg is installed, it will have a soname of libjpeg.so.9, and so it is possible to have two incompatible versions of the same library installed on the same system. Programs that were linked with libjpeg.so.8.*.* will load libjpeg.so.8, and those linked with libjpeg.so.9.*.* will load libjpeg.so.9.

This is why, when you look at the directory listing of <sysroot>/usr/lib/libjpeg*, you find these four files:
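
Going by the description in this section, the four files are presumably the static archive libjpeg.a, the unversioned link libjpeg.so, the soname link libjpeg.so.8, and the library itself, libjpeg.so.8.0.2. The presence of libjpeg.a is an inference from the lib<name>.a convention rather than something stated here. This sketch recreates that layout in a scratch directory with empty placeholder files, so the link structure can be examined without a real sysroot.

```shell
cd "$(mktemp -d)"

# Empty placeholders standing in for the real static archive and shared object.
touch libjpeg.a libjpeg.so.8.0.2

# Soname link, followed by the runtime linker on the target.
ln -s libjpeg.so.8.0.2 libjpeg.so.8

# Unversioned link, followed at build time when linking with -ljpeg.
ln -s libjpeg.so.8.0.2 libjpeg.so

ls -l libjpeg*
```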

The first two are only needed on the host computer for building; the last two are needed on the target at runtime.