Building the kernel

Having decided which kernel to base your build on, the next step is to build it.

Let's assume that you have a board that is supported in mainline. You can get the source code through git or by downloading a tarball. Using git is better because you can see the commit history, easily see any changes you may make, and switch between branches and versions. In this example, we are cloning the stable tree and checking out the version tag 4.1.10:
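The clone and checkout steps could be scripted roughly as follows. This is a minimal sketch: the repository URL is an assumption (the location of the stable tree on git.kernel.org at the time of writing), and fetch_stable_kernel is a hypothetical helper name, not part of git or the kernel.

```shell
# Assumed location of the stable tree; check kernel.org if it has moved
STABLE_URL=https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

# Hypothetical helper: clone the stable tree into a directory and check out
# the tag for one release. Release tags are named v<version>, e.g. v4.1.10.
fetch_stable_kernel() {
    version=$1
    dir=$2
    git clone "$STABLE_URL" "$dir" &&
    git -C "$dir" checkout "v$version"
}

# Usage (this downloads the full history, so it takes a while):
#   fetch_stable_kernel 4.1.10 linux
```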

Alternatively, you could download the tarball from https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.1.10.tar.xz.

There is a lot of code here. There are over 38,000 files in the 4.1 kernel containing C source code, header files, and assembly code, amounting to a total of over 12.5 million lines of code (as measured by the cloc utility). Nevertheless, it is worth knowing the basic layout of the code and, approximately, where to look for a particular component. The main directories of interest are:

  • arch: This contains architecture-specific files. There is one subdirectory per architecture.
  • Documentation: This contains kernel documentation. Always look here first if you want to find more information about an aspect of Linux.
  • drivers: This contains device drivers, thousands of them. There is a subdirectory for each type of driver.
  • fs: This contains filesystem code.
  • include: This contains kernel header files, including those required when building the toolchain.
  • init: This contains the kernel start-up code.
  • kernel: This contains core functions, including scheduling, locking, timers, power management, and debug/trace code.
  • mm: This contains memory management.
  • net: This contains network protocols.
  • scripts: This contains many useful scripts including the device tree compiler, dtc, which I described in Chapter 3, All About Bootloaders.
  • tools: This contains many useful tools, including the Linux performance counters tool, perf, which I will describe in Chapter 13, Profiling and Tracing.

Over a period of time, you will become familiar with this structure, and realize that, if you are looking for the code for the serial port of a particular SoC, you will find it in drivers/tty/serial and not in arch/$ARCH/mach-foo because it is a device driver and not something central to the running of Linux on that SoC.

One of the strengths of Linux is the degree to which you can configure the kernel to suit different jobs, from a small dedicated device such as a smart thermostat to a complex mobile handset. In current versions there are many thousands of configuration options. Getting the configuration right is a task in itself but, before that, I want to show you how it works so that you can better understand what is going on.

The configuration mechanism is called Kconfig, and the build system that it integrates with is called Kbuild. Both are documented in Documentation/kbuild/. Kconfig/Kbuild is used in a number of other projects as well as the kernel, including crosstool-NG, U-Boot, Barebox, and BusyBox.

The configuration options are declared in a hierarchy of files named Kconfig using a syntax described in Documentation/kbuild/kconfig-language.txt. In Linux, the top level Kconfig looks like this:

The last line includes the architecture-dependent configuration file, which sources other Kconfig files depending on which options are enabled. Having the architecture play such a role has two implications: first, you must specify an architecture when configuring Linux by setting ARCH=[architecture], otherwise it will default to the local machine architecture; second, the layout of the top-level menu is different for each architecture.

The value you put into ARCH is one of the subdirectories you find in the arch directory, with the oddity that ARCH=i386 and ARCH=x86_64 both source arch/x86/Kconfig.
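The mapping from an ARCH value to the Kconfig file that gets sourced can be sketched in a few lines of shell. This only imitates the behavior described above; the function name arch_kconfig is hypothetical, and only the x86 special case mentioned in the text is handled.

```shell
# Hypothetical helper: print the architecture Kconfig file that a given
# ARCH value leads to. i386 and x86_64 share arch/x86; every other value
# is assumed to map directly to a subdirectory of arch.
arch_kconfig() {
    case "$1" in
        i386|x86_64) srcarch=x86 ;;
        *)           srcarch=$1 ;;
    esac
    echo "arch/$srcarch/Kconfig"
}

arch_kconfig arm      # prints arch/arm/Kconfig
arch_kconfig x86_64   # prints arch/x86/Kconfig
```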

The Kconfig files consist largely of menus, delineated by the menu and endmenu keywords with the menu title following menu, and menu items marked by config. Here is an example, taken from drivers/char/Kconfig:

The parameter following config names a variable that, in this case, is DEVMEM. Since this option is a Boolean, it can only have two values: if it is enabled, it is assigned the value y; if not, the variable is not defined at all. The name of the menu item that is displayed on the screen is the string following the bool keyword.
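To make the roles of config and bool concrete, here is a small sketch that feeds an abridged DEVMEM entry to a shell function and prints the variable name, type, and prompt string. The entry is reconstructed from the 4.1 source with the help text left out, so treat it as illustrative; list_options is a hypothetical helper, not a kernel tool.

```shell
# Hypothetical helper: read Kconfig text on standard input and print one
# line per option in the form "SYMBOL: type prompt".
list_options() {
    awk '
        $1 == "config" { sym = $2 }
        $1 ~ /^(bool|tristate|int|hex|string)$/ && sym != "" {
            type = $1
            prompt = $0
            sub(/^[ \t]*[a-z]+[ \t]*/, "", prompt)   # strip the type keyword
            print sym ": " type " " prompt
            sym = ""
        }
    '
}

# Abridged entry; the original also carries a help section
list_options <<'EOF'
config DEVMEM
    bool "/dev/mem virtual device support"
    default y
EOF
# prints: DEVMEM: bool "/dev/mem virtual device support"
```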

This configuration item, along with all the others, is stored in a file named .config (note that the leading dot '.' means that it is a hidden file that will not be shown by the ls command unless you type ls -a to show all files). The variable names stored in .config are prefixed with CONFIG_, so if DEVMEM is enabled, the line reads:
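As a quick illustration, an enabled DEVMEM appears in .config as CONFIG_DEVMEM=y, whereas a disabled Boolean is recorded only as a comment. The sketch below checks for the enabled form; config_enabled is a hypothetical helper, and the two-line file stands in for a real .config.

```shell
# Hypothetical helper: succeed if the option named in $1 (without the
# CONFIG_ prefix) is set to y in the configuration file $2.
config_enabled() {
    grep -q "^CONFIG_$1=y\$" "$2"
}

# A two-line stand-in for a real .config
config=$(mktemp)
printf '%s\n' 'CONFIG_DEVMEM=y' '# CONFIG_MTD is not set' > "$config"

config_enabled DEVMEM "$config" && echo "DEVMEM is enabled"
config_enabled MTD "$config" || echo "MTD is not enabled"
rm -f "$config"
```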

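There are several other data types in addition to bool. Here is the list:

  • bool: This is either y or not defined.
  • tristate: This is used where a feature can be built as a kernel module or built into the main kernel image.
  • int: This is an integer value written in decimal notation.
  • hex: This is an unsigned integer value written in hexadecimal notation.
  • string: This is a string value.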

There may be dependencies between items, expressed by the depends on phrase, as shown here:

If CONFIG_MTD has not been enabled elsewhere, this menu option is not shown and so cannot be selected.

There are also reverse dependencies: the select keyword enables other options if this one is enabled. The Kconfig file in arch/$ARCH has a large number of select statements that enable features specific to the architecture, as can be seen here for arm:
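The following sketch shows both kinds of relationship side by side. The two entries are abridged reconstructions (the real ones live in drivers/mtd/Kconfig and arch/arm/Kconfig, and the list of select lines for arm is much longer and varies between versions), and show_relations is a hypothetical helper that handles only the simplest form of each statement.

```shell
# Hypothetical helper: read Kconfig text on standard input and report each
# "depends on" and "select" line next to the option it belongs to. Only the
# first symbol of a dependency expression is printed.
show_relations() {
    awk '
        $1 == "config" { sym = $2 }
        $1 == "depends" && $2 == "on" { print sym " depends on " $3 }
        $1 == "select" { print sym " selects " $2 }
    '
}

show_relations <<'EOF'
config MTD_CMDLINE_PARTS
    tristate "Command line partition table parsing"
    depends on MTD

config ARM
    bool
    default y
    select ARCH_HAVE_CUSTOM_GPIO_H
    select GENERIC_IRQ_SHOW
    select RTC_LIB
EOF
```

The first entry is hidden unless MTD is enabled, while enabling ARM forces the three selected options on.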

There are several configuration utilities that can read the Kconfig files and produce a .config file. Some of them display the menus on screen and allow you to make choices interactively. menuconfig is probably the one most people are familiar with, but there are also xconfig and gconfig.

You launch each one via make, remembering that, in the case of the kernel, you have to supply an architecture, as illustrated here:
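For an ARM target, the command would be make ARCH=arm menuconfig, run from the top of the kernel source tree. Because it is easy to forget the ARCH setting, one option is a small wrapper such as the sketch below; kconfig_ui is a hypothetical name, and the default of arm is an assumption made for this example.

```shell
# Assumption: the target architecture is arm unless ARCH is already set
ARCH=${ARCH:-arm}

# Hypothetical wrapper: launch a configuration utility with ARCH supplied.
# Defaults to menuconfig; pass xconfig or gconfig to use those instead.
kconfig_ui() {
    make ARCH="$ARCH" "${1:-menuconfig}"
}

# Usage, from the top of the kernel source tree:
#   kconfig_ui            # runs make ARCH=arm menuconfig
#   kconfig_ui xconfig    # runs make ARCH=arm xconfig
```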

Here, you can see menuconfig with the DEVMEM config option, described earlier, highlighted:

The star (*) to the left of an item means that it is selected (="y") or, if it is an M, that it has been selected to be built as a kernel module.

With so many things to configure, it is unreasonable to start with a clean sheet each time you want to build a kernel, so there is a set of known working configuration files in arch/$ARCH/configs, each containing suitable configuration values for a single SoC or a group of SoCs. You can select one with make [configuration file name]. For example, to configure Linux to run on a wide range of SoCs using the ARMv7-A architecture, which includes the BeagleBone Black AM335x, you would type:
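In the mainline tree, the multi-platform ARMv7 configuration is named multi_v7_defconfig, so the command is make ARCH=arm multi_v7_defconfig. The sketch below wraps that in a check that the file really exists under arch/$ARCH/configs; apply_defconfig is a hypothetical helper, not a kernel target.

```shell
# Hypothetical helper: apply a default configuration, but only if the file
# exists under arch/<arch>/configs in the current directory.
apply_defconfig() {
    arch=$1
    name=$2
    if [ ! -f "arch/$arch/configs/$name" ]; then
        echo "arch/$arch/configs/$name not found; run from the top of a kernel tree" >&2
        return 1
    fi
    make ARCH="$arch" "$name"
}

# Usage, from the top of the kernel source tree:
#   apply_defconfig arm multi_v7_defconfig
```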

This is a generic kernel that runs on various different boards. For a more specialized application, for example when using a vendor-supplied kernel, the default configuration file is part of the board support package; you will need to find out which one to use before you can build the kernel.

There is another useful configuration target named oldconfig. This takes an existing .config file and asks you to supply configuration values for any options that don't have them. You would use it when moving a configuration to a newer kernel version: copy .config from the old kernel to the new source directory and run make ARCH=arm oldconfig to bring it up to date. It can also be used to validate a .config file that you have edited manually (ignoring the text Automatically generated file; DO NOT EDIT that occurs at the top: sometimes it is OK to ignore warnings).
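The two steps of moving a configuration to a newer kernel can be sketched as a single function. The name migrate_config and the default architecture of arm are assumptions made for this example.

```shell
# Hypothetical helper: copy an existing configuration into a newer source
# tree and let oldconfig ask about any options that are new to that version.
migrate_config() {
    old_config=$1
    new_tree=$2
    cp "$old_config" "$new_tree/.config" &&
    make -C "$new_tree" ARCH="${ARCH:-arm}" oldconfig
}

# Usage (the two directory names are placeholders):
#   migrate_config old-kernel/.config new-kernel
```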

If you do make changes to the configuration, the modified .config file becomes part of your device's software and needs to be placed under source code control.

When you start the kernel build, a header file, include/generated/autoconf.h, is generated which contains a #define for each configuration value so that it can be included in the kernel source, exactly as with U-Boot.
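To give a feel for the translation, the sketch below turns .config lines into #define lines in the style of autoconf.h. It is a rough imitation written for this example, not the tool that Kbuild actually runs, and config_to_defines is a hypothetical name.

```shell
# Hypothetical helper: read .config text on standard input and print a
# #define for each value that is set. Options set to y become 1, options
# set to m get a _MODULE suffix, and other values are copied through.
config_to_defines() {
    awk '
        /^CONFIG_[A-Za-z0-9_]+=/ {
            name = $0
            sub(/=.*/, "", name)
            value = substr($0, length(name) + 2)
            if (value == "y")      print "#define " name " 1"
            else if (value == "m") print "#define " name "_MODULE 1"
            else                   print "#define " name " " value
        }
    '
}

config_to_defines <<'EOF'
CONFIG_DEVMEM=y
# CONFIG_MTD is not set
CONFIG_LOCALVERSION="-melp-v1.0"
EOF
# prints:
#   #define CONFIG_DEVMEM 1
#   #define CONFIG_LOCALVERSION "-melp-v1.0"
```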

You can discover the kernel version that you have built using the make kernelversion target:

This is reported at runtime through the uname command and is also used in naming the directory where kernel modules are stored.

If you change the configuration from the default, it is advisable to append your own version information. You do this by setting CONFIG_LOCALVERSION, which you will find in the General setup configuration menu. It is also possible (but discouraged) to do the same by editing the top-level Makefile and appending it to the line that begins with EXTRAVERSION. As an example, if I wanted to mark the kernel I am building with an identifier melp and version 1.0, I would define the local version in the .config file like this:

Running make kernelversion produces the same output as before but now, if I run make kernelrelease, I see:
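The relationship between the two targets can be sketched as follows, assuming the local version was written as CONFIG_LOCALVERSION="-melp-v1.0" (the exact string is an assumption for this example). The real kernelrelease target can append further suffixes depending on other settings, so this only shows the basic concatenation; local_version is a hypothetical helper.

```shell
# Hypothetical helper: extract the value of CONFIG_LOCALVERSION from a
# configuration file, without the surrounding quotes.
local_version() {
    sed -n 's/^CONFIG_LOCALVERSION="\(.*\)"$/\1/p' "$1"
}

# A one-line stand-in for a real .config
config=$(mktemp)
echo 'CONFIG_LOCALVERSION="-melp-v1.0"' > "$config"

KERNELVERSION=4.1.10   # the value make kernelversion reports for this tree
echo "${KERNELVERSION}$(local_version "$config")"   # prints 4.1.10-melp-v1.0
rm -f "$config"
```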

It is also printed at the beginning of the kernel log:

I can now identify and track my custom kernel.

I have mentioned kernel modules several times already. Desktop Linux distributions use them extensively so that the correct device and kernel functions can be loaded at runtime depending on the hardware detected and features required. Without them, every single driver and feature would have to be statically linked into the kernel, making it infeasibly large.

On the other hand, with embedded devices, the hardware and kernel configuration are usually known at the time the kernel is built, so modules are not so useful. In fact, they cause a problem because they create a version dependency between the kernel and the root filesystem, which can cause boot failures if one is updated but not the other. Consequently, it is quite common for embedded kernels to be built without any modules at all. Here are a few cases where kernel modules are a good idea: