Creating Directory Data

In the previous section we looked at the two LDAP daemons, SLAPD and SLURPD. Although we already have a directory running, it does not yet contain any entries (other than the ones that are created by SLAPD, such as schema records and the root DSE).

In this section we will create a file for holding our LDAP data, and we will devise some directory entries to go in this file. In the next section we will load the data into the directory.

Throughout this book we look at examples of LDAP records presented in plain text, with each line having an attribute description, followed by a colon and a value. The first line of the record is the DN, and usually the last lines of the record are the object class attributes:

dn: uid=bjensen,dc=example,dc=com
cn: Barbara Jensen
mail: bjensen@example.com
uid: bjensen
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson

This format is the standard way of representing LDAP directory entries in a text file. It is an example of a record written in the LDAP Data Interchange Format (LDIF), version 1.

The LDIF standard defines a file format not only for representing the contents of a directory, but for representing certain LDAP operations, such as additions, changes, and deletions. In the section on the ldapmodify client, we will use LDIF to specify changes to records in the directory server, but right now we are interested in creating a file that represents the contents of our directory.

An LDIF file consists of a list of records, each of which represents an entry in the directory. Each entry must have a DN (since any LDAP entry requires a DN), and then one or more attributes or change records (add, modify, delete, modrdn, moddn). For now we will confine ourselves to attributes, and put off discussion of change records until we discuss ldapmodify.

Records are separated by empty lines, and each record must begin with a DN:

# First Document: "On Liberty" by J.S. Mill
dn: documentIdentifier=001,dc=example,dc=com
documentIdentifier: 001
documentTitle: On Liberty
documentAuthor: cn=John Stuart Mill,dc=example,dc=com
objectClass: document
objectClass: top

# Second Document: "Treatise on Human Nature" by David Hume
dn: documentIdentifier=002,dc=example,dc=com
documentIdentifier: 002
documentTitle: Treatise on Human Nature
documentAuthor: cn=David Hume,dc=example,dc=com
objectClass: document
objectClass: top

Lines that begin with a pound or number sign (#) are treated as comments, and ignored. Note that the pound sign must be the first character on the line, not preceded by any whitespace characters.

It is customary for records to end with the objectClass attributes because this is considered easier to read, but there is no requirement to do so. Apart from the dn line, which must come first, the order of attributes in an LDIF record is inconsequential.

An object class (which is defined in a schema definition) indicates what type or types of object the record represents. In the previous example, the two records are both documents. The object class definition determines which attributes are required, and which are merely allowed. When authoring an LDIF file you will need to know which fields are required. The DN of any entry is, of course, required, as is the objectClass attribute. In the top object class, which represents the root of the schema hierarchy, there are no required fields other than objectClass. The document object class definition requires documentIdentifier, and allows eleven additional fields, including documentTitle (which takes a string value) and documentAuthor (which takes a DN value, pointing to another record in the directory).

Let's look at the list of attributes for the document and documentSeries object classes:

[Figure: attributes of the document and documentSeries object classes]

The attributes used in an entry's relative DN (RDN), the leftmost component of the DN, must be present in the record. For example, consider the case where the base DN is dc=example,dc=com. An entry with the DN cn=Matt,dc=example,dc=com would have to have a cn attribute with the value Matt. In the previous examples, since documentIdentifier is used in the DN, there must be a matching documentIdentifier attribute in the record.

Likewise, an entry with the DN cn=Matt,ou=Users,dc=example,dc=com must have the attribute cn: Matt. The ou=Users component names the parent entry rather than Matt's own record, so the directory server does not require an ou: Users attribute here, though the user records later in this section include one anyway.

Not all attribute values are simple and short ASCII strings. LDIF provides facilities for encoding more complex types of data.

Sometimes attribute values won't fit on one line. A value that is too long can be continued on the next line, provided that the continued line begins with a single space character. The parser strips that space when it joins the lines back together.
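A sketch of a folded value, using a hypothetical document record (the identifier, title, and description are made up for illustration). Each continuation line starts with two spaces: the first marks the fold and is discarded, and the second is part of the value, so the words stay separated when the lines are joined:

```ldif
# Hypothetical record: the description value is folded across three lines.
dn: documentIdentifier=003,dc=example,dc=com
documentIdentifier: 003
documentTitle: A Long Title
description: This description is too long to fit comfortably on one line,
  so it continues here. The first space on each continuation line marks
  the fold, and the second space is part of the value.
objectClass: document
objectClass: top
```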

According to the RFC, an LDIF file can only contain characters in the ASCII character set. However, characters that are not in ASCII can be represented in LDIF using a base-64 encoded value. An attribute whose value is base-64 encoded is written slightly differently: the attribute description is followed by two colons instead of one.
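As a sketch, here is a hypothetical document record whose description is the string " test" (with a leading space), base-64 encoded as IHRlc3Q=. The record itself is invented for illustration; only the double colon matters:

```ldif
# Hypothetical record: note the double colon after the attribute description.
dn: documentIdentifier=004,dc=example,dc=com
documentIdentifier: 004
description:: IHRlc3Q=
objectClass: document
objectClass: top
```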

You should consider base-64 encoding under the following circumstances:

  • When the attribute value contains binary data (such as a JPEG photo).
  • When the character set is not ASCII. Generally, the directory data should be stored in UTF-8, but to remain compliant with the LDIF standard, such values should be base-64 encoded.
  • When there are line breaks or other non-printing characters within the value. (Note that for such values to be accepted the schema must allow these characters or the directory server will not allow them to be uploaded even if they are encoded.)
  • When the value begins with or ends with whitespace characters (that you want preserved), or begins with a colon (:) or a less-than sign (<).

Even DNs can be base-64 encoded, and you can use UTF-8 characters in a DN as long as the DN is base-64 encoded.

There are several UNIX/Linux utilities that can be used to base-64 encode values. A popular one is the uuencode program that comes in the sharutils package. However, this program is not installed by default in Ubuntu. You can install it quickly from the command line with apt-get:

  $ sudo apt-get install sharutils

Once sharutils is installed you can encode a value with uuencode:

$ echo -n " test" | uuencode -m name
begin-base64 644 name
IHRlc3Q=
====

In this example we are converting the string " test" (note the leading white space) into a base-64 encoded string. This is done with a couple of commands on the command line (using the Bash shell in this example).

The uuencode command is typically used to encode files for attachment to an email message, so we have to do a little work to get it to operate the way we want. First, we echo the string that we want to encode. The echo program, by default, adds a newline character onto the end of the string that it echoes. We use the -n flag to prevent it from adding the newline character.

The string " test" is echoed to the standard output (/dev/stdout), and then piped (|) into the uuencode command. The -m flag instructs uuencode to use base-64 encoding, and the name string is used by uuencode to generate a name for the attachment. While this is useful when using uuencode to generate email attachments, it serves no purpose for us. Since we are not attaching this file to anything it doesn't really matter what you put there; foo would work equally well.

The uuencode program then prints three lines of output:

begin-base64 644 name
IHRlc3Q=
====

Only the second line of the output, the actual base-64 encoded value, matters to us. We can copy IHRlc3Q= and paste it into our LDIF file.

In some cases, inserting a lengthy attribute value (such as the entire base-64 encoded image file, or even a lengthy bit of text) into the LDIF file would make the file too large to efficiently edit with a text editor. Even a small image file would be hundreds of characters long when base-64 encoded. Instead of inserting the base-64 encoded string directly into the file you can use a special file reference, and the contents of the file will be retrieved and loaded into the directory when the LDIF file is imported.

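A file reference is written by following the attribute description with a colon and a less-than sign (:<), and then giving the location of the file as a URL. There are two important features to note: the less-than sign tells the LDIF parser that what follows is a reference rather than a literal value, and the file location must be expressed as a file:// URL rather than as a bare path.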
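A minimal sketch of such a reference follows. The path /home/matt/photo.jpg is a hypothetical location chosen for illustration; substitute the real absolute path of the image on the machine where the file is imported:

```ldif
# Hypothetical path: the JPEG is read from disk when the LDIF file is imported.
dn: uid=matt,ou=Users,dc=example,dc=com
uid: matt
cn: Matt Butcher
sn: Butcher
jpegPhoto:< file:///home/matt/photo.jpg
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
```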

In cases where you have attribute values in multiple languages you can store language information along with the attribute description:

dn: documentIdentifier=006,dc=example,dc=com
documentIdentifier: 006
documentTitle;lang-en: On Generation and Corruption
documentTitle;lang-la: De Generatione et Corruptione
documentAuthor: cn=Aristotle,dc=example,dc=com
objectClass: document
objectClass: top

The language information is stored in the directory, and clients will be able to use it to display the language appropriate to the locale.

This covers the basics of the LDIF file format. Now we will move on and create an LDIF file to load into the directory.

Now we are ready to model our directory tree in an LDIF file. The first thing to do is to decide on a directory structure. We are going to represent an organization in our directory tree. Of course the possibilities for the types of trees you can model are boundless, but we will stick to those most commonly used for LDAP directories.

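There are two popular ways of defining the root of an organizational directory tree: by organization and country, using the o and c attributes, or by domain components, using dc attributes that mirror the organization's domain name.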
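As a rough illustration, the root DNs in the two styles might look like this. Only the dn lines are shown, not complete records, and the organization name and country code in the first form are assumptions made up for the example:

```ldif
# Organization/country form (hypothetical organization name and country)
dn: o=Example Company,c=US

# Domain component form
dn: dc=example,dc=com
```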

Using the organization/country configuration has its advantages. Corporations with multiple domains may find this form more appealing. But the second form, relying upon domain components instead, has become much more prevalent. In most circumstances, I prefer the domain component form because it is more closely related to the way much information is referenced on the Internet.

Of course, there is no hard and fast rule about how exactly the DN must be structured, and you may find other base DN structures more appealing.

Now that we have chosen a base DN style, let's begin building a directory for Example.Com. LDIF files are read sequentially, record by record. So, the base DN must come first, since all other records will refer to it in their DNs. Likewise, as we build the directory information tree, we will need to make sure that the parent entries always appear in the file before their children.

Our base DN looks like this:
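The record is reproduced here from the complete basics.ldif listing at the end of this section:

```ldif
# This is the root of the directory tree
dn: dc=example,dc=com
description: Example.Com, your trusted non-existent corporation.
dc: example
o: Example.Com
objectClass: top
objectClass: dcObject
objectClass: organization
```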

Let's start from the bottom and work backwards through the example. The record has three object classes: top, dcObject, and organization. As we have seen already, the top object class is the root of the hierarchy of object classes, and all records within the directory are in the top object class.

Here is the figure displaying the object classes:

[Figure: the top, dcObject, and organization object classes]

The dcObject object class simply describes domain components—pieces of a domain name. The domain www.packtpub.com, for example, has three domain components: www, packtpub, and com. Since we are using domain components in the DN, we need the dcObject class, which requires one attribute: dc.

You may notice that while in the DN there are two dc attributes (dc=example and dc=com), there is only one (dc:example) listed in the record. While it seems counter-intuitive at first glance, the reason is actually straightforward. The record is not describing the entire domain—just a single domain component (example). Like a DNS record, the parent component (com) refers to another entity somewhere else in a great big hierarchy.

So, each record that uses the dcObject object class can describe only one domain component, and hence have only one dc attribute in the record (though the DN may have multiple dc attributes, specifying in which part of the domain hierarchy this record resides).

But is the dc=com record supposed to be in our directory? Since the root of this directory (as specified in the slapd.conf file) is dc=example,dc=com, we would not expect to find the dc=com record within the database, as it is not under the dc=example,dc=com part of the tree (rather, dc=com is above, or superior to, this part of the tree).

Now we have specified what domain component our record describes. But we still need a little more. We can't just have a record with top and dcObject object classes for two reasons—one practical and the other technical.

Practically speaking, the record would not be particularly useful with just this sparse information, as it wouldn't really tell us anything about the base of the directory tree (other than that it has a domain name).

Technically speaking, neither of the two object classes, top and dcObject, is sufficient for a complete record. The reason for this is that neither of these object classes is a structural object class (top is abstract, and dcObject is auxiliary), and every record in the directory must have one object class that is considered the structural object class for that record. For a detailed explanation, as well as some useful information about structuring records, see Chapter 6.

What would make our base record more useful (and fulfill the record's requirement to have a structural object class)? The organization object class describes an organization, as the name suggests. It requires one field, o (or its synonym, organizationName), which is used to specify the (legal) name of the organization. Additionally the organization object class allows twenty-one optional fields that provide more detailed information about the organization, such as postalAddress, telephoneNumber, and l (locality). In the previous example we used the description field, which is also among the twenty-one attributes allowed by the organization object class.

That is our base entry for our directory. It describes the record at the root of our directory information tree. Next we want to add some structure to our directory.

One of the strengths of LDAP's directory server model is its ability to represent data organized into hierarchies. In this section, we will use Organizational Units (OUs) to create several subtrees beneath our dc=example,dc=com root.

Our Example.Com directory is intended primarily for holding user and account information. For that reason, we will want to use Organizational Units to create subtrees.

OpenLDAP does not provide a default OU subtree structure, so you will need to create your own. This can be done in many ways, but here we will see the two prominent theories of how OUs should be structured.

The first theory is that the directory should be structured to represent the organizational chart of the organization you are modeling. For example, if the organization has three main units, namely Accounting, Human Resources (HR), and Information Technology (IT), then you should have three OUs. The following figure shows this structure:

[Figure: Theory 1, the directory as an organizational chart]

In this figure, each OU represents a unit in the organizational chart. Employees who work in Accounting will have their user accounts in the directory subtree ou=Accounting,dc=example,dc=com, while employees in IT will have accounts in ou=IT,dc=example,dc=com.

This method has some obvious advantages. Knowledge of how the organization works will help you locate information in the directory. Conversely, the directory will serve as a tool for understanding how the organization is structured. Organizational relationships between people or records in the directory will be more easily ascertained. For example, a glance at the record (or just the DN) of uid=Marvin,ou=Accounting,dc=example,dc=com, and you will know that Marvin works in the same department as Barbara.

There are a few things to consider before structuring your directory this way though:

The second theory is that the directory should be structured to represent the way your system (networks, servers, user applications) will need to access the records. In this case the structure of the LDAP directory should be optimized for use by such IT services. While the organizational chart technique groups records by their relation to the organization, this method groups records into functional units, where a position in the directory is determined primarily by the tasks that applications and services will require the directory to perform.

One common way to structure the directory is to split it into a unit for users, a unit for groups, and a unit for system-level records that applications need, but users will not require access to. Let's see an example:

[Figure: Theory 2, the directory as an IT service]

In this case, all of the user accounts are under a particular subtree of the directory: ou=Users,dc=example,dc=com. Applications need only search in one part of the directory to find user accounts, and when the organization changes, the structure of the directory need not also change.

This method also has some drawbacks. First, the directory structure does not, by design, provide any overt clues to the structure of the organization. Of course organizational information, such as department IDs, can be stored in individual records, and so can be retrieved that way.

More importantly though, if the directory supports a large number of users, the ou=Users branch is going to have a lot of records. This is not necessarily a performance problem, but it can make browsing the directory (as opposed to searching the directory) a tedious process.

In some cases, this problem is mitigated by adding additional subtrees under the Users branch. Sometimes this is done by creating a hybrid configuration where ou=Users has subtrees that represent departments in the organization, such as ou=Accounting,ou=Users,dc=example,dc=com. Sometimes other classification systems, such as alphabetical schemes, are used to handle this situation: uid=matt,ou=m-p,ou=Users,dc=example,dc=com.

But in small and medium-sized directories, the Users branch typically does not have any additional subtrees, which eases the process of integrating with other applications.

LDAP also has object classes designed to describe groups of records in the directory. Usually, it does not make sense to store these in with the user accounts, so they can be moved to a separate branch.

Finally, the System branch is used to store records for things like system accounts. Mail servers, web servers, and other miscellaneous applications often need (or perform best with) their own LDAP accounts. But if it can be helped, these accounts shouldn't be grouped in with user accounts.

I've outlined two different ways of structuring the directory information tree—one mirroring the organization, and the other facilitating IT services. But these are only two ways of structuring the directory. You may find that other structures meet your needs better. However, for our purposes, we will use the IT services structure as we continue to build our LDIF file.

Now we are ready to write out our chosen OUs in LDIF. We will create three OUs—Users, Groups, and System—as follows:
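The three records below are reproduced from the complete basics.ldif listing at the end of this section:

```ldif
# Subtree for users
dn: ou=Users,dc=example,dc=com
ou: Users
description: Example.Com Users
objectClass: organizationalUnit

# Subtree for groups
dn: ou=Groups,dc=example,dc=com
ou: Groups
description: Example.Com Groups
objectClass: organizationalUnit

# Subtree for system accounts
dn: ou=System,dc=example,dc=com
ou: System
description: Special accounts used by software applications.
objectClass: organizationalUnit
```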

The three OUs have the same structure.

Each OU must have the organizationalUnit object class. This object class has one required attribute: ou. Here is a figure displaying the organizationalUnit:

[Figure: the organizationalUnit object class]

The description attribute is optional and there are more than twenty additional (optional) attributes that can be added, most of which provide contact information for the organizational unit, such as telephoneNumber, postOfficeBox, and postalAddress.

With our OUs in place we are ready to add a third tier to our directory tree. Before we start creating individual records let's get an overview of what this next tier will look like. Here is the directory tree structure with a group, a system account, and a pair of users:

[Figure: the directory tree with a group, a system account, and two users]

This is the directory information tree that we will create in the remainder of this section. Next, we will continue building an LDIF file first by adding the users, followed by a system record, and then a group.

We will reserve the Users OU for records that describe people in the organization. In these accounts we want to store information about the user—things like first and last name, title, and department. Since the directory will also be a central resource for application information, we also want to store user ID, email address, and password.

A basic user record looks like this:
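Barbara's record is reproduced here from the complete basics.ldif listing at the end of this section:

```ldif
# Barbara Jensen:
dn: uid=barbara,ou=Users,dc=example,dc=com
ou: Users
uid: barbara
sn: Jensen
cn: Barbara Jensen
givenName: Barbara
displayName: Barbara Jensen
mail: barbara@example.com
userPassword: secret
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
```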

The user record for Barbara belongs to three object classes: person, organizationalPerson, and inetOrgPerson. All three of these are structural object classes, where inetOrgPerson is a child of the organizationalPerson class, which, in turn, is a child of the person object class. The attributes in Barbara's record are a mixture of the required and allowed attributes from the three object classes. The following figure displays the attributes in Barbara's record:

[Figure: the attributes in Barbara's record]

Since inetOrgPerson inherits from organizationalPerson, a record that has the inetOrgPerson object class also must have the organizationalPerson object class. And organizationalPerson inherits from the person object class, so person is also required.

This means that all of the inetOrgPerson records will require cn (the user's full name) and sn (the user's surname) attributes, as all inetOrgPerson records are also person records. It also means that the record can have any combination of the forty-nine optional attributes defined between the three object classes.

An inetOrgPerson record that utilizes more of the available attributes might look like this:
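Matt's record is reproduced here from the complete basics.ldif listing at the end of this section:

```ldif
# Matt Butcher
dn: uid=matt,ou=Users,dc=example,dc=com
ou: Users
# Name info:
uid: matt
cn: Matt Butcher
sn: Butcher
givenName: Matt
givenName: Matthew
displayName: Matt Butcher
# Work Info:
title: Systems Integrator
description: Systems Integration and IT for Example.Com
employeeType: Employee
departmentNumber: 001
employeeNumber: 001-08-98
mail: mbutcher@example.com
mail: matt@example.com
roomNumber: 301
telephoneNumber: +1 555 555 4321
mobile: +1 555 555 6789
st: Illinois
l: Chicago
street: 1234 Cicero Ave.
# Home Info:
homePhone: +1 555 555 9876
homePostalAddress: 1234 home street $ Chicago, IL $ 60699-1234
# Misc:
userPassword: secret
preferredLanguage: en-us,en-gb
# Object Classes:
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
```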

In this example we are still using the same three object classes, but have selected many more of the optional attributes. One thing that may stand out in both Barbara's and Matt's records is that there are an awful lot of attributes used simply for specifying the name of the person: cn, sn, givenName, and displayName are all fields related to the person's name. What's the point in having so many? There are two benefits achieved by providing diverse name fields:

In the previous examples the userPassword field, which contains the person's password, is in plain text. When this file is loaded into the directory, the value will be base-64 encoded, but it will not be encrypted. It is not at all secure to store clear-text passwords in the directory (and base-64 encoding does not improve the security of the password). Later in this chapter we will look at the ldappasswd tool, which encrypts passwords before storing them in the directory. Production directories should always store the userPassword value in encrypted form.

Both of these examples use the inetOrgPerson object class as their primary structural object class. This is because these records describe a person and use the uid attribute (and use it as part of the DN). Additionally, inetOrgPerson provides a number of attributes that are useful for modern information infrastructures; jpegPhoto, preferredLanguage, and displayName (amongst others) are all intended to be used primarily by modern computer agents rather than humans. As it is standardized and widely deployed (LDAP servers from Sun to Microsoft use it), it is the preferred object class for describing people within an organization.

Thus far we have created a base DN entry, some organizational units, and a few users. Now we will add a record describing a system account.

Some of the entries in our tree—entries that we will need—do not describe users, and so do not belong in the Users organizational unit (OU). Instead, we will put such special records in the System OU. Likewise, the entities we are describing are not people, and so using the person, organizationalPerson, and inetOrgPerson object classes is not appropriate.

In this section we will create a new record for an account that will assist users in logging in. The function of the account will be described in detail in Chapter 4. For now, it is enough to know that this account will need to be able to authenticate to the directory server and perform operations. Again, this account is not for a specific person, so it will not have personal data (like a surname or a given name).

Here's what our new system account, called authenticate, looks like:
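The record is reproduced here from the complete basics.ldif listing at the end of this section:

```ldif
# Special Account for Authentication:
dn: uid=authenticate,ou=System,dc=example,dc=com
uid: authenticate
ou: System
description: Special account for authenticating users
userPassword: secret
objectClass: account
objectClass: simpleSecurityObject
```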

This record has two object classes: account and simpleSecurityObject. The first one, account, is the structural object class. An account object, which is defined in the Cosine schema (cosine.schema), describes an account used to access computers or networks. Let's have a look at the two object classes:

[Figure: the account and simpleSecurityObject object classes]

Our account, whose DN is uid=authenticate,ou=System,dc=example,dc=com, uses the uid attribute required by the account object class, as well as the ou and description fields from account. But the account object class does not have a field for storing a password. For that reason we need to add to the record the auxiliary object class simpleSecurityObject, which has one attribute: the required attribute userPassword.

By adding the simpleSecurityObject auxiliary object class, we have now made it possible for our account record to have a password. Again, in our example, we have specified the password (userPassword: secret) in clear text. It is not safe to store unencrypted passwords in the directory. For information on encrypting LDAP passwords, see the section on ldappasswd later in this chapter.

Now we have created some records under two of our three organizational units: Users and System. Next, we will add a group under the Groups OU.

The last record we will add to our LDIF file is a record that describes a group of DNs. Groups provide a flexible method for collecting similar DNs by whatever criterion is needed. The DNs in a group do not have to be structurally similar: they can have completely different attributes and object classes, and can describe completely different things (such as a document and a person). Thus, it is up to the directory administrators and directory applications to decide what sorts of DNs will be grouped into any particular group.

In our case, we are going to create a group to represent our directory administrators, and all of the DNs that belong to this group are DNs for users (in the Users OU, and with the inetOrgPerson structural object class).
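A sketch of the group record follows. The member DNs are written in full, including the ou=Users component, so that they match the DNs of the user records created earlier in this section:

```ldif
# LDAP Admin Group:
dn: cn=LDAP Admins,ou=Groups,dc=example,dc=com
cn: LDAP Admins
ou: Groups
description: Users who are LDAP administrators
uniqueMember: uid=barbara,ou=Users,dc=example,dc=com
uniqueMember: uid=matt,ou=Users,dc=example,dc=com
objectClass: groupOfUniqueNames
```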

Our group has the DN cn=LDAP Admins,ou=Groups,dc=example,dc=com. Note that we use the cn attribute, rather than uid, to identify the group. That is because the groupOfUniqueNames object class does not allow a uid attribute (and cn is required).

The groupOfUniqueNames class is one of three grouping object classes defined in the core LDAP version 3 schema (core.schema). The other two are groupOfNames and organizationalRole.

These are displayed in the following figure:

[Figure: the groupOfNames, groupOfUniqueNames, and organizationalRole object classes]

All three of these object classes are designed for collecting DNs. Each has an attribute that specifies the DN of a member of the group. In groupOfNames, the attribute is called, simply enough, member. The groupOfUniqueNames class, which does not differ in function from groupOfNames, uses uniqueMember as its membership attribute. The organizationalRole grouping class, which is intended to represent the group responsible for performing a particular role in the context of the organization, uses the roleOccupant attribute for membership.

In all three grouping object classes, the membership attribute (member, uniqueMember, or roleOccupant) can be specified multiple times, as we saw in the LDIF snippet for the LDAP Admins group.

The groupOfUniqueNames and groupOfNames object classes both allow the owner attribute, which can also be used more than once (to, for example, model cases where a group has two owners). An owner attribute holds the DN of the record that is considered the owner of the group.

In our example group, which is groupOfUniqueNames, we specified two uniqueMember attributes:
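The two membership lines are shown here as a fragment of the group record, not as a complete record, with each value giving the full DN of a user entry under ou=Users:

```ldif
uniqueMember: uid=barbara,ou=Users,dc=example,dc=com
uniqueMember: uid=matt,ou=Users,dc=example,dc=com
```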

Both of these DNs are members of the group. Note that SLAPD does not actively check to make sure that these DNs exist, nor does it automatically remove a DN from groups when the DN is removed from the directory.

Thus, directory administrators and directory applications must be careful to perform additional verification and cleanup when working with groups. When a DN is deleted from the directory, a directory-wide search for attributes that take DN values should be performed to make sure that attributes such as member, uniqueMember, and roleOccupant (and, for that matter, seeAlso) do not point to the newly-deleted DN.

Finally, we have finished building our LDIF file. We will save it in a file named basics.ldif, since it contains the basic elements of our directory. Here is what it looks like:

# This is the root of the directory tree
dn: dc=example,dc=com
description: Example.Com, your trusted non-existent corporation.
dc: example
o: Example.Com
objectClass: top
objectClass: dcObject
objectClass: organization

# Subtree for users
dn: ou=Users,dc=example,dc=com
ou: Users
description: Example.Com Users
objectClass: organizationalUnit

# Subtree for groups
dn: ou=Groups,dc=example,dc=com
ou: Groups
description: Example.Com Groups
objectClass: organizationalUnit

# Subtree for system accounts
dn: ou=System,dc=example,dc=com
ou: System
description: Special accounts used by software applications.
objectClass: organizationalUnit

##
## USERS
##

# Matt Butcher
dn: uid=matt,ou=Users,dc=example,dc=com
ou: Users
# Name info:
uid: matt
cn: Matt Butcher
sn: Butcher
givenName: Matt
givenName: Matthew
displayName: Matt Butcher
# Work Info:
title: Systems Integrator
description: Systems Integration and IT for Example.Com
employeeType: Employee
departmentNumber: 001
employeeNumber: 001-08-98
mail: mbutcher@example.com
mail: matt@example.com
roomNumber: 301
telephoneNumber: +1 555 555 4321
mobile: +1 555 555 6789
st: Illinois
l: Chicago
street: 1234 Cicero Ave.
# Home Info:
homePhone: +1 555 555 9876
homePostalAddress: 1234 home street $ Chicago, IL $ 60699-1234
# Misc:
userPassword: secret
preferredLanguage: en-us,en-gb
# Object Classes:
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson

# Barbara Jensen:
dn: uid=barbara,ou=Users,dc=example,dc=com
ou: Users
uid: barbara
sn: Jensen
cn: Barbara Jensen
givenName: Barbara
displayName: Barbara Jensen
mail: barbara@example.com
userPassword: secret
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson

# LDAP Admin Group:
dn: cn=LDAP Admins,ou=Groups,dc=example,dc=com
cn: LDAP Admins
ou: Groups
description: Users who are LDAP administrators
uniqueMember: uid=barbara,ou=Users,dc=example,dc=com
uniqueMember: uid=matt,ou=Users,dc=example,dc=com
objectClass: groupOfUniqueNames

# Special Account for Authentication:
dn: uid=authenticate,ou=System,dc=example,dc=com
uid: authenticate
ou: System
description: Special account for authenticating users
userPassword: secret
objectClass: account
objectClass: simpleSecurityObject

In the next section, we will look at the OpenLDAP utilities, and we will use these utilities to load our LDIF file into the directory.