Chapter 12. Planning Active Directory Sites
As part of the design of Active Directory Domain Services, you should examine the network topology and determine whether you need to manage network traffic between subnets or business locations. To manage network traffic related to Active Directory, you use sites, which reflect the physical topology of your network. Every Active Directory implementation has at least one site. An important part of understanding sites involves understanding Active Directory replication. Active Directory uses two replication models: one for replication within sites and one for replication between sites. To plan your site structure, you need a solid understanding of both replication models.
A site is a group of Transmission Control Protocol/Internet Protocol (TCP/IP) subnets that are implemented to control directory replication traffic and isolate logon authentication traffic between physical network locations. Each subnet that is part of a site should be connected by reliable, high-speed links. Any business location connected over slow or unreliable links should be part of a separate site. Because of this, individual sites typically represent the individual local area networks (LANs) within an organization and the wide area network (WAN) links between business locations typically mark the boundaries of these sites. However, you can also use sites in other ways.
Sites do not reflect the Active Directory namespace; domain and site boundaries are separate. From a network topology perspective, a single site can contain multiple TCP/IP subnets. However, a single subnet can be in only one site. This means that the following conditions apply:
- A single site can contain one or more subnets, but each subnet can be a member of only one site.
- A single site can contain one or more domains, or portions of domains, and a single domain can span one or more sites.
As you design the site structure, you have many options. Sites can contain a domain or a portion of a domain. A single site can have one subnet or multiple subnets. Note that replication is handled differently between sites than it is within sites. Replication that occurs within a site is referred to as intrasite replication. Replication between sites is referred to as intersite replication. Each side of a site connection has one or more designated bridgehead servers.
The figure that follows shows an example of an organization that has one domain and two sites at the same physical location. Here, the organization has an East Campus site and a West Campus site. As you can see, the organization has multiple domain controllers at each site. The domain controllers in the East Campus site perform intrasite replication with one another, as do the domain controllers in the West Campus site. Designated servers in each site, referred to as site bridgehead servers, perform intersite replication with one another.
The next example shows an organization that has two physical locations. Here, the organization has decided to use two domains and two sites. The Main site is for the imaginedlands.com domain and the Seattle site is for the sea.imaginedlands.com domain. Again, replication occurs both within and between the sites.
One reason to create additional sites at the same physical location is to control replication traffic. Replication traffic between sites is automatically compressed to 10 to 15 percent of its original size, reducing the amount of traffic passed between sites by 85 to 90 percent. Because network clients try to log on to network resources within their local site first, you can use sites to isolate logon traffic as well.
It’s recommended that each site have at least one domain controller and one global catalog for client authentication. For name resolution and IP address assignment, it’s also recommended that each site have at least one Domain Name System (DNS) server and one Dynamic Host Configuration Protocol (DHCP) server. By creating multiple sites in the same physical location and establishing a domain controller, a global catalog, and a DNS and DHCP server within each site, you can closely control the logon process.
You should also design sites with other network resources in mind, including Distributed File System (DFS) file shares, certificate authorities, and Microsoft Exchange servers. You want to configure sites so that clients’ network queries can be answered within the site. If every client query for a network resource has to be sent to a remote site, there could be substantial network traffic between sites, which could be a problem over slow WAN links.
NOTE Enterprises often have branch offices where each branch office is defined as a separate site to control traffic for high-bandwidth-consuming applications rather than Active Directory replication. Here, traffic for high-bandwidth-consuming applications, such as DFS or software control and change management, is carefully managed, but authentication and global catalog traffic is allowed to cross the WAN because it is less bandwidth intensive.
Most organizations implementing Active Directory have multiple domain controllers. The domain controllers might be located in a single server room where they are all connected to a fast network, or they might be spread out over multiple geographic locations, from which they are connected over a WAN that links the company’s various office locations.
All domain controllers in the same forest—regardless of how many domain controllers there are and where the domain controllers are located—replicate information with one another. Although more replication is performed within a domain than between domains, replication between domains still occurs. The same replication model is used in both cases.
When a change is made to a domain partition in Active Directory, the change is replicated to all domain controllers in the domain. If the change is made to an attribute of an object tracked by the global catalog, the change is replicated to all global catalog servers in all domains of the forest. Similarly, if you make a change to the forestwide configuration or schema partitions, these changes are replicated to all domain controllers in all the domains of the forest.
Authentication within and between domains is also handled by domain controllers. If a user logs on to his or her home domain, the local domain controller authenticates the logon. If a user logs on to a domain other than the home domain, the logon request is forwarded through the trust tree to a domain controller in the user's home domain.
Active Directory's replication model is designed for consistency, but the consistency is loosely enforced. By this, I mean that at any given moment the information on one domain controller can differ from the information on another domain controller. This happens when changes made on the first domain controller have not yet replicated to the other. Over time, Windows Server replicates the changes made on each domain controller to all the other domain controllers as necessary.
When multiple sites are involved, the replication model is used to store and then forward changes as necessary between sites. In this case, a domain controller in the site where the changes were originally made forwards the changes to a domain controller in another site. This domain controller, in turn, stores the changes and then forwards the changes to all the domain controllers in the second site. In this way, the domain controller on which a change is made doesn't have to replicate directly with all the other domain controllers. Instead, it can rely on the store-and-forward technique to ensure that the changes are replicated as necessary.
When trying to determine site boundaries, you should configure sites so that they reflect the physical structure of your network. Use connectivity between network segments to determine where you should locate site boundaries. Areas of the network that are connected with fast connections should all be part of the same site—unless you have specific requirements for controlling replication or the logon process. Areas of the network that are connected with limited bandwidth or unreliable links should be part of different sites.
As you examine each of the organization's business locations, determine whether you need to place writable domain controllers, read-only domain controllers, or other network resources at that location. If you elect not to place a domain controller at a remote location, you can't make the location a part of a separate site. Not making the location a separate site has the following advantages:
This approach also has the following disadvantages:
In the end, the decision to establish a separate site might come down to the user experience and the available bandwidth. If you have fast connections between sites—which should be dedicated and redundant—you might not want to establish a separate site for the remote business location. If you have limited bandwidth between business locations and want to maintain the user experience, you might want to establish a separate site and place domain controllers and possibly other network resources at the site. This speeds up the logon and authentication process and allows you to better control the network traffic between sites.
When you are planning the site structure, you need to understand how replication works. As discussed previously, Active Directory uses two replication models, each of which is handled differently. The intrasite replication model is used for replication within sites and is optimized for high-bandwidth connections. The intersite replication model is used for replication between sites and is optimized for limited-bandwidth connections. Before I get into the specifics of replication and the replication models, let's look at the way replication has changed from its early implementations to the present.
The replication model used for current Windows Server versions has changed in several important ways from the model first implemented. Understanding these changes can inform the way you deploy and work with Active Directory. It can also help ensure that outdated guidance isn’t driving configuration decisions.
Originally, the smallest unit of replication was an individual attribute. Upon first examination, this seems to be what you'd want; after all, you don't want to have to replicate an entire object if only one attribute of that object has changed. The problem with this approach is that some attributes are multivalued; that is, they have multiple values. An example is the membership attribute of a universal group, which represents all the members of the universal group.
As a result of this design oversight, when you added or removed a single user from the group, you caused the entire group membership to be replicated. In large organizations, a significant amount of replication traffic was often generated because universal groups might have several thousand members. Current Active Directory architecture resolves this problem with linked-value replication, which replicates only an attribute's updated values. With universal group membership, this means that only the users you've added or removed are replicated, rather than the entire group membership.
Intersite replication has also changed. You can turn off compression for intersite replication and enable notification for intersite replication. An improved knowledge consistency checker (KCC) allows Active Directory to support a greater number of sites. These changes affect intersite replication in the following key ways:
NOTE To turn off compression or enable notification, you need to edit the related site link or connection object. See “Configuring Advanced Site-Link Options” in Chapter 13.
Windows Server 2008 R2 and later support improved load balancing to distribute the workload more evenly among bridgehead servers. Prior to Windows Server 2008 R2, inbound connections from other sites primarily targeted a single bridgehead server in a site, even if multiple bridgehead servers were available. Windows Server 2008 R2 and later have load-balancing improvements that help ensure that inbound connections are more evenly balanced when there are multiple bridgehead servers.
Because improved load balancing is a feature of the operating system and doesn't require a Windows Server 2008 R2 or higher forest or domain functional level, you can start taking advantage of the improvements simply by upgrading bridgehead servers. It's important to point out that intrasite replication algorithms have not changed—only intersite replication algorithms have changed. This means that these improvements do not apply to intrasite replication. Additionally, this load balancing occurs between two sites and doesn't extend outward in a spanning tree. Thus, the KCC doesn't take into account other sites when load-balancing connections between two sites.
The way load balancing works with multiple domains is slightly different from how it works in a single-domain environment. This is because an existing connection is always used instead of creating a new one, even if the connection is for a different naming context. Thus, with multiple domains it might appear that load balancing isn't working properly when in fact it is.
The KCC can still have unbalanced connections, such as when domain controllers go offline for extended periods. This imbalance can occur because the KCC does not rebalance connections when offline domain controllers come back online. Instead, the KCC prefers to maintain a stable topology rather than try to rebalance it.
TIP You can manually force load balancing. To do this, start by deleting the inbound intersite connections for a domain controller or site. Next, either wait for the KCC to run automatically (which will occur within 15 minutes) or manually run the KCC by entering the following command at an administrator prompt: repadmin /kcc.
Don't run the KCC at all your sites simultaneously. If you do, the inbound connections might all choose the same bridgehead server, because the system clock seeds the probabilistic choices for inbound connections. To avoid this problem, ensure that there is at least a one-second interval between the times you start the KCC in each site.
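As a concrete illustration, the following PowerShell sketch deletes a domain controller's inbound connection objects and then reruns the KCC. The server and site names are placeholders for this example, and the sketch assumes the ActiveDirectory module and the repadmin utility are available:

# Find and delete the inbound connection objects for a domain controller.
# The server and site names below are illustrative placeholders. This simple
# filter removes all inbound connections; in practice you might limit it to
# the intersite connections only.
Import-Module ActiveDirectory
$configNC = (Get-ADRootDSE).configurationNamingContext
$settingsDN = "CN=NTDS Settings,CN=DC3,CN=Servers,CN=Seattle-First-Site,CN=Sites,$configNC"
Get-ADObject -SearchBase $settingsDN -SearchScope OneLevel -Filter 'objectClass -eq "nTDSConnection"' |
    Remove-ADObject -Confirm:$false

# Rerun the KCC on that domain controller so that it rebuilds its inbound
# connections, or wait for the next automatic KCC run.
repadmin /kcc dc3.imaginedlands.com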
As with replication, the Active Directory system volume has changed in several important ways since it was first implemented. Understanding these changes can inform the way you deploy and work with Active Directory, and it also can help ensure that outdated guidance isn’t driving configuration decisions.
The Active Directory system volume (Sysvol) contains domain policy, scripts used for logon, logoff, startup, and shutdown, and other related files. The way domain controllers replicate the Sysvol depends on the domain functional level. When a domain is running at the Windows Server 2008 or higher functional level, domain controllers replicate the Sysvol using Distributed File System (DFS), which replaces File Replication Service (FRS) as the preferred replication technology for Active Directory.
IMPORTANT When used with Active Directory, DFS has many advantages over FRS. With DFS, DFS Replication (DFS-R) and Remote Differential Compression (RDC) provide faster replication and compression than the older FRS engine. Operational overhead for managing content and replication also is significantly reduced. Additionally, DFS-R supports automated recovery from database loss or corruption, replication scheduling, and bandwidth throttling. Together, these features make DFS-R significantly more scalable than FRS.
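If a domain originally deployed at a lower functional level is still replicating the Sysvol with FRS after you raise the functional level, the migration to DFS-R is performed with the dfsrmig.exe utility. The following command sequence is a sketch of the documented staged migration; run it on a domain controller and allow each state to propagate before moving on:

# Check the current migration state.
dfsrmig /getglobalstate

# Step through the migration states: Prepared, Redirected, Eliminated.
dfsrmig /setglobalstate 1
dfsrmig /setglobalstate 2
dfsrmig /setglobalstate 3

# Verify that all domain controllers have reached the current global state.
dfsrmig /getmigrationstate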
RDC is the key ingredient that allows enhanced DFS to replicate changes granularly—this is what's being referred to when you read a vague statement that says DFS allows for the granular replication of the Sysvol. RDC enables granular replication by accurately identifying changes within and across files and transmitting only those changes to achieve significant bandwidth savings. More specifically, RDC detects insertions, removals, and rearrangements of data in files, enabling DFS-R to replicate only the changed file blocks when files are updated. Changes within or across files are called file deltas.
In addition to calculating file deltas and transferring only the differences, RDC also can copy any similar file from any client or server to another using data that is common to both computers. This further reduces the amount of the data sent and the overall bandwidth requirements for file transfers. Local differencing techniques are used to transform the old version into a new version. The differences between two versions of the file are calculated on the source domain controller and then sent to the DFS client on the target domain controller.
DFS uses the Active Directory replication topology to replicate files and folders in the Sysvol shared folders on domain controllers. The way this works is that the replication service checks with the KCC to determine the replication topology that has been generated for Active Directory replication. Then it uses this replication topology to replicate Sysvol files to all the domain controllers in a domain.
The storage techniques and replication architectures for DFS and FRS are decidedly different. A conceptual view of how DFS is used with Active Directory on a domain controller follows. The DFS service (Dfssvc.exe) stores information about stand-alone namespaces in the registry and information about domain-based namespaces in Active Directory.
The stand-alone DFS metadata contains information about the root, root target, links, link targets, and configuration settings defined for each stand-alone namespace. This metadata is maintained in the registry of the root server at HKLM\SOFTWARE\Microsoft\Dfs\Roots\Standalone.
Domain-based root servers have a registry entry for each root under HKLM\SOFTWARE\Microsoft\Dfs\Roots\Domain, but these entries do not contain the domain-based DFS metadata. When the DFS service starts on a domain controller, the service checks this path for registry entries that correspond to domain-based roots. If these entries exist, the root server polls the primary domain controller (PDC) emulator master to obtain the DFS metadata for each domain-based namespace and stores the metadata in memory.
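To see which namespace roots a particular server has registered locally, you can list these registry keys, as in this minimal sketch (the keys exist only on servers that actually host DFS roots):

# List locally registered stand-alone and domain-based namespace roots.
Get-ChildItem "HKLM:\SOFTWARE\Microsoft\Dfs\Roots\Standalone" -ErrorAction SilentlyContinue
Get-ChildItem "HKLM:\SOFTWARE\Microsoft\Dfs\Roots\Domain" -ErrorAction SilentlyContinue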
In the Active Directory data store, the DFS object stores the DFS metadata for a domain-based namespace. The DFS object is created in Active Directory when you install a domain at or raise a domain to at least the Windows Server 2008 domain functional level. Active Directory replicates the entire DFS object to all domain controllers in a domain.
DFS uses a client/server architecture. A domain controller hosting a DFS namespace has both the client and server components, allowing the domain controller to perform local lookups in its own data store and remote lookups in data stores on other domain controllers. DFS uses the Common Internet File System (CIFS) for communication between DFS clients, root servers, and domain controllers. CIFS is an extension of the Server Message Block (SMB) file-sharing protocol.
When a domain controller receives a CIFS request, the SMB server driver (Srv.sys) passes the request to the DFS driver (Dfs.sys), and this driver, in turn, directs the request to the DFS service. Dfs.sys also handles the processing of links when they are encountered during file-system access.
When a client requests a referral for a domain-based namespace, the domain controller first checks its domain-based root referral cache. If a cached referral exists, the domain controller uses the cache to create the referral. If it does not, the domain controller locates the DFS object for that namespace and uses the metadata in the object to create the necessary referral. A referral contains a list of Universal Naming Convention (UNC) paths that the client can use. DFS uses LDAP to retrieve metadata about the domain-based namespace from Active Directory and stores this information in its in-memory cache. Various types of in-memory cache are used:
After this information is cached, DFS can provide it to clients that request information about DFS namespaces. The physical structures and caches on a domain controller vary according to the type of namespace the server hosts (domain-based or stand-alone). Each root and link in a namespace has a physical representation on an NTFS volume on each domain controller. The DFS root for Active Directory corresponds to the Sysvol shared folder. If a domain controller hosts additional namespaces, the domain controller will have additional roots and links.
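If the DFS management tools are installed, you can inspect and flush these caches with the dfsutil utility. The exact subcommands vary somewhat between Windows versions, so treat the following as a sketch:

# Display the referral, domain, and provider caches on the local computer.
dfsutil cache referral
dfsutil cache domain
dfsutil cache provider

# Flush the referral cache so that subsequent requests perform fresh lookups.
dfsutil cache referral flush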
Active Directory replication is a multipart process that involves a source domain controller and a destination domain controller. From a high level, replication works much as shown in the next figure.
The step-by-step procedure goes like this:
1. When a user or a system process makes a change to the directory, this change is implemented as an LDAP write to the appropriate directory partition.
2. The source domain controller begins by looking up the IP address of a replication partner. For the initial lookup—or when the destination DNS record has expired—the source domain controller does this by querying the primary DNS server. Subsequent lookups can be done using the local resolver cache.
3. The source and destination domain controllers use Kerberos to mutually authenticate each other.
4. The source domain controller then sends a change notification to the destination domain controller using RPC over IP.
5. The destination domain controller sends a request for the changes using RPC over IP, including information that allows the source domain controller to determine if those changes are needed.
6. Using the information sent by the destination domain controller, the source domain controller determines what changes (if any) need to be sent to the destination domain controller. Then it sends the required changes using RPC over IP.
7. The destination domain controller uses the replication subsystem to write the changes to the directory database.
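You can observe the results of this process with the repadmin utility. For example, the following commands, shown with an illustrative domain controller name, summarize replication health, show one domain controller's inbound replication status, and list its queued inbound replication requests:

# Summarize replication status across all domain controllers.
repadmin /replsummary

# Show inbound replication partners and results for one domain controller.
repadmin /showrepl dc1.imaginedlands.com

# List pending inbound replication requests on that domain controller.
repadmin /queue dc1.imaginedlands.com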
NOTE For intersite replication, two transports are available: RPC over IP and Simple Mail Transfer Protocol (SMTP), which uses TCP port 25. SMTP can be used only for replication between sites, and it cannot be used to replicate the domain directory partition between domain controllers in the same domain.
As you can see from this overview, Active Directory replication depends on the following key services:
- LDAP
- Domain Name System (DNS)
- Kerberos version 5 authentication
- Remote procedure call (RPC)
These Windows services must be functioning properly to allow directory updates to be replicated. Active Directory also uses either FRS or DFS to replicate files in the System Volume (Sysvol) shared folders on domain controllers. The User Datagram Protocol (UDP) and TCP ports used during replication are similar whether FRS or DFS is used. Table 12-1 summarizes the ports that are used.
Table 12-1. Ports Used During Active Directory Replication

SERVICE/COMPONENT                      | UDP PORT | TCP PORT
LDAP                                   | 389      | 389
LDAP over Secure Sockets Layer (SSL)   |          | 636
Global Catalog (LDAP)                  |          | 3268
Global Catalog (LDAP, SSL)             |          | 3269
Kerberos version 5                     | 88       | 88
DNS                                    | 53       | 53
RPC                                    |          | Dynamic
RPC endpoint mapper with DFS           |          | 135
Server Message Block (SMB) over IP     | 445      | 445
SMTP                                   |          | 25
Kerberos Change/Set Password           | 464      | 464
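Any firewalls between domain controllers must allow this traffic. As a quick spot check, you can probe the key TCP ports on a replication partner with Test-NetConnection. The computer name below is an illustrative placeholder, and keep in mind that this tests TCP only, not UDP:

# Probe key replication ports on a partner domain controller (TCP only).
foreach ($port in 389, 445, 135, 88, 53) {
    Test-NetConnection -ComputerName dc2.imaginedlands.com -Port $port |
        Select-Object ComputerName, RemotePort, TcpTestSucceeded
}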
The Active Directory replication model is designed to ensure that there is no single point of failure. In this model, every domain controller can accept changes to the database and replicate those changes to all other domain controllers. Replication within a site follows a specific model that is very different from the replication model used between sites.
With intrasite replication, the focus is on ensuring that changes are rapidly distributed. Intrasite replication traffic is not compressed, and replication is designed so that changes are replicated almost immediately after a change has been made. The main component in Active Directory responsible for the replication structure is the KCC. One of the main responsibilities of the KCC is to generate the replication topology—that is, the way replication is implemented.
As domain controllers are added to a site, the KCC configures a ring topology for intrasite replication with pull replication partners. Why use this model? For the following reasons:
The KCC uses this model to create a replication ring. As domain controllers are added to a site, the size and configuration of this ring change. When there are at least three domain controllers in a site, each domain controller is configured with at least two incoming replication connections. As the number of domain controllers changes, the KCC updates the replication topology.
When a domain controller is updated, it waits approximately 15 seconds before initiating replication. This short wait is implemented in case additional changes are made. The domain controller on which the change is made notifies one of its partners, using an RPC, and specifies that changes are available. The partner can then pull the changes. After replication with this partner completes, the domain controller waits approximately three seconds and then notifies its second partner of changes. The second partner can then pull the changes. Meanwhile, the first partner is notifying its partners of changes as appropriate. This process continues until all the domain controllers have been updated.
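Both delays are controlled by documented registry values on each domain controller; if the values are absent, the defaults of 15 seconds and 3 seconds apply. A minimal sketch for checking whether they have been set:

# Read the intrasite notification delays, if they have been explicitly set.
# Absent values mean the defaults (15 seconds and 3 seconds) are in effect.
Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\NTDS\Parameters" |
    Select-Object "Replicator notify pause after modify (secs)",
                  "Replicator notify pause between DSAs (secs)"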
The 15-second delay for replication applies to all current implementations of Active Directory. However, the delay is overridden to allow for the immediate replication of priority changes. Priority (urgent) replication is triggered if you perform one of the following actions:
- Lock out an account, or an account is locked out automatically after failed logon attempts
- Change the account lockout policy or the domain password policy
- Change the password on a domain controller computer account
- Change the relative ID (RID) master role owner
Urgent replication means that there is no delay to initiate replication. Note that all other changes to user and computer passwords are handled by the designated PDC emulator in a domain. When a user changes a normal user or computer password, the domain controller to which that user is connected immediately sends the change to the PDC emulator. This way the PDC emulator always has the latest password for a user. This is why the PDC emulator is checked for a new password if a logon fails initially. After the new password is updated on the PDC emulator, the PDC emulator replicates the change using normal replication. The only exception is when a domain controller contacts the PDC emulator requesting a password for a user. In this case the PDC emulator immediately replicates the current password to the requesting domain controller so that no additional requests are made for that password.
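To see which domain controller currently holds the PDC emulator role in your domain, you can query Active Directory, as in this brief example:

# Identify the PDC emulator for the current domain.
(Get-ADDomain).PDCEmulator

# Alternatively, list all the operations master role holders.
netdom query fsmo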
The next example shows a ring topology that a KCC would construct if there were three domain controllers in a site. Here, replication is set up as follows:
If you make changes to DC1, DC1 notifies DC2 of the changes. DC2 then pulls the changes. After replication completes, DC1 notifies DC3 of the changes. DC3 then pulls the changes. Because all domain controllers in the site have now been notified, no additional replication occurs. However, DC2 still notifies DC3 that changes are available. DC3 does not pull the changes, however, because it already has them.
Domain controllers track directory changes using update sequence numbers (USNs). Any time a change is made to the directory, the domain controller assigns the change a USN. Each domain controller maintains its own local USN and increments its value each time a change occurs. The domain controller also assigns the local USN to the object attribute that changed. Each object has a related attribute called uSNChanged, which is stored with the object and identifies the highest USN that has been assigned to any of the object's attributes.
To see how this works, consider the following example. The local USN for DC1 is 125. An administrator connected to DC1 changes the password on a user's account. DC1 registers the change as local USN 126, and this value is written to the uSNChanged attribute of the user object. If the administrator next edits a group account and changes its description, DC1 registers the change as local USN 127, and this value is written to the uSNChanged attribute of the group object.
NOTE With replication there is sometimes a concern that replication changes from one domain controller might overwrite similar changes made to another domain controller. However, because object changes are tracked on a per-attribute basis, this rarely happens. It’s very unlikely that two administrators would change the exact same attributes of an object at the exact same time. By tracking changes on a per-attribute basis, Active Directory effectively minimizes the possibility of any conflict.
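You can inspect this per-attribute tracking directly. The repadmin /showobjmeta command lists the version, originating domain controller, and USN for each attribute of an object. The distinguished name below is an illustrative placeholder:

# Show per-attribute replication metadata for an object.
repadmin /showobjmeta dc1.imaginedlands.com "CN=John Smith,OU=Engineering,DC=imaginedlands,DC=com"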
Each domain controller tracks not only its local USN, but also the local USNs of other domain controllers, in a table referred to as an up-to-dateness vector. During the replication process, a domain controller that is requesting changes includes its up-to-dateness vector. The receiving domain controller can then compare the USN values to those it has stored. If the current USN value for a particular domain controller is higher than the stored value, changes associated with that domain controller need to be replicated. If the current value for a particular domain controller is the same as the stored value, changes for that domain controller do not need to be replicated.
Because only necessary changes are replicated, this process of comparing up-to-dateness vectors ensures that replication is very efficient and that changes propagate only when necessary. The up-to-dateness vectors are, in fact, the mechanism that enables domain controllers with redundant connections to know that they’ve already received the necessary updates.
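You can display a domain controller's up-to-dateness vector for a given directory partition with repadmin, as in the following example with illustrative names:

# Show the up-to-dateness vector for the domain partition on DC1.
repadmin /showutdvec dc1.imaginedlands.com "DC=imaginedlands,DC=com"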
IMPORTANT Several types of replication changes have priority. If you make changes to object attributes in the schema, these changes take precedence over most other changes. In this case, Active Directory blocks the replication of normal changes and replicates the schema changes. Active Directory continues to replicate schema changes until the schema configuration is synchronized on all domain controllers in the forest. This ensures that schema changes are applied rapidly. Still, it’s a good idea to make changes to the schema during off-hours because schema changes need to propagate throughout the forest before other changes, such as resetting passwords, can be made to Active Directory.
While intrasite replication focuses on speed, intersite replication focuses on efficiency. The primary goal of intersite replication is to transfer replication information between sites while making the most efficient use of the available resources. With efficiency as a goal, intersite replication traffic uses designated bridgehead servers and a default configuration that is scheduled rather than automatic and compressed rather than uncompressed:
As discussed previously, there are two key ways to change intersite replication:
Regardless of the site-link configuration, replication traffic is sent through dedicated bridgehead servers rather than through multiple replication partners. When changes are made to the directory in one site, those changes replicate to the other site through the designated bridgehead servers. The bridgehead servers then initiate the replication of the changes exactly as was discussed earlier in this chapter, except that the servers can use SMTP instead of RPC over IP if you use SMTP as a transport. Thus, intersite replication is really concerned with getting changes from one site to another across a site link.
The next example shows intersite replication using a single dedicated bridgehead server on each side of a site link. In this example, DC3 is the designated bridgehead server for Site 1 and DC4 is the designated bridgehead server for Site 2.
As the figure shows, replication is set up as follows:
If changes are made to DC1 in Site 1, DC1 notifies DC2 of the changes. DC2 then pulls the changes. After replication completes, DC1 notifies DC3 of the changes. DC3 then pulls the changes. Because all domain controllers in Site 1 have now been notified, no additional replication occurs within the site. However, DC2 still notifies DC3 that changes are available. DC3 does not pull the changes, however, because it already has them.
According to the site-link configuration between Site 1 and Site 2, DC3 notifies DC4 that changes are available. DC4 then pulls the changes. Next DC4 notifies DC5 of the changes. DC5 then pulls the changes. After replication completes, DC4 notifies DC6 of the changes. DC6 then pulls the changes. Because all domain controllers in Site 2 have now been notified, no additional replication occurs. However, DC5 still notifies DC6 that changes are available. DC6 does not pull the changes, however, because it already has the changes.
So far, I've talked about designated bridgehead servers but haven't said how bridgehead servers are designated. That's because it's a rather involved process. When you set up a site, the KCC on a domain controller that Active Directory has designated as the intersite topology generator (ISTG) is responsible for generating the intersite topology. Each site has only one ISTG, and its job is to determine the best way to configure replication between sites.
The ISTG does this by identifying the bridgehead servers that are to be used. Replication between sites is always sent from a bridgehead server in one site to a bridgehead server in another site. This ensures that information is replicated only once between sites. As domain controllers are added and removed from sites, the ISTG regenerates the topology automatically.
The ISTG also creates the connection objects that are needed to connect bridgehead servers on either side of a site link. This is how Active Directory logically represents a site link. The ISTG continuously monitors connections and will create new connections when a domain controller acting as a designated bridgehead server is no longer available. In most cases there will be more than one designated bridgehead server, and I’ll discuss why in the next section, “Replication Rings and Directory Partitions.”
NOTE You can manually configure intersite replication in several ways. In addition to using the techniques discussed previously for scheduling, notification, and compression, you can also configure site link costs, configure connection objects manually, and designate preferred bridgehead servers.
The KCC is responsible for generating the intrasite replication topology, and the ISTG uses the KCC to generate the intersite replication topology. The KCC always configures the replication topology so that each domain controller in a site has at least two incoming connections if possible, as already discussed. The KCC also always configures intrasite replication so that each domain controller is no more than three hops from any other domain controller. This also means that maximum replication latency, the delay in replicating a change across an entire site, is approximately 45 seconds for normal replication.
When there are two domain controllers in a site, each domain controller is the replication partner of the other. When there are between three and seven domain controllers in a site, each domain controller has two incoming connections and two replication partners. The figure that follows shows the replication topology for TV Press's Sacramento campus. Here, the network is spread over two buildings that are connected with high-speed interconnects. Because the buildings are connected over redundant high-speed links, the organization uses a single site with three domain controllers in each building. The replication topology for the six domain controllers as shown ensures that no domain controller is more than three hops from any other domain controller.
When the number of domain controllers increases beyond seven, additional connection objects are added to ensure that no domain controller is more than three hops from any other domain controller in the replication topology. To see an example of this, consider the next figure.
Here, TV Press has built a third building that connects its original buildings to form a U-shaped office complex. The administrators have placed two new domain controllers in Building 3. As a result of adding the additional domain controllers, some domain controllers now have three replication partners.
At this point you might be wondering what role, if any, directory partitions play in the replication topology. After all, from previous discussions you know that Active Directory has multiple directory partitions and that those partitions are replicated in the following ways:
- Domain directory partitions are replicated to all domain controllers in a particular domain.
- The configuration and schema directory partitions are replicated to all domain controllers in the forest.
- Application directory partitions are replicated only to the domain controllers that host those partitions.
- Partial replicas used to build the global catalog are replicated to all global catalog servers in the forest.
In previous discussions I didn’t want to complicate things by adding a discussion of partition replication. From a logical perspective, partitions do play an important role in replication. Replication rings, the logical implementation of replication, are based on the types of directory partitions that are available. The KCC generates a replication ring for each kind of directory partition, each of which has specific replication partners. Keep the following in mind:
Replication rings are implemented on a per-directory partition basis. There is one replication ring per directory partition type, and some rings include all the domain controllers in a forest, all the domain controllers in a domain, or only those domain controllers using application partitions.
When replication rings are within a site, the KCC on each domain controller is responsible for generating the replication topology and keeping it consistent. When replication rings go across site boundaries, the ISTG is responsible for generating the replication topology and keeping it consistent. Because replication rings are merely a logical representation of replication, the actual implementation of replication rings is expressed in the replication topology by using connection objects. Whether you’re talking about intrasite or intersite replication, there is one connection object for each incoming connection. The KCC and the ISTG do not create additional connection objects for each replication ring. Instead, they reuse connection objects for as many replication rings as possible.
When you extend the reuse of connection objects to the way intersite replication is performed, the following is how multiple bridgehead servers might be designated.
Here, the domain, schema, and configuration partitions replicate from Site 1 to Site 2 and vice versa using the connection objects between DC3 and DC5. A special application partition is replicated from Site 1 to Site 2 and vice versa using the connection objects between DC2 and DC6.
Typically, each site has a designated bridgehead server for replicating the domain, schema, and configuration directory partitions. Other types of directory partitions might be replicated between sites by domain controllers that host these partitions. For example, if two sites have multiple domain controllers and only a few have application partitions, a connection object might be created for the intersite replication of the application partition.
The global catalog partition is a special exception. The global catalog is built from all the domain databases in a forest. Each designated global catalog server in a forest must get global catalog information from the domain controllers in all the domains of the forest. This means that a global catalog server must connect to a domain controller in every domain and there must be an associated connection object to do this. Because of this, global catalog servers are another reason for having more than one designated bridgehead server per site.
The previous figure is an example of how replication might work for a more complex environment that includes domain, configuration, and schema partitions as well as DNS and global catalog partitions. Here, the domain, schema, and configuration partitions replicate from Site 1 to Site 2 and vice versa using the connection objects between DC3 and DC5. The connection objects between DC1 and DC4 replicate the global catalog partition from Site 1 to Site 2 and vice versa. In addition, the connection objects between DC2 and DC6 replicate the DNS partitions from Site 1 to Site 2 and vice versa.
Site design depends on your organization’s networking infrastructure. As you set out to implement an initial site design, you should start by mapping your organization’s existing network topology. Any time you plan to revise your network infrastructure, you must also plan the necessary revisions to your existing site design.
Although site design is relatively independent of domain structure, the replication topology depends on the availability and configuration of domain controllers. The KCC running on each domain controller monitors domain controller availability and configuration, and it updates the replication topology as changes occur. The ISTG performs similar monitoring to determine the best way to configure intersite replication. This means that as you implement or change the domain controller configuration, you might change the replication topology.
To develop a site design, you should start by mapping your existing network architecture. Be sure to include all the business locations in the organization that are part of the forest or forests for which you are developing a site plan. Document the subnets on each network segment and the connection speed on the links connecting each network segment. Keep the following in mind:
Because site design and network infrastructure are so closely linked, you’ll want to work closely with your organization’s network administrators. If you wear both hats, start mapping the network architecture by listing each network location, the subnets at that location, and the links that connect the location. For an organization with its headquarters in Chicago and four regional offices—in Seattle, New York, Los Angeles (LA), and Miami—this information might come together as follows:
NOTE Keep in mind that network link speeds generally are measured in megabits per second (Mbps) or kilobits per second (Kbps), whereas file sizes and transfer rates often are reported in megabytes per second (MBps) or kilobytes per second (KBps). The difference between bytes and bits is an important one, as a factor of 8 separates the two. An Internet speed of 100 Mbps is 12.5 MBps.
Notice that I start with the regional offices and work my way to the central office. This way, when I finally make the central office entry, the multiple connections to the central office are all accounted for.
I then use the table to create a diagram similar to the one shown in the next figure, where I depict each network and the connections between them. I also note the subnets at each location. Although it's helpful to know the number of users and computers at each location, this information alone isn't enough to help you determine how the links connecting sites are used. The only certain way to know that is to monitor the network traffic going over the various links.
After you map the network structure, you are ready to create a site design. Creating a site design involves the following steps:
1. Mapping the network structure to site structure
2. Designing each individual site
3. Designing the intersite replication topology
4. Considering the impact of site-link bridging
5. Planning the placement of servers in sites
Each of these steps is examined in the sections that follow.
To map the network structure to the site structure, start by examining each network location and the speed of the connections between those locations. In general, if you want to make separate network locations part of the same site, the locations should be connected by links with at least 512 Kbps of available bandwidth. If the locations are in separate geographic areas, I also recommend that they have redundant links for fault tolerance.
These recommended speeds are for replication traffic only, not for other user traffic. Smaller organizations with fewer than 100 users at branch locations might be able to scale down to dedicated 128-Kbps or 256-Kbps links. Larger organizations with 250 or more users at branch locations might need to scale up. In both scenarios, keep in mind that the cost of a dedicated T1 (1.544 Mbps) has dropped substantially in the past few years. For example, in many US markets, the cost of a dedicated T1 is about $200 a month.
Following the previous example, the Chicago-based company would probably be best served by having separate sites at each network location. With this in mind, the site-to-network mapping is as shown in the next figure.
By creating the additional sites at the other network locations, you help control replication over the slow links, which can significantly improve the performance of Active Directory. More good news is that sites are relatively low-maintenance once you configure them, so you get a significant benefit without a lot of additional administration overhead.
After you determine how many sites you will have, you next need to consider the design of each site. A key part of the site design has to do with naming the sites and identifying the subnets that are associated with each site. Site names should reflect the physical location of the site. The default site created by Active Directory is Default-First-Site-Name, and most site names should follow a similar naming scheme. Continuing the example, you might use the following site names:
- Chicago-First-Site
- Seattle-First-Site
- NewYork-First-Site
- LA-First-Site
- Miami-First-Site
I used dashes instead of spaces, following the style Active Directory uses for the default first site. I named the sites City-First-Site rather than City-Site to allow for easy revision of the site architecture to include additional sites at each location. Now, if a location receives additional sites, the naming convention is very clear, and it’s also very clear that if you have a Seattle-First-Site, a Seattle-Second-Site, and a Seattle-Third-Site, these are all different sites at the Seattle location.
To determine the subnets that you should associate with each site, use the network diagram developed in the previous section. It already has a list of the subnets. In your site documentation, simply note the IP subnet associations that are needed and update your site diagram to include the subnets.
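On Windows Server 2012 and later, you can create sites and subnet associations with the Active Directory module for Windows PowerShell. A minimal sketch, using illustrative names and address ranges:

# Create a site and associate a subnet with it (names are illustrative).
Import-Module ActiveDirectory
New-ADReplicationSite -Name "Seattle-First-Site"
New-ADReplicationSubnet -Name "192.168.10.0/24" -Site "Seattle-First-Site"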
After you name the sites and determine subnet associations, you should design the intersite replication topology. You do this by planning the details of replication over each link designated in the initial site diagram. For each site link, plan the following components:
- The replication schedule, which determines when replication can occur
- The replication interval, which determines how often replication occurs within the schedule
- The link cost, which determines the relative priority of the link
Typically, you want replication to occur at least every 180 minutes, 24 hours a day, 7 days a week. This is the default replication schedule. If you have limited bandwidth, you might need to alter the schedule to allow user traffic to have priority during peak usage times. If bandwidth isn’t a concern or if you have strong concerns about keeping branch locations up to date, you might want to increase the replication frequency. In all cases, you should, if possible, monitor any existing links to get a sense of the bandwidth utilization and the peak usage periods.
Calculating the link cost can be a bit complicated. When there are multiple links between locations, you need to think carefully about the appropriate cost of each link. Even if there is only one link between all your sites now, you should set an appropriate link cost now to ensure that if links are added between locations, all the links are used in the most efficient way possible.
Valid link costs range from 1, which assigns the highest possible preference to a link, to 99999, which assigns the lowest possible preference to a link. When you create a new link, the default link cost is set to 100. If you set all the links to this cost, all the links have equal preference for replication. But would you really want replication to go over a 128-Kbps link when you have a 512-Kbps link to the same location? Probably not.
In most cases the best way to set the link cost is to assign a cost based on the available network bandwidth over a link. Table 12-2 provides an example of how this could be done.
Table 12-2. Establishing Link Cost for Replication
AVAILABLE BANDWIDTH                        | LINK COST | PREFERENCE
10 gigabits per second (Gbps) to 2 Gbps    | 1         | Top
2 Gbps to 1 Gbps                           | 2         | Extremely high
1 Gbps to 512 megabits per second (Mbps)   | 4         | Very high
512 Mbps to 256 Mbps                       | 10        | High
256 Mbps to 100 Mbps                       | 20        | Moderately high
100 Mbps to 10 Mbps                        | 40        | Above normal
10 Mbps to 1.544 Mbps                      | 100       | Normal
1.544 Mbps to 512 Kbps                     | 200       | Below normal
512 Kbps to 256 Kbps                       | 400       | Moderately low
256 Kbps to 128 Kbps                       | 800       | Low
128 Kbps or less                           | 1600      | Very low
You can use the costs in the table to assign costs to each link you identified in your site diagram. After you do this, update your site diagram so that you can determine the route that is used for replication if all the links are working. Your site diagram should now show the names of the sites, the associated subnets, and the cost of each link. An example follows.
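You can create a site link and set its cost and replication interval in a single step with the Active Directory module. Here's a brief sketch using illustrative names and a cost drawn from Table 12-2:

# Create an IP site link between two sites with a cost of 200 and
# the default 180-minute replication interval.
New-ADReplicationSiteLink -Name "Chicago-Seattle" `
    -SitesIncluded "Chicago-First-Site","Seattle-First-Site" `
    -Cost 200 -ReplicationFrequencyInMinutes 180 `
    -InterSiteTransportProtocol IP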
By default, Active Directory automatically configures site-link bridges, which makes links transitive between sites in much the same way that trusts are transitive between domains. When site links are bridged, any two domain controllers can make a connection across any consecutive series of links. The site-link-bridge cost is the sum of the costs of all the links included in the bridge. Let's calculate the site-link-bridge costs using the links shown in the previous figure. Because of site-link bridges, the domain controllers at the Chicago headquarters have two possible routes for replication to each of the branch office locations. The costs of these routes are as follows:
Knowing the costs of links and link bridges, you can calculate the effects of a network link failure. In this example, if the primary link between Chicago and Seattle went down, replication would occur over the Chicago–LA–Seattle site-link bridge. In this example it’s relatively straightforward, but if you introduce additional links between network locations, the scenarios become very complicated very quickly.
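To make the arithmetic concrete, suppose the direct Chicago–Seattle link has a cost of 200, the Chicago–LA link a cost of 100, and the LA–Seattle link a cost of 200; these numbers are purely illustrative. The Chicago–LA–Seattle site-link bridge would then cost 100 + 200 = 300. Because 300 is higher than the direct link's cost of 200, replication prefers the direct link and falls back to the bridge only when the direct link is unavailable.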
The network topology used in the previous example is referred to as a hub-and-spoke design. The headquarters in Chicago is the hub, and the rest of the offices are spokes. Automatic site-link bridging works well with a hub-and-spoke design. It doesn’t work so well when you have multiple hubs. Consider the example shown in the next figure. In this example, Chicago is the main hub, but because Seattle and LA have a spoke, they are also considered hubs.
Site-link bridging can have unintended consequences when you have multiple hubs and spokes on each hub. Here, when the bridgehead servers in the Chicago site replicate with other sites, they replicate with Seattle, New York, LA, and Miami bridgehead servers as before, but they also replicate with the Vancouver and San Diego bridgehead servers across the site bridge from Chicago–Seattle–Vancouver and from Chicago–LA–San Diego. This means that the same replication traffic could go over the Chicago–Seattle and the Chicago–LA links twice. This can happen because of the rule of three hops for optimizing replication topology.
The repeat replication over the hub links becomes worse as you add spokes. Consider the next example. Here, the LA hub has connections to sites in Sacramento, San Diego, and San Francisco. As a result of site-link bridging, the same replication traffic could go over the Chicago–LA links four times. This happens because of the rule of three hops for optimizing replication topology.
The solution to the problem of repeat replication traffic is to disable automatic site bridging. Unfortunately, the automatic bridging configuration is all or nothing. This means that if you disable automatic site-link bridging and still want to bridge some site links, you must configure those bridges manually. You can enable, disable, and manually configure site-link bridges as discussed in the section entitled “Configuring Site-Link Bridges” in Chapter 13.
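If you disable automatic bridging and need specific bridges, you can create them with the Active Directory module, as in this sketch with illustrative site-link names:

# Create a manual site-link bridge over two existing site links.
New-ADReplicationSiteLinkBridge -Name "Chicago-LA-SanDiego" `
    -SiteLinksIncluded "Chicago-LA","LA-SanDiego"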
When you finish configuring site links, you should plan the placement of servers in the sites. Think about which types of domain controllers and how many of each will be located in a site. Answer the following questions:
Think about which Active Directory partitions will be replicated between the sites as a result of the domain controller placement. Also think about any additional partitions that might need to be replicated to a site. Answer the following questions:
By answering all these questions, you know what servers will be placed in each site, as well as what information will be replicated between sites. Don’t forget about dependent services for Active Directory. At a minimum, each site should have at least one domain controller, a global catalog, and DNS. This configuration allows intrasite replication to occur without having to go across site links for dependent services. To improve the user experience, keep the following in mind: