Centralized systems

A centralized version control system is based on a single server that holds the files and lets people check in and check out the changes made to those files. The principle is quite simple: everyone can get a copy of the files on their own system and work on them. From there, every user can commit their changes to the server. The changes are applied and the revision number is incremented. The other users can then get those changes by synchronizing their copy of the repository through an update.

As the following diagram shows, a repository evolves through all the commits, and the system archives all revisions into a database in order to be able to undo any change, or provide information on what has been done and by whom:

Figure 1
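The idea of archiving every revision can be illustrated with a minimal Python sketch. It is only a toy model: the names `Revision`, `RevisionArchive`, `undo`, and `who_changed` are hypothetical, and it stores a full snapshot per commit for simplicity, whereas real systems typically store differences between revisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    number: int
    author: str
    message: str
    snapshot: dict  # path -> content of the whole tree at this revision


class RevisionArchive:
    """Toy archive that keeps one full snapshot of the files per commit."""

    def __init__(self):
        self.revisions = [Revision(0, "system", "empty repository", {})]

    def _append(self, author, message, snapshot):
        revision = Revision(len(self.revisions), author, message, snapshot)
        self.revisions.append(revision)
        return revision.number

    def commit(self, author, message, changes):
        """Apply changes (path -> new content) on top of the latest snapshot."""
        snapshot = {**self.revisions[-1].snapshot, **changes}
        return self._append(author, message, snapshot)

    def undo(self, number, author):
        """Undo everything after revision `number` by re-committing its snapshot."""
        snapshot = dict(self.revisions[number].snapshot)
        return self._append(author, f"revert to revision {number}", snapshot)

    def who_changed(self, path):
        """Return (revision, author) pairs for the commits that altered path."""
        pairs = zip(self.revisions, self.revisions[1:])
        return [(cur.number, cur.author) for prev, cur in pairs
                if prev.snapshot.get(path) != cur.snapshot.get(path)]


archive = RevisionArchive()
archive.commit("joe", "add README", {"README": "v1"})
archive.commit("pamela", "edit README", {"README": "v2"})
archive.undo(1, "joe")  # history is kept; the undo is itself a new revision
```

Note that the undo does not erase anything: the archive only grows, which is what makes it possible to answer later what was done and by whom.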

Every user in this centralized configuration is responsible for synchronizing their local copy with the main repository in order to get the other users' changes. This means that conflicts can occur when a locally modified file has also been changed and checked in by someone else. In this case, conflict resolution is carried out on the user's system, as shown in the following diagram:

Figure 2

The following steps will help you to understand this process better:

Joe checks in a change.
Pamela attempts to check in a change on the same file.
The server complains that her copy of the file is out of date.
Pamela updates her local copy. The version control software may or may not be able to merge the two versions seamlessly (that is, without a conflict).
Pamela commits a new version that contains the latest changes made by Joe and her own.
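The steps above can be mimicked with a small Python simulation. This is a sketch under simplifying assumptions, not how any real tool is implemented: `CentralServer`, `WorkingCopy`, and `OutOfDateError` are hypothetical names, files are plain strings, and the merge is a function supplied by the caller.

```python
class OutOfDateError(Exception):
    """Raised when a check-in is based on a stale copy of a file."""


class CentralServer:
    """Toy central repository: one revision counter shared by all files."""

    def __init__(self):
        self.revision = 0
        self.files = {}         # path -> latest content
        self.last_changed = {}  # path -> revision of the latest change
        self.log = []           # (revision, author, path) history entries

    def check_in(self, author, path, content, base_revision):
        """Accept a change only if it was made on top of the latest version."""
        if self.last_changed.get(path, 0) > base_revision:
            raise OutOfDateError(f"{path} is out of date")
        self.revision += 1
        self.files[path] = content
        self.last_changed[path] = self.revision
        self.log.append((self.revision, author, path))
        return self.revision


class WorkingCopy:
    """A user's local copy, remembering which revision each file came from."""

    def __init__(self, server, author):
        self.server = server
        self.author = author
        self.files = {}     # path -> local (possibly edited) content
        self.pristine = {}  # path -> content as last fetched from the server
        self.base = {}      # path -> revision the local copy is based on
        self.update()

    def update(self, merge=None):
        """Pull server changes; call merge(local, remote) for edited files."""
        for path, remote in self.server.files.items():
            local = self.files.get(path)
            edited = path in self.pristine and local != self.pristine[path]
            changed = self.server.last_changed[path] > self.base.get(path, 0)
            if edited and changed:
                if merge is None:
                    raise ValueError(f"conflict in {path}: merge needed")
                self.files[path] = merge(local, remote)
            elif changed:
                self.files[path] = remote
            self.pristine[path] = remote
            self.base[path] = self.server.last_changed[path]

    def check_in(self, path):
        """Send the local content of path to the server."""
        revision = self.server.check_in(
            self.author, path, self.files[path], self.base.get(path, 0))
        self.pristine[path] = self.files[path]
        self.base[path] = revision
        return revision


server = CentralServer()
admin = WorkingCopy(server, "admin")
admin.files["README"] = "hello\n"
admin.check_in("README")                    # initial content of the file

joe = WorkingCopy(server, "joe")
pamela = WorkingCopy(server, "pamela")

joe.files["README"] += "joe\n"
joe.check_in("README")                      # Joe checks in a change
pamela.files["README"] += "pamela\n"
try:
    pamela.check_in("README")               # Pamela attempts a check-in
    rejected = False
except OutOfDateError:
    rejected = True                         # the server complains
# Pamela updates; this hand-written merge keeps Joe's line and adds hers
pamela.update(merge=lambda local, remote: remote + "pamela\n")
final_revision = pamela.check_in("README")  # the merged version goes in
```

The important design point is that the server never merges anything itself. It only refuses stale check-ins, and the merge happens in the working copy of the user who arrived second.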

This process works perfectly well on small projects that involve a few developers and a small number of files, but it becomes problematic for bigger ones. For instance, a complex change touches a lot of files and takes a long time, and keeping everything local until the whole work is done is not feasible. The following are some problems with such an approach:

The user may keep changes private for a long time without a proper backup
It is hard to share work with others until it is checked in, and checking it in before it is fully done would leave the repository in an unstable state, which the other users would not want

A centralized VCS can address these problems by providing branches and merges. It is possible to fork from the main stream of revisions to work on a separate line, and then to merge back into the main stream.

In the following diagram, Joe starts a new branch from revision 2 to work on a new feature. The revisions are incremented in the main stream and in his branch, every time a change is checked in. At revision 7, Joe has finished his work and committed his changes into the trunk (the main branch). This process often requires some conflict resolution.
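The revision numbering described here can be reproduced with a short Python sketch. It assumes a single counter shared by the trunk and all branches, and it is deliberately simplified: the `Repository` class and the branch name `joe-feature` are hypothetical, creating a branch does not consume a revision number, and the merge performs no conflict resolution. The exact order of commits between revisions 2 and 7 is also an assumption made for the example.

```python
class Repository:
    """Toy repository with one revision counter shared by all branches."""

    def __init__(self):
        self.revision = 0
        self.history = {"trunk": []}  # branch name -> [(revision, note), ...]
        self.origin = {}              # branch name -> revision it started from

    def commit(self, branch, note):
        """Record a change on a branch and raise the shared counter."""
        self.revision += 1
        self.history[branch].append((self.revision, note))
        return self.revision

    def branch(self, name, source="trunk"):
        """Start a separate line of work from the latest revision of source."""
        commits = self.history[source]
        self.origin[name] = commits[-1][0] if commits else 0
        self.history[name] = []

    def merge(self, source, target="trunk"):
        """Bring the work of source into target as one new revision."""
        notes = ", ".join(note for _, note in self.history[source])
        return self.commit(target, f"merge {source} ({notes})")


repo = Repository()
repo.commit("trunk", "initial import")          # revision 1
repo.commit("trunk", "fix typo")                # revision 2
repo.branch("joe-feature")                      # Joe forks from revision 2
repo.commit("joe-feature", "start feature")     # revision 3
repo.commit("trunk", "unrelated fix")           # revision 4
repo.commit("joe-feature", "finish feature")    # revision 5
repo.commit("trunk", "another fix")             # revision 6
merged_at = repo.merge("joe-feature")           # revision 7, back on trunk
```

Because the counter is shared, the revisions seen on the trunk have gaps (1, 2, 4, 6, 7): the missing numbers belong to commits made on the branch.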

Despite these advantages, centralized VCSes have the following pitfalls:

Branching and merging are quite hard to deal with. They can become a nightmare.
Since the system is centralized, it is impossible to commit changes offline. This can lead to a single, huge commit to the server when the user gets back online.
Lastly, it doesn't work very well for projects such as Linux, where many companies permanently maintain their own branch of the software and there is no central repository that everyone has an account on.

Some tools, such as SVK, make it possible to work offline, but a more fundamental problem is how the centralized VCS works:

Figure 3

Despite these pitfalls, centralized VCSes are still quite popular among many companies, mainly due to the inertia of corporate environments. The main examples of centralized VCSes used by many organizations are Subversion (SVN) and Concurrent Versions System (CVS). The obvious issues with a centralized architecture for version control systems are the reason why most open source communities have already switched to the more reliable architecture of Distributed VCS (DVCS).