Chapter 8. Collaborating with Git and GitHub

The following topics will be covered in this chapter:

Data analytics projects with R can become very complex, especially when we work on an analysis over a long period of time. To keep track of our changes and our progress, it is important to use a version control system. The best known of these systems is Git. It lets us record every change we make to our code, together with a short message describing it. This is very helpful when we collaborate with other people, and when others have to read our code later on and understand the steps of its development. Git describes itself as a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. You can read more about it at https://git-scm.com/.

Git repositories can be created on servers or on our local machine. Their distributed nature lets us sync commits with other machines. An alternative to hosting a repository ourselves is to use a hosted version control service. To do this, we can create an account on a platform such as GitHub, Bitbucket, or GitLab. Most of them offer free accounts. On GitHub, for example, everybody can create public repositories with an unlimited number of collaborators. Public means that everybody can see our code on the website and use it. At the time of writing, if we want private repositories, we have to buy a plan.
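To make the idea of a local repository more concrete, here is a minimal sketch of creating one on our own machine and recording a first commit. It assumes Git is installed and available on the command line. The file name analysis.R, the user name, the email address, and the commit message are placeholders chosen for illustration only.

```shell
# Work in a throwaway directory so that no existing project is touched.
repo="$(mktemp -d)"
cd "$repo"

# Create an empty local repository (this adds a hidden .git folder).
git init --quiet

# Set an identity for this repository only; both values are placeholders.
git config user.name "Example User"
git config user.email "user@example.com"

# Add a small R script and record it as the first commit.
echo 'summary(cars)' > analysis.R
git add analysis.R
git commit --quiet -m "Add first analysis script"

# Show the recorded history, one line per commit.
git log --oneline
```

Everything here happens locally. To sync these commits with a hosted service, we would typically register the hosted repository with git remote add, using the URL shown by the platform, and then send the commits there with git push.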

Git comes with its own terminology. We do not have to know all of it, but we should take a look at the fundamental elements of this version control system.