Using jitter to distinguish closely packed data points

Sometimes, when working with large datasets, we might find that a lot of data points on a scatter plot overlap each other. In this recipe, we will learn how to distinguish between closely packed data points by adding a small amount of noise with the jitter() function.

Getting ready

All you need for this recipe is to type it at the R prompt, as we will use only base library functions. You can also save the recipe code as a script so that you can use it again later on.

How to do it...

First, let's create a graph that has a lot of overlapping points:

x <- rbinom(1000, 10, 0.25)
y <- rbinom(1000, 10, 0.25)
plot(x,y)

Now, let's add some noise to the data points to reveal the ones that overlap:

plot(jitter(x), jitter(y))
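The amount of noise can also be tuned. By default, jitter() shifts each value by a uniform random amount of up to one-fifth of the smallest gap between distinct values, which is 0.2 for integer data like ours. The following is a minimal sketch of setting the shift explicitly through the amount argument; the seed and the value 0.3 are arbitrary choices for illustration and are not part of the original recipe:

```r
# Reproducible version of the data used above; the seed is an arbitrary choice
set.seed(42)
x <- rbinom(1000, 10, 0.25)
y <- rbinom(1000, 10, 0.25)

# amount sets the maximum shift: noise is drawn uniformly from [-0.3, 0.3]
xj <- jitter(x, amount = 0.3)
yj <- jitter(y, amount = 0.3)

# Clusters stay separate as long as amount is below 0.5 (half the integer spacing)
plot(xj, yj)
```

A larger amount spreads each cluster out more, but once it reaches half the spacing between values, neighboring clusters start to blend into each other.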

How it works...

In the first graph, we plotted 1,000 random data points generated with the rbinom() function. However, only a few markers are visible because rbinom() returns integers, so many data points fall on exactly the same location and are drawn on top of each other. When we then plotted the points after applying the jitter() function to both the x and y values, each point was shifted by a small random amount, and we could see many more of the thousand points. The jittered graph also shows where the data is densest: most points have x and y values between about 1 and 4, around the binomial mean of 10 × 0.25 = 2.5.
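To check the overlap numerically rather than by eye, we can cross-tabulate the x and y values. This is a minimal sketch that regenerates the data with an arbitrary seed (the seed and the variable name counts are illustrative and are not part of the original recipe):

```r
# Reproducible version of the data used above; the seed is an arbitrary choice
set.seed(42)
x <- rbinom(1000, 10, 0.25)
y <- rbinom(1000, 10, 0.25)

# Each non-zero cell of the table is one visible marker in plot(x, y)
counts <- table(x, y)

# Number of distinct (x, y) locations among the 1,000 points
n_locations <- sum(counts > 0)
print(n_locations)

# Number of points stacked on the most crowded location
print(max(counts))

# Distribution of the x values alone, showing where the data is concentrated
print(table(x))
```

Because both variables can only take the integer values 0 to 10, there can never be more than 121 distinct locations, no matter how many points we draw.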