Distributing rows

As mentioned earlier, when you split a stream, you can either copy or distribute the rows. Copying creates a copy of the whole dataset for each output stream. Distributing means that the rows of the dataset are divided among the destination steps. Those steps run in separate threads, so distribution is a way to implement parallel processing.

When you distribute, the destination steps receive the rows in a round-robin fashion. For example, if you have three target steps, such as the three calculators in the following screenshot, the first row of data goes to the first target step, the second row goes to the second step, the third row goes to the third step, the fourth row goes back to the first step, and so on.
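The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of the idea, not how PDI implements it; the function names `copy_rows` and `distribute_rows` and the sample rows are made up for the example.

```python
from itertools import cycle


def copy_rows(rows, n_targets):
    """Copy: every target receives its own copy of every row."""
    return [list(rows) for _ in range(n_targets)]


def distribute_rows(rows, n_targets):
    """Distribute: rows are dealt to the targets in round-robin order."""
    targets = [[] for _ in range(n_targets)]
    # cycle() walks the targets over and over: first, second, third, first, ...
    for row, target in zip(rows, cycle(targets)):
        target.append(row)
    return targets


# Hypothetical sample data: five rows sent to three target steps.
rows = ["row1", "row2", "row3", "row4", "row5"]
print(copy_rows(rows, 3))
print(distribute_rows(rows, 3))
```

With three targets, the fourth row lands in the first target again, which is the round-robin behavior described above.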

Visually, when the rows are distributed, the hops leaving the step from which you distribute are plain; their look and feel doesn't change, as shown next:

Distributing rows
Throughout the book, we will always use the Copy option. To avoid being asked which action to take every time you create more than one hop leaving a step, you can set the Copy option by default. You do this by opening the PDI options window (Tools | Options... from the main menu) and unchecking the option Show Copy or Distribute dialog. Remember that to see the change applied, you will have to restart Spoon.

Once you have changed this option, the default method is Copy rows. If you want to distribute rows, you can change the action by right-clicking on the step from which you want to copy or distribute, selecting Data Movement... in the contextual menu that appears, and then selecting the desired option.

Another way to distribute is to change the number of copies of a step. The following step-by-step instructions create three copies of the Calculator step. The result is technically equivalent to the previous example:

  1. Right-click the Calculator step.
  2. From the contextual menu, select Change Number of Copies to Start...
  3. As Number of copies, specify 3 and close the window. The look and feel of the step changes as follows:
Changing the number of copies of a step
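Conceptually, running three copies of a step is like starting three worker threads that each receive a share of the rows. The following Python sketch illustrates that idea under simplifying assumptions; it is not PDI's implementation. The function `run_copies`, the `work` callback, and the use of `None` as an end-of-data marker are all hypothetical choices made for the example.

```python
import queue
import threading


def run_copies(rows, n_copies, work):
    """Run n_copies worker threads; rows are handed out round-robin."""
    inboxes = [queue.Queue() for _ in range(n_copies)]
    results = [[] for _ in range(n_copies)]

    def worker(i):
        # Each copy processes only the rows placed in its own inbox.
        while True:
            row = inboxes[i].get()
            if row is None:  # sentinel: no more rows for this copy
                break
            results[i].append(work(row))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_copies)]
    for t in threads:
        t.start()

    # Deal the rows to the copies in round-robin order.
    for index, row in enumerate(rows):
        inboxes[index % n_copies].put(row)
    for inbox in inboxes:
        inbox.put(None)

    for t in threads:
        t.join()
    return results


# Hypothetical "calculator": multiply each value by 10, using three copies.
print(run_copies(range(1, 8), 3, lambda x: x * 10))
```

Each copy keeps the order of its own rows, but nothing guarantees the order in which the copies finish relative to one another.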