
7. Team Efficiency

Making the Best Use of Everyone You’ve Got
Robert de Graaf, Kingsville, VIC, Australia

Throughout this book I’ve emphasized getting the best value from the work that you do, whether by carefully choosing the most valuable project or by making sure that the finished version of that project is appreciated to the fullest extent by the widest possible group of users.

In Chapter 6, we talked about marketing the data science team’s work to others in the organization. It could be said that we were looking at the beginnings of building a team brand.

In terms of this book, it made sense to look at team branding first, as it flows naturally from the projects you’ve already completed. But it’s also reasonable to wonder how to make the team work well together in the first place. This chapter addresses that.

Sometimes it seems that the word “team” is a punchline from a comedy about corporate life like The Office or Office Space. Rather than enforced “fun” with co-workers, in this chapter I want to talk about making the best use of the people around you. That is intuitively achieved by communicating better with those people and finding common ways of doing things.

Learning from Your Work

Data scientists often focus on technical learning, but the human element should not be neglected. Throughout this book we have discussed methods for communicating more effectively with the people around a data science team. However, we haven’t explicitly talked about how to work together as a team for the best results.

At the same time, we should take note of some of the indirect ways of improving team efficiency that we have touched upon. Some of the most important of these were in the last chapter, where we discussed ways of promoting the lessons learned from your data science process. Ensuring that everyone around you learns as much as possible from your work is certainly one of the most important aspects of working effectively in a team.

However, while it wasn’t said outright, the kind of learning that was going on as a result of these projects was usually technical learning. Implicit in our discussion was that most of what we were promoting was the direct result of your research or analysis.

However, it can be useful to make a point of noticing the other aspects of what you are trying to achieve, that is, the human elements of what you have been doing. If you don’t set out to notice these kinds of lessons at the beginning of your process, you are likely to miss them when they arise in a particular project.

To illustrate what I mean, consider the life cycle of projects as discussed throughout this book: a project begins with a customer or client who has a problem, and proceeds through understanding that problem correctly, proposing and implementing a solution, and then documenting and promoting the solution that has been implemented.

In Chapter 6, when we talked about promoting the data science team by sharing what you learned along the way, you could have reasonably inferred that what I was talking about was the technical findings that come directly from the data analysis and modeling process or otherwise from the process of attempting to implement the solution. Indeed those are the right things to share with the rest of the organization when you are trying to build a brand for the data science team.

However, at least within the data science team itself, the lessons you learn about how to talk to certain people or groups of people, or a new way to write a great whitepaper, are just as significant and useful as either the direct results of analysis or a technical lesson, such as a new way to prepare some type of variable.

Unfortunately, these lessons are less often captured in formal documentation or presented back to the organization via training sessions than technical lessons are. There could be a few different reasons for this, but it’s likely that creating documentation and presentations on human-centered issues is seen as more difficult, especially by technically oriented people.

If you are an Agile team that does regular retrospectives, you’ve already got a regular process that exists in part to make sure certain kinds of undocumented communication occur. The danger is that a narrow focus on the project itself means some of the biggest lessons can be missed.

The setup for a retrospective meeting is straightforward: you look back at recent activity and list what went well and what went wrong. In an Agile context, recent activity usually means the last sprint, but you don’t have to follow an Agile workflow to do retrospectives. Although Agile was the first to formalize the practice with a name, and it has become significantly more popular as a result, it’s a good idea for everyone.

A critical reason that retrospectives sometimes don’t deliver is that data scientists (or software developers) aren’t naturally comfortable talking about the human side of delivering projects and often find clever ways to turn discussions meant to be about the human side into technical discussions.

A common way things go awry is that people overuse project or Agile terminology, so that a discussion that ought to be about a human issue, such as poor communication, instead stays fixed on, or drifts toward, the technical outcome. For example, where a communication problem has resulted in someone receiving the wrong information, leading to a technical problem, the communication failure is the root cause but can be overlooked, and a thorough discussion of the technical consequences is substituted for a more productive discussion of how the communication fell apart.

If you aren’t in an Agile environment and therefore don’t have retrospectives, or are in an Agile environment and haven’t adopted them yet, you haven’t missed your chance: a retrospective doesn’t depend on an Agile environment. You might even be able to hold a more effective retrospective if you aren’t tethered to Agile terminology, for the reasons mentioned earlier.

The formats for holding a retrospective that you can find in Internet guides are effectively just a base for building a retrospective that works for you. Each is essentially a different way to facilitate a guided brainstorming session, and within that format you have plenty of room to steer the brainstorming where you think it most needs to go.

Don’t be satisfied with retrospective by numbers. That is, if you ask the team to list what worked and what didn’t, and after some back and forth between the plus column and the minus column all you’ve got is the same old stuff you had last time, couched in safe Agile jargon or the safe jargon of the last training session your company paid for, reject it and ask for more.

If it’s not quite as bad as that, but the only problems raised are purely technical, give the team a few prompts for the human side of things. It can work the other way too: if the problems are too much on the human side (which may well mean that the retrospective has descended into a simple blame game), steer the discussion back toward the technical.

The retrospective, in fact, is a flexible format: like a master stock or 12-bar blues, it can be adapted in different ways to suit the needs of the people using it.

More than anything else, the retrospective is a platform for the most important role a manager can play: the manager as coach, where coach is really just another word for teacher.

In fact, although the idea of coaching can conjure up “official” or company-mandated one-on-one coaching sessions, team coaching can sometimes be far more effective. Consider sports teams (as over-used as the comparison to business teams is): a huge amount of a coach’s work is done with the team as a group rather than one on one. There is a real opportunity to improve the team by using retrospectives as group coaching sessions, most obviously by identifying behavior you want the whole team to adopt.

A secondary benefit of making a conscious decision to lead the discussion in a retrospective is that you can call out and compliment examples of people improving the team by doing “glue work.” “Glue work” has been loosely defined as work that is essential for team success but not measured by the organization’s standard metrics.1 This sort of work can easily go unnoticed, and productive team members can miss out on due credit for the effort they put into increasing the whole team’s productivity.

This may be a different way of doing things for some whose instinct is to hang back and let the group’s thoughts flow. There is a time for this, but there is also a time for ensuring that the right issues are not just discussed but that the discussions result in practical suggestions.

Therefore, there is room for someone facilitating a retrospective to join in and guide the discussion toward the most important and relevant issues, and room to challenge what comes out of the discussion to ensure that what’s being decided is practical.

The end goal, of course, is to apply what you discover through the retrospective to change your practices and get better results. When you do that, you will want to ensure that the new practices are used by your team as consistently as is practical. That means finding ways to standardize what you do.

A Shared Way of Doing Things

One of the most common lessons for improving a team’s effectiveness is to have a common purpose that is understood, as far as possible, the same way by all the members of the group. This can be challenging for data scientists given the lack of an agreed definition of what a data scientist is to begin with. However, within a specific organization, you have at least some chance of being able to establish what a data scientist is in your immediate context.

Even at the practical level of a shared understanding of data science practices, the diverse range of backgrounds that data scientists may have creates a heightened need to ensure everyone in the team shares the same understanding of frequently used terminology, and the same overall approach.

Intuitively, the best way to share a vision is to create it together. Many guides to team cohesion suggest brainstorming the team vision together. We covered some of this in Chapter 1 when we discussed creating a team mission.

However, the practical side still needs attention. In many industries, manufacturing being a common example, highly standardized processes are created and enforced from the top of the organization down, and are then often resented by the operators who are expected to use them.

The situation is different for a data science team: its standard processes are only meant to be used by a relatively small number of people. That small number also means that, unlike in a large manufacturer, it is entirely practical to choose the standard practices as a team.

The key advantage of standardizing a process is that it reduces variability. In manufacturing, several other advantages flow from that, but in our data science context the most useful is predictability: with a predictable process you know what you are going to get and how long it takes to get it. That matters again for winning trust, because being predictable means you can make promises knowing you can keep them.

Note that you don’t need to be limited by the usual meaning of the word standardization. It can be wrongly assumed that standardization simply means creating a “black letter” process that everyone follows in exactly the same way, but there are ways to standardize that don’t take that approach.

Consider, for example, the Agile Manifesto,2 which is expressed as a series of preferences, rather than as predetermined choices. This idea can be extended to other areas to mean “try this first”—for example, you could develop a guideline in modeling that you always try logistic regression first, and then move toward more complex and less transparent models.
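To make the “try this first” guideline concrete, here is a minimal sketch, assuming a scikit-learn workflow and a binary classification target; the data, metric, and 0.02 improvement margin are hypothetical examples of what a team might agree on, not values prescribed here.

```python
# A minimal sketch of a "try the transparent model first" guideline.
# The dataset, metric, and improvement margin are hypothetical examples.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Step 1: always evaluate the transparent baseline first.
baseline = LogisticRegression(max_iter=1000)
baseline_auc = cross_val_score(baseline, X, y, cv=5, scoring="roc_auc").mean()

# Step 2: only reach for a less transparent model if it clearly beats the baseline.
candidate = GradientBoostingClassifier(random_state=0)
candidate_auc = cross_val_score(candidate, X, y, cv=5, scoring="roc_auc").mean()

MIN_IMPROVEMENT = 0.02  # hypothetical team-agreed margin
chosen = candidate if candidate_auc - baseline_auc >= MIN_IMPROVEMENT else baseline
print(f"baseline={baseline_auc:.3f}, candidate={candidate_auc:.3f}, "
      f"chosen={type(chosen).__name__}")
```

The point of expressing the preference this way is not the specific numbers but that everyone on the team starts from the same place every time.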

Another way to have “soft” standardization is to create boundaries. To use a similar example to the previous one, you could have a rule that for a certain class of problems you will never use k-nearest neighbors (or some other algorithm that you have found doesn’t produce good results for your typical kinds of data).

Other ways that you can standardize effectively within a data science environment might include the following:
  • Standard definitions of target variables: For example, do you have a standard place to start when considering targets based on time windows that make sense for your organization?

  • Standard terminology: Does Jill say independent variable and Joe say input?

  • Standard tools: You’ve probably decided on a standard platform/language, for example, R or Python or a commercial package. But if you’ve gone for R or Python, have you standardized on preferred libraries for particular common tasks?

All of these things will stick more easily if you decide them as a group. It’s also a good standard response to issues that come up repeatedly at retrospectives to ask, “Do we need to standardize on that?” That way you’ve got a live example in front of you.
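One concrete way to make such decisions stick is to keep them in a small, version-controlled file the whole team imports, rather than only in a document. The following is a minimal sketch, assuming Python; every name, library choice, and the 30-day churn window are hypothetical examples of what a team might agree on, not recommendations.

```python
# team_standards.py: a minimal sketch of codified team conventions.
# All names and values are hypothetical examples of what a team might agree on.
from dataclasses import dataclass

# Standard terminology: one agreed name per concept, used in code and documents.
GLOSSARY = {
    "feature": "an input variable (not 'independent variable' or 'input')",
    "target": "the variable a model predicts (not 'label' or 'dependent variable')",
}

# Standard tools: the default library for each common task.
PREFERRED_LIBRARIES = {
    "dataframes": "pandas",
    "modeling": "scikit-learn",
    "plotting": "matplotlib",
}

# Standard target definition: a shared starting point for time-window targets.
@dataclass(frozen=True)
class ChurnTarget:
    """A customer counts as churned if inactive for window_days after the cutoff."""
    window_days: int = 30  # hypothetical team default

DEFAULT_CHURN_TARGET = ChurnTarget()
```

Whether the conventions live in code or in a wiki matters less than the fact that they were chosen together and are easy to find.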

Standardization is often more honored in the breach than the observance—people agree that it’s generally a good thing but don’t do it because they have an overly stereotypical idea of what it is or how to do it.

If you break away from that stereotype, you can open the door to being able to standardize practices within your team in a way that you can control and that works for you and your team.

The Skills Your Team Needs

Data scientists are almost obsessed with the skills they need. This is probably because of the continued ambiguity about what a data scientist actually is. If a data scientist is, as they say, “someone who can code better than a statistician while knowing more about statistics than a coder,” where does the need to learn on those two fronts end?

This notion—sometimes referred to as the unicorn data scientist—often appears to be built on the assumption that a data science team works independently of other teams. Therefore, if they need to provision a database, they will end up doing it themselves. If they need to build a UI, they will end up doing it themselves.

This might be true for prototyping or developing a proof of concept, but in any company larger than the data science team itself, there are likely to be people whose specific job is to do these things. They might go by a variety of names, but there is likely to be someone whose role is a descendant of the database administrator, and that role will often have existed in your organization long before the data science function.

Where they do exist, you have an opportunity to move that work outside of the data science team, which simplifies the skills you need to maintain. Don’t worry, there will still be a long list of things that can only be done within the data science team. In fact, it’s exactly because that list is lengthy that you need to be careful to avoid doing things that you don’t have to do.

The knitting you need to stick to is the part of the job that can’t easily be done by other people, or that at least needs to be understood within the data science team to ensure the best results.

An example of the first of these is model evaluation—those skills just won’t be found anywhere else apart from in the data science team, so they had better exist there and be performed well.

On the other hand, although an understanding of the business will obviously be found elsewhere, and will frequently be better developed than what’s available in the data science team, it can’t be outsourced the way building an ETL pipeline can: a sufficient level of business understanding is essential within the data science team.

Therefore, when developing a skills inventory, you need to develop it at two levels: one internal to your team, the other comparing your team to the rest of the organization.

Also consider that data science job ads are often framed as a list of tools mastered or particular skill areas. Technical skills are often picked up relatively easily by people with the right way of thinking; the mindsets themselves are much harder to acquire.

As an example, you could divide people into “builders” and “analysts”: people who want to work on building data products vs. people who want to analyze data to understand how it applies to a problem. These are very different mindsets. Another kind of role is the “spanner.”3 Spanners, in some ways, are what many people think of as data scientists: they are people who span the gaps between builders and analysts, or between data scientists and data engineers. Again, although succeeding in this role requires skills across multiple areas, it is the spanning mindset that defines it.

Having a strong understanding of the skills that are available throughout other areas of your organization can take some of the pressure off your data science team generally, and help you to keep your list of desirable skills manageable when you need to hire.

Summary

Data projects need data science teams to complete them, but often data scientists focus on the technical details of their projects and don’t worry as much as they should about how their teams work, or even how well the data science team plays with the rest of the organization it’s a part of.

There is at least some attention to this human side in Agile. Retrospectives are meant to capture some of the human problems, though they are sometimes considered one of the more difficult aspects of Agile to get right (and you don’t need to be officially Agile to have retrospectives).

While there are many guides to running retrospectives, the key to discovering practical solutions is ensuring that the most relevant human problems in your process are actually discussed. When dealing with a group of people selected for technical skills, this will often require someone to lead the conversation onto the right topics.

Sharing standardized approaches and a standardized vision is also a crucial method of improving both team cohesion and team efficiency. In data science, there is arguably less standardization of training compared to a profession like medicine, so the need to take deliberate steps towards standardization is heightened.

Understanding the skills in other areas of your organization can help you keep the skill requirements in your own team manageable. Identify work that doesn’t need to happen within your own department, so you don’t need to constantly maintain those skills and can simplify your own processes. You can also narrow the range of roles needed within your own area and thereby improve cohesion.

However, standardization doesn’t have to mean the sort of strict processes that are often associated with the word. There are ways to provide guidelines that don’t impose onerous restrictions on your team members’ creative approaches to their work.

Team Efficiency Checklist

  • Have you created a process of team retrospectives that considers the right things in terms of sharing the lessons from previous projects?

  • Have you practiced ways of soliciting extra attention toward the human elements when doing retrospectives?

  • Does what you decide in your retrospectives carry over to what you do in your normal day?

  • Have you created standard terms for data science concepts and business concepts for use within your team, and a standardized understanding of priorities?

  • Have you created a team vision with input from everyone on the team?

  • Have you assessed the skills required in your data science team, considering the skills that are available elsewhere in your organization to ensure your team members develop the right skill set?