Management of many is the same as management of few. It is a matter of organization.
—Sun Tzu
This chapter discusses the importance of organizational structure and the ways in which it impacts a product’s scalability. We will discuss two key attributes of organizations: size and structure. These two attributes go a long way toward describing how an organization works and where an organization will likely have problems as a group grows or shrinks.
Some of the most important factors that organizational structure can affect are communication, efficiency, standards, quality, and ownership. Let’s take each factor and examine how an organization structure can influence it as well as why that factor is important to scalability. In this way, we can establish a causal relationship between the organization and scalability.
Communication and coordination are essential for any task requiring more than one person. A failure to communicate architectural designs, the causes and effects of outages, sources of customer complaints, or the reasons for and expected value of changes being promoted to production can be disastrous. Imagine a team of 50 people with no demarcation of responsibilities or structural hierarchy. The chance that teammates know what everyone is working on is remote. This lack of knowledge is fine until you need to coordinate changes. Without structure, whom do you ask for help? How do you coordinate your changes or know with whom you may be in conflict in making a change? These breakdowns in smooth communication, on most days, may cause only minor disruptions, such as having to spam all 50 people to get a question answered. But engineers will, over time, grow weary of the spam and start to ignore most messages. Ignoring messages will, in turn, mean that questions don’t get answered or, even worse, that critical functionality won’t work once merged into the code base. Was that failure the engineer’s fault for not being a superstar, or was it the organizational structure’s fault for making it impossible to communicate and collaborate clearly and effectively?
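To put a rough number on the 50-person example, the short sketch below counts how many distinct two-person pairings exist in a group with no structure. The formula n(n-1)/2 is a standard combinatorial count, not a measurement from this chapter, and the function name and sample sizes are illustrative assumptions.

```python
def pairwise_channels(team_size: int) -> int:
    """Return the number of distinct two-person pairs in a group.

    Each pair is one potential line of communication that has to work
    when nobody knows who owns what.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Illustrative sizes only; 50 matches the unstructured team in the text.
    for size in (2, 10, 50):
        print(f"{size:>2} people -> {pairwise_channels(size):>4} possible pairs")
```

For 50 people the count is 1,225 possible pairs, which is one way to see why "ask everyone" stops working long before a team reaches that size.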
Organizational efficiency increases when an organizational structure streamlines workflow and decreases when projects get mired down in unnecessary organizational hierarchy where significant increases in communication are necessary to perform work. In the Agile software development methodology, product owners are often seated alongside engineers to ensure that product questions get answered immediately and efficiently. If the engineer arrives at a point in the development of a feature that requires clarification to continue, she has two choices. The engineer could guess which way to proceed or she could ask the product owner and wait for the answer. If she asks the product owner, she must wait for a response. Time waiting may be spent on another project or potentially on non-value-added tasks (like playing games). Substantial waiting time could even cause the team to miss its commitments and unnecessarily carry stories into future sprints, thereby delaying or diminishing the potential return on investment.
Having the product owner’s desk beside the engineer’s desk increases efficiency by getting those questions answered quickly. The alternative to colocation—that is, degraded efficiency vis-à-vis idle time—is a double-edged sword. First, the cost to create some value increases. Second, as your resource pool becomes fully utilized, the company starts favoring short-term customer-facing features at the expense of longer-term scalability projects. Quarterly goals may be met in the short term, but technical debt piles up for a lack of resources and ultimately forces the company, through downtime, to stall new product features.
Organizational efficiency is also driven by the adoption of standards. An organization that does not foster the creation, distribution, and adoption of standards in coding, documentation, specifications, and deployment is sure to suffer from decreased efficiency, reduced quality, and increased risk associated with significant production issues. To see how this behavior evolves, consider an organization that is a complete matrix, where very few engineers (possibly only one) reside on each team along with product managers, project managers, and business owners. Without the adoption of common standards, it would be very simple for teams to significantly diverge in best practices. Standards such as commenting code and the discipline to do so might slip with some teams as they favor greater throughput, but at the expense of future maintainability. To avoid this problem, great organizations help engineers understand the value of established guidelines, principles, and collective norms.
Here’s another example: Imagine your organization has an architectural principle requiring any service to run independently on multiple servers. A team that disregards this standard will release a solution that does not scale horizontally and that has intolerably low levels of availability. Think that doesn’t happen? Think again: we see teams make this mistake all the time. The argument most commonly cited in support of this deviation from standards is that the service isn’t critical to the functioning of the product. Of course, if that’s the case, then why waste time building it in the first place? In this scenario, the team has inefficiently used engineering resources.
As described earlier, an organization that does not foster adherence to norms and standards in essence condones lowering the quality of the product being developed. A brief example of this would be an organization that has a solid unit test framework and process in place but is structured, through either size or team composition, so that it does not foster the acceptance and substantiation of this unit test framework. Perhaps one team might find it too tempting to disregard the parent organization’s request to build unit tests for all features and forgo the exercise. This decision is very likely to lead to poorer-quality code and, therefore, to an increased number of major and minor defects. In turn, the higher defect rate might lead to downtime for the application or cause other availability-related issues. The resulting increase in bugs and production problems will take engineering resources away from coding new features or scalability projects, such as sharding the database. All too often, when resources become scarce in this way, the tradeoff of postponing a short-term customer feature in favor of a long-term scalability project becomes much more difficult.
Ownership also affects the scalability and availability of a product. When many people work on the same code and there is no explicit or implicit ownership of elements of the code base, no one feels as if he or she owns the code. When this occurs, no one takes the extra steps to ensure others are following standards, building the requested functionality, or maintaining the high quality desired in the product. Thus, we see the aforementioned problems with scalability of the application that stem from issues such as less efficient use of the engineering resources, more production issues, and poor communication.
Now that we have a clear basis for caring about the organization from a scalability perspective, it is time to understand the basic determinants of all organizations—size and structure.
Consider a team of two people: The two know each other’s quirks, they always know what each other is working on, and they never forget to communicate with each other. Sounds perfect, right? Now consider that they may not have enough engineering effort to tackle big projects, such as the scalability project of splitting a database, in a timely manner; they do not have the flexibility to transfer to another team because each one probably has knowledge that no one else does; and they probably have their own coding standards that are not common among other two-person teams. Obviously, both small teams and large teams have their pros and cons. The key is to balance team size to get the optimal result for your organization.
An important point is that we are looking for the optimal team size for your organization. As this implies, there is not a single magic number that is best for all teams. Many factors should be considered when determining the optimal size for your teams, and the sizes may vary even from team to team within the same organization. If forced to give a direct answer to how many members should optimally be included in a team, we would provide a range and hope that suffices for a specific-enough answer. Although there are always exceptions even to this broad range of choices, our low boundary for team size is 6 and our upper boundary is 15. What we mean by low boundary is that if you have fewer than 6 engineers, there is probably no point in dividing them into separate teams. For the upper boundary, if there are more than 15 people on a single team, the size starts to hinder managers’ ability to actively manage and communication between team members starts to falter. Having given this range, we ask that you recognize there are always exceptions to these guidelines; more importantly, consider the following factors as aligned with your organization, people, and goals. You might recall from Chapter 1 that Amazon calls this upper boundary the “Two-Pizza Rule”: Never have a team larger than can be fed by two pizzas.
The first factor to consider when determining team size is the experience level of the managers. Managerial responsibilities will be discussed later in this chapter as a factor by itself, but for our purposes now we will assume that managers have a base level of responsibility, which includes the three following items: ensuring engineers are productive on value-creating projects, either self-directed or by edict of management; ensuring that administrative tasks, such as allocating compensation or passing along human resources information, are handled; and ensuring that managers stay current on the projects and problems they are running and can pass status information on the same along to upper management.
A junior manager who has just risen from the engineering ranks may find that, even with a small team of six engineers, the administrative and project management tasks consume her entire day. She may have time for little else because these tasks are new to her and demand far more of her time and concentration than they would of a more senior counterpart. New tasks typically take longer and require more intense concentration than tasks that have been performed over and over again. A manager’s experience, therefore, is a key factor in determining the optimal size for a given team.
Tenure of the team is a second factor to consider. Long-tenured and highly experienced teams require less overhead from management and less communication internally to perform their responsibilities. Both experience (time) at the company and experience in engineering are important. Long-tenured employees generally require lower administrative overhead (e.g., signing up for benefits, getting incorrect paychecks straightened out, finding help with certain tasks). Likewise, experienced engineers need less help understanding specifications, designs, standards, frameworks, or technical problems.
Of course, every individual is different and the seniority of the overall team must be considered. If a team has a well-balanced group of senior, midlevel, and junior engineers, they can probably operate effectively in a moderate-sized team. By comparison, a team of all senior engineers, say on an infrastructure project, might be able to accommodate twice as many individuals because they should require much less communication and be much less distracted by mundane engineering tasks. You should consider all of these issues when deciding on the optimal team size because doing so will provide a good indicator of how large the team can be and remain effective without overhead overwhelming productivity.
As mentioned earlier, each company has different expectations of the tasks that a manager should be responsible for completing. We decided that a base level of managerial responsibilities includes ensuring the following:
• Engineers are productive on value-adding projects.
• Administrative tasks take place.
• Managers are current on the status of projects and problems.
Obviously, many more managerial responsibilities could be assigned to the managers, including one-on-one weekly meetings with engineers, coding of features by managers themselves, reviewing specifications, project management, reviewing designs, coordinating or conducting code reviews, establishing and ensuring adherence to standards, mentoring, praising, and performance reviews. The more of these tasks that must be handled by the individual managers, the smaller the team should be, so as to ensure the managers can accomplish all the assigned tasks. For example, if one-on-one meetings are required—and we believe they should be—an hour-long meeting with each engineer weekly for a team of 10 engineers will consume 25% of a 40-hour week. The numbers can be tweaked—shorter meetings, longer work weeks—but the point remains that just speaking to each engineer on a large team can take up a very large portion of a manager’s time. Speaking frequently with team members is critical to be an effective manager and leader. Obviously, the number of tasks and the level of effort associated with these tasks should be considered as a major contributing factor when determining the optimal team sizes in your organization. An interesting and perhaps enlightening exercise for upper management is to survey the front-line managers and ask how much time they spend on each task for a week. As we indicated with the one-on-one meetings, it is surprisingly easy to fill up a manager’s week with deceptively “quick” tasks.
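The one-on-one arithmetic above is easy to rerun with your own numbers. The sketch below is a minimal illustration, assuming one meeting per engineer per week; the function name and default values are hypothetical and simply mirror the example in the text.

```python
def one_on_one_share(engineers: int,
                     meeting_hours: float = 1.0,
                     work_week_hours: float = 40.0) -> float:
    """Fraction of a manager's week spent in weekly one-on-one meetings.

    Assumes exactly one meeting per engineer per week.
    """
    if engineers < 0 or meeting_hours < 0:
        raise ValueError("engineers and meeting_hours must be non-negative")
    if work_week_hours <= 0:
        raise ValueError("work_week_hours must be positive")
    return engineers * meeting_hours / work_week_hours


if __name__ == "__main__":
    # The example from the text: 10 engineers, 1-hour meetings, 40-hour week.
    print(f"{one_on_one_share(10):.0%}")
    # Tweaked numbers: half-hour meetings for the same team.
    print(f"{one_on_one_share(10, meeting_hours=0.5):.1%}")
```

With the defaults, the result is 0.25, matching the 25% figure above. Halving the meeting length to 30 minutes brings the share down to 12.5%, which still leaves every other managerial task to fit into the remainder of the week.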
The previous three factors—experience of management, tenure of team members, and managerial responsibilities—are all constraints. Each limits the size of the team with the intent to reduce overhead and maximize value creation time of the team.
Unlike the first three factors, our final factor—the needs of the business—works to increase team size. Business owners and product managers, in general, want to grow revenue, fend off competitors, and increase their customer bases. To do so, they often need more and increasingly complex functionality. One of the main problems with keeping team sizes small is that large projects require (depending on the product development life-cycle methodology employed) many more iterations or more time in development. The net result is the same: Projects take longer to get delivered to the customer. A second problem is that increasing the number of engineers on staff requires increasing the number of support personnel, including managers. Engineering managers may take offense at being called support personnel, but in reality that is what management should be—something that supports the teams in accomplishing their projects. The larger the teams, the fewer managers per engineer are required.
As readers familiar with the concepts outlined in The Mythical Man-Month: Essays on Software Engineering by Frederick P. Brooks, Jr., will know, there is a limit to how far a project can be subdivided to expedite delivery. Even with this consideration, the relationships remain clear: The larger the team size, the faster projects can be delivered and the larger the projects that can be undertaken.
Let’s turn our attention from factors affecting team size to signs that the team size is incorrect. Poor communication, lowered productivity, and poor morale are all potential indicators of a team that has grown too large. Poor communication could take many forms, including engineers missing meetings, unresponsiveness to emails, missed specification changes, or multiple people asking the same questions.
Decreased productivity can be another sign of a team that is too large. If the manager, architects, and senior engineers do not have enough time to spend with the junior engineers, these newest team members will not produce as many features as quickly. Without someone to mentor, guide, direct, and answer questions, the junior engineers will have to flounder longer than they normally would. The opposite scenario might also be the culprit: Senior engineers might be too busy answering questions from too many junior engineers to get their own work done, thereby lowering their productivity. Some signs of decreased productivity include missing release dates, lower function or story points (if measured), and pushback on feature assignment. Function points and story points are two different methods that attempt to standardize the measurement of a piece of functionality. Function points assume the user’s perspective, whereas story points take on the engineer’s perspective. Engineers by nature are typically overly optimistic in terms of what they think they can accomplish; if they are pushing back on an amount of work that they have done in the past, this reluctance might be a clear indicator that they feel their productivity slipping.
Both of the preceding problems (poor communication and decreased productivity due to lack of support) can lead to the third sign of a team being too large: poor morale. When a normally healthy and satisfied team starts demonstrating poor morale, this disgruntlement serves as a clear indicator that something is wrong. Although poor morale may have many causes, the size of the team should not be overlooked. Similar to how you approach debugging, look for what changed last. Did the size of the team grow recently? Poor morale can be demonstrated by a variety of behaviors, such as showing up late for work, spending more time in the game room, arguing more in meetings, and pushing back more than usual on executive-level decisions. The reason for this is straightforward: As an engineer, when you feel unsupported, left out of the communication loop, or unable to succeed in your tasks, that state weighs heavily on you. Most engineers love a challenge, even a very tough one that takes days just to understand. But when an engineer realizes that he cannot solve the problem, he falls into despair. This is especially true of junior engineers, so watch for these behaviors to occur first in the more junior team members.
Conversely, if the team is too small, indicators to look for include disgruntled business partners, micromanaging managers, and overworked team members. In this situation, one of the first signs of trouble might be that business partners, such as product managers or business development, spend more time around the manager complaining that they need more products delivered. A team that is too small is simply unable to deliver sizable features quickly. Alternatively, instead of complaining directly to the engineering or technology leadership, disgruntled business leaders might focus their energy in a more positive manner by supporting budget requests for more engineers to be hired.
A trend toward micromanagement from a normally effective manager is a second worrisome sign. Perhaps the manager’s team is too small and she’s keeping busy by hovering over her team members, second-guessing their decisions, and asking for status updates about the request for a status update. If this is the case, it represents a perfect opportunity to assign the manager some other tasks that will serve the organization, help professionally develop the manager by expanding her focus, and give her team some relief from her constant presence. Ideas for special projects that might be assigned in this scenario include chairing a standards committee, leading an investigation of a new software tool for bug tracking, or establishing a cross-team mentoring program for engineers.
The third sign to look for when a team is too small is overworked team members. Most teams are extremely motivated by the products they are working on and believe in the mission of the company. They want to succeed and they want to do everything they can to help. This includes accepting too much work and trying to accomplish it in the expected time frame. If the team members start to leave later and later each evening or consistently work on weekends, you might want to investigate whether this particular team includes enough engineers. This type of overworking behavior is expected and even necessary for most startup companies, but working in this manner consistently month after month will eventually burn out the team, leading to attrition, poor morale, and poor quality. It is much better to take notice of the hours and days spent working on tasks and determine a corrective action early, as opposed to waking up to the problem when your most senior engineer walks into your office to resign.
Ignoring the symptoms summarized here can be disastrous. If they go unchecked, it’s almost inevitable that the company will experience unwanted attrition, slowing its ability to scale and deliver the product.
For the teams that are too small, adding engineers, although not necessarily easy, is straightforward. The much more difficult task is to split a team when it has become too large. Splitting a team incorrectly can have dire consequences—for example, confusion over code ownership, deterioration in communication, and increased stress from working for a new manager. Every team and every organization is different, so there is no perfect, standard, one-size-fits-all way of splitting teams. Instead, some factors must be taken into account when undergoing this organizational surgery to minimize the impact and quickly restore the team members to a productive and engaged existence.
Some of the things that you should think about when considering subdividing a team include how to split the code base, who the new manager will be, what level of involvement individual team members will have, and how the relationships with the business partners will change.
The first item to concentrate on is how to split the code or the work. As we will discuss in much more detail in Part III, “Architecting Scalable Solutions,” this might be a great opportunity to split the team as well as the code base into failure domains, that is, domains that limit the impact of failures by isolating services from one another.
The code that used to be owned by and assigned to a single team needs to be split between two or more teams. In the case of an engineering team, this division usually revolves around the code. The old team perhaps owned all the services around the administrative part of the application, such as account creation, login, billing, and reporting. Again, there is no standard way of doing this, but one possible solution is to subdivide the services into two or more groups: one group handling account creation and login, the other group handling billing and reporting services. As you get deeper into the code, you will likely hit base classes that require assignment to one team or the other. In these cases, we like to assign general ownership to one team or—even better—to one engineer; alerts can then be set up through the source code repository that inform the other team if anything changes in that particular file or class, keeping everyone aware of changes in their sections of the code.
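The ownership-plus-alerts arrangement described above can be sketched as a simple lookup. Everything in the example below is hypothetical: the directory names, the team names, and the idea of a separate "watchers" map are assumptions made for illustration, and a real setup would normally rely on whatever notification features your source code repository provides.

```python
from typing import Dict, Iterable, Set

# Hypothetical ownership map: path prefix -> the single owning team.
OWNERS: Dict[str, str] = {
    "services/account/": "identity-team",
    "services/login/": "identity-team",
    "services/billing/": "revenue-team",
    "services/reporting/": "revenue-team",
    "common/base/": "identity-team",  # shared base classes still get one owner
}

# Hypothetical watch list: teams that do not own a path but want alerts on it.
WATCHERS: Dict[str, Set[str]] = {
    "common/base/": {"revenue-team"},
}


def teams_to_notify(changed_files: Iterable[str]) -> Set[str]:
    """Return every team that should hear about a set of changed files."""
    notify: Set[str] = set()
    for path in changed_files:
        for prefix, owner in OWNERS.items():
            if path.startswith(prefix):
                notify.add(owner)                       # the owner always hears
                notify |= WATCHERS.get(prefix, set())   # plus any watchers
    return notify


if __name__ == "__main__":
    # A change to a shared base class reaches both teams.
    print(sorted(teams_to_notify(["common/base/entity.py"])))
```

The design choice worth noting is that each path has exactly one owner, while any number of teams can watch it. That keeps accountability clear without hiding changes from the teams that depend on the shared code.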
The next item to consider is the identity of the new manager. This is an opportunity to hire someone new from the outside or to promote someone internally into the position. Each option has both pros and cons. An external hire brings new ideas and experiences, whereas an internal promotion provides a manager who is familiar with all the team members as well as the processes. Because of the various advantages and disadvantages associated with each option, this is a decision you do not want to make lightly and might want to ponder for a long time. Making a well-thought-out decision is absolutely the correct thing to do, but taking too much time to identify a manager can cause just as many problems. The stress of the unknown can dampen employee morale and cause unrest. Make a timely decision; if that involves bringing in external candidates, do so as openly and quickly as possible. Dragging out the selection process and wavering between an internal candidate and an external candidate simply increases the stress on the team.
The last of the big three items to consider when splitting a team is how the relationship with the business will be affected. If a one-to-one relationship exists between the engineering team, quality assurance, product management team, and business team, it will obviously change when a team is split. A discussion with all the affected leaders should take place before a decision is reached on splitting the team. Perhaps all the counterpart teams will split simultaneously, or perhaps individuals will be reassigned to interact more directly along team lines. There are many possibilities for restructuring, but the most important consideration is that an open discussion should take place beyond the engineering and technology teams.
So far, we have covered the warning signs associated with teams that are too large and too small. We have also addressed the factors to consider when splitting teams. One of the major lessons that should be gleaned from this section is that the team size and changes to it can have tremendous impacts on every aspect of the organization, from morale to productivity. In turn, it is critical to recognize team size’s importance as a major determining factor of how effective the organization is in relation to scalability of the application.
The organizational structure refers to the actual layout of teams and how they relate to each other within an organization. This includes the separation of employees into departments, divisions, and teams as well as the management hierarchy that is used for command and control of the forces. Although there are as many different structures as there are companies, two basic structures, functional and matrix, have been in use for years, and a third, Agile, has recently come into favor. By understanding the pros and cons of the two time-honored structures as well as the new kid on the block, you will be able to choose among them. Alternatively, and perhaps as the more likely scenario, you can create a hybrid that best meets the needs of your company. This section covers the basic definition of each structure, summarizes the benefits and drawbacks of each, and offers some ideas on when to use each one. The most important lesson to be drawn here, however, is how to choose parts of each structure, and how to plan an evolution from one structure to another as your teams mature.
The functional organizational structure is the original structure upon which armies and industries were based. This structure, as seen in Figure 3.1, separates departments and divisions by their primary purpose or function. This strategy was often called a silo approach because each group of people was separated from other groups just as grain or corn would be separated into silos based on the type or grade of crop. In a technology organization, a functional structure results in the creation of separate departments to house engineering, quality assurance, operations, project management, and so forth. Along with this, there exists a management hierarchy within each department. Each group has a department head, such as the VP of engineering, and a structure within each department that is homogeneous in terms of responsibilities. Reporting to the VP of engineering are other engineering managers such as engineering directors, and reporting to them are engineering senior managers and then engineering managers. This hierarchy is consistent in that engineering managers report to other engineering managers and quality assurance managers report to other quality assurance managers.
The functional or silo organizational structure offers numerous benefits. Managers almost always rise through the ranks; thus, even if they are not good individual performers, they at least know what is entailed in performing the job. Unless there have been major changes in the field over the years, there is very little need to spend time explaining to bosses the more arcane or technical aspects of the job because they are well versed in it. Team members are also consistent in their expertise—that is, engineers work alongside engineers. With this structure, peers who are usually located in the next cube can answer questions related to the technical aspects of the job quickly. The entire structure is built along lines of specificity.
To use an exercise analogy, the functional organizational structure is like a golfer practicing on the driving range. The golfer wants to get better and perform well at golf and therefore surrounds himself with other golfers, and perhaps even a golf instructor, and practices the game of golf, all very specific to his goal. Keep this analogy in mind because we will use it to compare and contrast the functional organizational structure with the matrix structure.
Other benefits of the functional organizational structure, besides the homogeneity and commonality of management and peers, include simplicity of responsibilities, ease of task assignment, and greater adherence to standards. Because the organizational structure is extremely clear, almost anyone—even the newest members—can quickly grasp who is in charge of which team or phase of a project. This simplicity also allows for very easy assignment of tasks. In a waterfall software development methodology, the development phase is clearly the responsibility of the engineering team in a functional organization. Because all software engineers report up to a single head of engineering and all quality assurance engineers report to a single quality assurance head, standards can be established, decreed, agreed upon, and enforced fairly easily. All of these factors explain why the functional organization has for so long been a standard in both the military and industry.
The problems with a functional or silo organization include the lack of a single project owner and poor cross-functional communication. Projects almost never reside strictly within the purview of a single functional team. Most custom software development projects, especially services delivered via the Internet such as Software as a Service (SaaS) offerings, always require tasks to be accomplished by individuals with different skill sets. Even a simple feature request must have a specification drafted by the product owner, design and coding performed by the engineers, testing performed by the quality assurance team, and deployment by the operations engineers. Responsibility for all aspects of the project does not reside with any one person in the management hierarchy until you reach the head of technology, who has responsibility over the product managers, engineering, quality assurance, and operations staffs. This shifting of responsibility can be even further exacerbated when product management doesn’t report through the CTO. Obviously, having the CTO or VP of technology be the lowest-level person responsible for the overall success of the project can be a significant drawback. With this structure, when problems arise in the projects, it is not uncommon for each functional owner to place the blame for delays or cost overruns on other departments.
As simple as the functional organization is to understand, communication can prove surprisingly difficult across departments. As an example, suppose a software engineer wants to communicate to a quality assurance engineer about a specific test that must be performed to check for the proper functionality. The software engineer may spend precious time tracking up and down the quality assurance management hierarchy looking for the manager who is assigning the testing of this feature and then requesting the identity of the person to whom the testing work will be assigned so that the information can be passed along. More likely, the engineer will rely on established processes, which attempt to facilitate the passing along of such information through design and specification documents. As you can imagine, writing a line in a 20-page specification about testing leads to much more burdensome communication than a one-on-one conversation between the development engineer and the testing engineer.
Another challenge with the functional organizational structure is the conflict that arises between teams. When these organizations are tasked with delivering and supporting products or services that require cross-functional collaboration, such conflict is nearly inevitable. “This team didn’t deliver something on time” and “That other team delivered the wrong thing” are common refrains heard when functionally organized teams attempt to work together. Rarely is there an appreciation of each functional team’s challenges or contributions. For an engineer, both self-identity and social identity are tied in part to being seen as belonging to the “tribe” of engineers. Engineers want to belong and be accepted by their peers. Others who are different (quality assurance, product management, or even technical operations) are often seen as outsiders, are not trusted, and sometimes become the target of open hostility. At its most extreme, this conflict between groups of individuals who perceive one another as different manifests as discrimination.
One classic example of the ease with which individuals can turn against others seen as different is the “blue-eyed/brown-eyed” exercise. In 1968, after the assassination of Martin Luther King, Jr., Jane Elliott, a third-grade schoolteacher in Iowa, conducted an exercise in which she divided her class into groups of blue-eyed and brown-eyed students. She made up “scientific” evidence showing that eye color determined intelligence. Elliott told one group that they were superior, gave them extra recess, and had them sit in the front of the room; conversely, she did not allow the other group to drink from the water fountain. She observed that the children deemed “superior” became arrogant, bossy, and unpleasant to their “inferior” classmates. She then reversed her explanation and told the other group that they were really superior. Elliott reported that this new group did taunt the other group, much as they themselves had been taunted, but were much less nasty in doing so.
The benefits of the functional organization include commonality of managers and peers, simplicity of responsibility, and adherence to standards. The drawbacks include no single project owner and poor communications. Given these pros and cons, the scenarios in which you would want to consider a functional organizational structure are ones in which the advantages of specificity outweigh the problems of overall coordination and ownership. For example, organizations that follow waterfall processes can often benefit from functionally aligned organizations. The functional alignment of the organization neatly coincides with the phase containment inherent to waterfall methods.
In the 1970s, organizational behaviorists and managers began rethinking organizational structure. As discussed previously, although there are certain undeniable benefits to the functional organization, it also has certain drawbacks. In an effort to overcome these disadvantages, companies and even military organizations began experimenting with different organizational structures. The second primary organizational structure that evolved from this work was the matrix structure.
The principal concept in a matrix organization is the two dimensions of the hierarchy. As opposed to a functional organization, in which each team has a single manager and thus each team member reports to a single boss, the matrix includes at least two dimensions of management structure, whereby each team member may have two or more bosses. Each of these two bosses may have different managerial responsibilities—for instance, one (perhaps the team leader) handles administrative tasks and reviews, whereas the other (perhaps the project manager) handles the assignment of tasks and project status. In Figure 3.2, the traditional functional organization is augmented with a project management team on the side.
The right side of the organization in Figure 3.2 looks very similar to a functional structure. The big difference appears on the left side, where the project management organization resides. Notice that the project managers within the Project Management Organization (PMO) are shaded with members of the other teams. Project Manager 1 is shaded light gray along with Engineer 1, Engineer 2, Quality Assurance Engineer 1, Quality Assurance Engineer 2, Product Manager 1, and Product Manager 2. This light gray group of individuals constitutes the project team that is working together in a matrixed fashion. The light gray team project manager might have responsibility for the assignment of tasks and the timeline. In larger and more complex matrix organizations, many members of each team can belong to project teams.
Continuing with the project team responsible for implementing the new billing feature, we can start to realize the benefits of such a structure. The three primary problems with a functional organization are no project ownership, poor cross-team communication, and inherent affective conflict between organizations. In the matrix organization, the project team fixes all of these problems. We now have a first-level manager, Project Manager 1, who owns the billing project. This project team will likely meet weekly or more often, and will certainly have frequent email dialogues, which solves one of the problems facing the functional organization: communication. If the software engineer wants to communicate to the QA engineer that a particular test should be included in the test harness, doing so is as simple as sending an email or mentioning the need at the next team meeting. Thus this structure alleviates the need to trudge through layers of management in search of the right person.
We can now return to the golf analogy that we used in the discussion of the functional organization. Recall that we described a golfer who wants to get better and perform well at golf. To that end, he surrounds himself with other golfers, and perhaps even a golf instructor, and practices the game of golf—all activities very specific to his goal. This is analogous to the functional team where we want to perform a specific function very well, so we surround ourselves with others like us and practice only that skill. According to sports trainers, such specificity is excellent at developing muscle memory and basic skills—but to truly excel, athletes must cross-train. Operating under this concept, the golfer would move away from the golf course periodically and exercise other muscles such as through weight training or running. This kind of cross-training is similar to the matrix organization in that it doesn’t replace the basic training of golf or engineering, but rather enhances it by layering it with another discipline such as running or project management.
Astute readers who have cross-trained in the past might ask, “Can cross-training actually hinder the athlete’s performance?” In fact, if you are a golfer, you may have been warned against playing softball because it can wreak havoc with your golf swing. We will discuss this concept shortly, in the context of the drawbacks of matrix organizations.
If we have solved or at least dramatically reduced the drawbacks of the functional organizational structure through the implementation of the matrix, surely there is a cost for this improvement. In fact, while mitigating the problems of project ownership and communication, we introduce other problems involving multiple bosses and distraction from a person’s primary discipline. Reporting to two or more people (yes, matrix structures can get complex enough to require a person to participate on multiple teams) invariably causes stress because of differences in direction given by each boss. The engineer trapped between her engineering manager telling her to code to a standard and her project manager insisting that she finish the project on time faces a dilemma fraught with stress and the prospect of someone not being pleased by her performance. Additionally, the project team requires overhead, as does any team, in the form of meetings and email communications. This overhead does not replace the team meetings that the engineer must attend for her engineering manager, but rather takes more time away from her primary responsibility of coding.
As you can see, while solving some problems, the matrix organizational structure introduces new ones. This really should not be too shocking because that is typically what happens—rarely are we able to solve a problem without triggering consequences of another variety. The next question, given the cons of both the matrix organization and the functional organization, is clear: “Is there a better way?” We believe the answer is “yes” and it comes in the form of an “Agile Organization.”
Developing unique and custom SaaS solutions or enterprise software is a complex process requiring cross-functional collaboration between multiple skill sets. Within functionally organized teams delivering SaaS products, conflict and communication issues are inevitable. While the matrix organizational structure solved some of these issues by creating teams with diverse skill sets, it created other problems, such as individual contributors reporting to multiple managers who often have differing priorities. The unique requirements of SaaS, the advent of Agile development methodologies, and the drawbacks of both functional and matrixed organizations have led to the development of a new organizational structure that we call the Agile Organization.
In February 2001, 17 software developers, representing practitioners of various lightweight alternatives to document-driven software development methodologies, met in Snowbird, Utah, to discuss a lightweight methodology.3 What emerged was the Manifesto for Agile Software Development, which was signed by all of the participants. This manifesto, contrary to some opinions, is not anti-methodology, but rather a statement of four values, supported by 12 principles, that attempts to restore credibility to the term methodology by outlining how building software should be intensely focused on satisfying the customer. One of the principles focused on how teams should organize: “The best architectures, requirements, and designs emerge from self-organizing teams.” This idea of an autonomous, self-organizing team opened people’s minds to the possibility of a new organizational structure that wasn’t role based, but rather focused on satisfying the customer.
3. Kent Beck et al. “Manifesto for Agile Software Development.” Agile Alliance, 2001. Retrieved June 11, 2014.
Centralized hosting of business applications is not something new. Indeed, it dates back to the 1960s, with time-sharing on mainframes. Fast-forward to the 1990s and the rapid expansion of the Internet, and we find entrepreneurs marketing themselves as application service providers (ASPs). These companies hosted and managed applications for other companies, with each customer getting its own instance of the application. The reputed value was reduced costs for the customers due to the ASP being extremely skilled at hosting and managing a particular application. By the early 2000s, another shift had come about—the emergence of Software as a Service. Supposedly this term first appeared in an article called “Strategic Backgrounder: Software as a Service,” internally published in February 2001 by the Software & Information Industry Association’s (SIIA) eBusiness Division.4 Like most things in the technology world, the definition of SaaS can be debated, but most agree that it includes a subscription-pricing model whereby customers pay based on their usage rather than a negotiated licensing fee, and that the architectures of these applications are usually multitenant, meaning that multiple customers use the same instance of the software.
4. Strategic Backgrounder: Software as a Service. Washington, DC: Software & Information Industry Association, February 28, 2001. http://www.slideshare.net/Shelly38/software-as-a-service-strategic-backgrounder. Accessed April 21, 2015.
With this shift toward providing a service rather than a piece of software, technologists began thinking about being service providers rather than software developers. Along with this evolving mindset came other ideas about the expected quality and reliability of delivery of these services. Traditionally, when we think of a service, we conjure up ideas of household services such as water, sanitation, and electricity. With such services, we have very high expectations of their quality and dependability. When we turn a faucet, we expect clear, potable drinking water to spew forth every time. When we flip a light switch, we expect electricity to flow, with very little variation in current at our disposal. Why shouldn’t we expect the same from software services? As customer expectations related to SaaS increased, technology companies began to react by attempting to offer more reliable services. Unfortunately, their traditional organizations kept getting in the way of meeting this standard. What resulted was even greater conflict between functional teams and slower delivery.
The last piece of the puzzle that allowed the Agile Organization to be conceptualized was the realization that the organizational structure of a technology team matters greatly to the quality, scalability, and reliability of the software. The authors of this book arrived at this conclusion only after several years of scalability engagements with clients. As technologists, when we started consulting, we were certain that our efforts would be focused on technology and architecture. After all, couldn’t every technology problem be solved through technology? Yet, in engagement after engagement, the conversation kept returning to organizational issues such as conflict between teams or individuals reporting to multiple managers and not understanding their priorities. Eventually, we remembered that people develop technology and, therefore, people are important to the process. Thus began our journey to understanding that a truly scalable system requires the alignment of architecture, organization, and process. The culmination of this epiphany was the first version of The Art of Scalability, published in 2009. We certainly don’t claim to have been the first to recognize the importance of organizations, nor do we mean to suggest that this book was the most influential in propagating this message. However, given that both the authors had previously held executive-level operating roles at high-tech companies, our lack of understanding of this key point serves as an indicator of what the technology community in general thought during the early to mid-2000s.
The result of these three factors was technology companies testing various permutations of organizational structures in an attempt to improve the quality and reliability of the software services they were offering. As with the functional and matrix organizations, many variations of this new organization structure are possible. For simplicity, we label any organization that is cross-functional and aligned to the architecture of the services that are provided as an Agile Organization. The Agile Organization, as shown in Figure 3.3, creates teams that are completely autonomous and self-contained. These teams own a service throughout the entire life cycle—from idea inception, to development, to support of the service in production. Directors or VPs of cross-functional, Agile teams have replaced the typical managerial roles such as VP of Engineering.
In Part III, “Architecting Scalable Solutions,” we will discuss the concepts of splitting a system into small services that together make up the larger system. For now, a simple example of an ecommerce system that can be split into user services such as search, browse, and checkout functionality will suffice. In this scenario, a company that is utilizing an Agile Organization structure would have three different Agile teams, each responsible for one of the services. These teams would have all of the personnel necessary to manage, develop, test, deploy, and support this service assigned to the team on a full-time basis. This example scenario is depicted in Figure 3.4, where each Agile team is aligned to a user service and includes all the personnel skill sets required.
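One way to make this alignment concrete is to write down each team, the service it owns, and the skill sets it carries, and then check that no team depends on an outside group for any part of the life cycle. The following is a minimal sketch only: the four skill labels, the team rosters, and the names `AgileTeam` and `staffing_gaps` are illustrative assumptions, not something prescribed by the Agile Organization itself.

```python
from dataclasses import dataclass, field

# Assumed skill sets, loosely matching "manage, develop, test, deploy, and support".
REQUIRED_SKILLS = {"product", "engineering", "qa", "operations"}


@dataclass
class AgileTeam:
    """A team aligned to exactly one user-facing service."""
    service: str
    members: dict = field(default_factory=dict)  # member name -> skill set

    def missing_skills(self) -> set:
        # Skills the team would have to borrow from outside its boundary.
        return REQUIRED_SKILLS - set(self.members.values())


def staffing_gaps(teams):
    """Map each service to the skills its team lacks (complete teams are omitted)."""
    gaps = {}
    for team in teams:
        missing = team.missing_skills()
        if missing:
            gaps[team.service] = sorted(missing)
    return gaps


# Hypothetical rosters for the three ecommerce services.
TEAMS = [
    AgileTeam("search", {"Ana": "product", "Ben": "engineering",
                         "Cal": "qa", "Dee": "operations"}),
    AgileTeam("browse", {"Eli": "product", "Fay": "engineering",
                         "Gus": "qa", "Hal": "operations"}),
    AgileTeam("checkout", {"Ian": "product", "Jo": "engineering",
                           "Kai": "qa"}),  # no operations skill on the team
]

print(staffing_gaps(TEAMS))  # {'checkout': ['operations']}
```

In this sketch the checkout team would still have to cross an organizational boundary to deploy and support its service, which is exactly the dependency the structure is meant to remove.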
Before we dive into the practical application of the Agile Organization, we will explore the academic theory underlying why the Agile Organization works for decreasing affective conflict and, therefore, improving team performance. First, we need to understand how we would measure a team’s performance. As a practical measurement, we would like to see the team increase the quality and availability of the services that it provides. In academic research, we can use the term innovation to represent the value-added output of a team. Innovation has been identified as a criterion that encompasses effective performance. The question that many of us have asked for years is this: “Which factors help teams increase their innovation?” Through extensive qualitative and quantitative primary research, triangulated with extant research, we now know some of the factors that drive innovation. In the discussion that follows, we will explore each of these factors in turn.
As discussed previously, conflict can be either productive or destructive, depending on the type. Cognitive conflict brings diverse perspectives and experiences together. Brainstorming sessions are often seen as examples of cognitive conflict, whereby teams attempt to gather a set of alternatives superior to what the team members could arrive at in isolation. It’s likely that each of us has participated in at least one brainstorming session that was incredibly productive. The session probably started with a leader setting the agenda, making sure everyone knew each other, establishing some ground rules around respect and time limits, and so on. What ensued might have been a 60- to 90-minute session in which people built upon each other’s ideas. Not everyone agreed, but ideas were exchanged in a respectful manner such that roadblocks to solutions were raised and ways around them were discussed. This collaboration resulted in creative and innovative ideas that likely no one would have generated on their own. Everyone probably left the meeting feeling glad they had participated and that the time investment was well worth it. We also probably wished that all of our other meetings that week went as smoothly.
The opposite of cognitive conflict is affective conflict. This so-called bad or destructive conflict is role based and revolves around the questions of who should do a task or how it should be done. Affective conflict places physical and emotional stress on team members. Whereas cognitive conflict can increase a team’s innovation, affective conflict decreases a team’s innovation.
Affective and cognitive conflicts are not the only factors that influence innovation. If we think back to our recollected brainstorming session and mentally look around the meeting room, the participants probably represented diverse backgrounds. Perhaps one person was from engineering, while another individual came from product management. Some of the individuals were only a few years out of college, while others had been in the workforce for decades. This experiential diversity can increase both affective and cognitive conflict. Sometimes people with different backgrounds tend to butt heads because they approach problems or opportunities from such different perspectives. If this were always the case, we would staff teams with people of similar backgrounds—but alas, dynamics between people aren’t that simple. Diversity of experience also can promote diversity of thoughts, leading to ideas and solutions that go far beyond what a single person could ever achieve. Thus experiential diversity increases both affective and cognitive conflict. The key for us as leaders attempting to increase our teams’ innovation is to minimize how experiential diversity impacts the affective components and to maximize how it impacts the cognitive aspects.
Another type of diversity that is important in terms of influencing innovation is network diversity, which is a measure of the extent to which individuals on a team have different personal or professional networks. Network diversity becomes important with regard to innovation because almost all projects run into roadblocks. Teams with diverse networks are better able to identify these potential roadblocks or problems early in the project. We have all probably been on a team where one individual, who came from a different background than the rest and still had friends from that past, was able to provide clear advice on potential problems. Perhaps you were working on an IT project to implement a new software package in a manufacturing plant. One of your team members who had been a summer intern on the manufacturing floor was able to give the team a heads-up that one of the manufacturing lines could be shut down to install the new software only on certain days of the week. This type of network diversity on a team brings greater knowledge to the project. When a team actually does encounter roadblocks, those teams with the most diverse networks are better prepared to find resources outside of their teams to circumvent those obstacles. Such resources might take the form of upper-level managers who can provide support or even additional QA engineers who can help test the software.
Another factor that is widely credited with increasing innovation is the team’s sense of empowerment. If a team feels empowered to achieve a goal, it is much more likely to achieve it. An interesting counterexample can be seen in many military selection courses, where decreasing empowerment is intended to decrease a candidate’s motivation to complete tasks. Decreasing a team’s or individual’s sense of empowerment to accomplish a goal tests one’s mettle or fortitude. One technique for achieving this aim is to move the goal line.
Imagine yourself in one of the competitive military selection courses, such as officer candidate school, which selects and trains enlisted soldiers to become commissioned officers to lead soldiers on the battlefield. To get to this level, you must have first established yourself as a stellar enlisted soldier, achieving the highest marks on your reviews. You must have also competed in a variety of psychological, physical, and mental tests over the course of months or even years. You’re mentally sharp and physically fit; you’re confident that you can overcome any obstacle and accomplish any task put in front of you.
This morning you wake up before the sun comes up, don your physical training uniform, lace up your shoes, and head out for formation. On the agenda is a physical test of your running endurance. Being a pretty decent runner, you’re thinking this test should be a breeze. The wrench that the instructors throw at you is not telling you where the finish line is located. You run out a couple of miles and turn around, heading back to the starting line. Most people would think that the finish line would be where the start line was. However, upon arriving back at the start line, you turn around and head back out on a different path. After a few more miles, you again turn around and head back to the start line. Surely, this will be the finish line. Not so fast: As the instructors approach the start line, they turn and head back out on yet another path.
It’s at this point that the candidates begin to break and throw in the towel. Not knowing where the finish line or goal is, people become disheartened and start doubting their ability. The opposite of this is, of course, true as well: When individuals or teams believe they are empowered with the resources they need to accomplish something, their innovation increases. Returning to our officer candidate school example, if the soldiers believe they are empowered by being in the proper physical shape, outfitted in the proper running attire, and armed with a clear understanding of their goals, they are much more likely to achieve them.
There is one more factor that we need to understand in our model of innovation, and it deals directly with the organizational structure. This factor is organizational boundaries, which simply means individuals are on different teams. Between all teams are boundaries that separate them. Some boundaries are narrow, such as those between similar teams, while others are quite wide, such as those found between very different teams (e.g., product managers and system administrators). In our model of innovation, organizational boundaries, across which collaboration must happen, increase affective conflict. Therefore, the more organizational boundaries that a team must cross to coordinate with others for the accomplishment of a goal, the less innovation that the team will demonstrate. We touched upon the reason for this outcome earlier. As engineers, our self-identity is tied in part to being seen as belonging to the “tribe” of engineers. We want to belong and be accepted by our peers. Others who are different (quality assurance, product management, or even technical operations personnel) are often seen as outsiders and people who should not be trusted. It has been hypothesized that survival strategies may constitute a homo homini lupo5 situation in which outsiders are distrusted as hostile competitors for scarce resources. Distrust toward outsiders forces individuals into rigid in-group discipline.6 This sort of emotional aloofness and distrust of outsiders has been observed in many groups.7
5. “Man is a wolf to [his fellow] man.”
6. Christian Welzel, Ronald Inglehart, and Hans-Dieter Klingemann. The Theory of Human Development: A Cross-Cultural Analysis. Irvine, CA: University of California–Irvine, Center for the Study of Democracy, 2002. http://escholarship.org/uc/item/47j4m34g. Accessed June 24, 2014.
7. Peter M. Gardner. “Symmetric Respect and Memorate Knowledge: The Structure and Ecology of Individualistic Culture.” Southwestern Journal of Anthropology 1966;22:389–415.
This brings us to our final theoretical model for team innovation. Figure 3.5 depicts the complete model, in which the factors of network diversity, sense of empowerment, and cognitive conflict increase innovation. Affective conflict decreases innovation. Experiential diversity increases both affective and cognitive conflict, while organizational boundaries increase just the affective conflict. Now, armed with this fully detailed model, we can clearly articulate why the functional and matrix organizational structures decrease innovation whereas the Agile Organization increases it.
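The relationships in this model can be written down as a small signed graph, which makes it easy to trace how a given factor reaches innovation. The sketch below encodes only the directions stated in the text (increase or decrease); it assumes no magnitudes, and the identifiers `EFFECTS` and `path_signs` are ours, chosen for illustration.

```python
# Signed edges of the innovation model: +1 means "increases", -1 means "decreases".
EFFECTS = {
    ("network_diversity", "innovation"): +1,
    ("sense_of_empowerment", "innovation"): +1,
    ("cognitive_conflict", "innovation"): +1,
    ("affective_conflict", "innovation"): -1,
    ("experiential_diversity", "affective_conflict"): +1,
    ("experiential_diversity", "cognitive_conflict"): +1,
    ("organizational_boundaries", "affective_conflict"): +1,
}


def path_signs(factor, target="innovation"):
    """Return the sign of every causal path from `factor` to `target`.

    The sign of a path is the product of the signs of its edges. The model
    is acyclic, so the recursion always terminates.
    """
    signs = []

    def walk(node, sign):
        if node == target:
            signs.append(sign)
            return
        for (src, dst), edge_sign in EFFECTS.items():
            if src == node:
                walk(dst, sign * edge_sign)

    walk(factor, 1)
    return sorted(signs)


print(path_signs("organizational_boundaries"))  # [-1]
print(path_signs("experiential_diversity"))     # [-1, 1]
print(path_signs("network_diversity"))          # [1]
```

Organizational boundaries reach innovation only through a negative path, whereas experiential diversity has one negative and one positive path, which is why the leader's task is to dampen its affective effect while encouraging its cognitive one.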
In a functional organization, individuals are organized by their skill or specialty. Almost all projects require coordination across teams. This is especially true of SaaS offerings, where the responsibilities of not only developing and testing the software but also hosting and supporting it fall on the company’s technology team. With shrink-wrapped software or software that the customer installs and supports, part of this responsibility is shared with the customer. In today’s more popular SaaS model, the entirety of this responsibility belongs to companies’ technology teams. As shown in Figure 3.6, this causes affective conflict between teams. Organizational boundaries increase the affective conflict, resulting in a decrease in innovation.
In matrix organizations, where individuals have multiple managers, often with different priorities, we see the “moving goal line” problem. Attempting to please two masters often leads to unclear goals and teams with a reduced sense of empowerment.
Agile Organizations have neither of these problems. They break down the organizational boundaries that functional organizations struggle with, and they empower teams, eliminating the problem that matrix organizations face.
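A rough way to compare structures on the boundary dimension is to count the hand-offs between different teams that a single feature requires. The sketch below is illustrative only: it assumes the four delivery steps named earlier in the chapter (specification, coding, testing, deployment), and the team names and the function `boundaries_crossed` are hypothetical.

```python
def boundaries_crossed(steps, team_of):
    """Count hand-offs between different teams along an ordered list of steps."""
    teams = [team_of[step] for step in steps]
    # Compare each step's team with the next step's team.
    return sum(1 for current, following in zip(teams, teams[1:])
               if current != following)


# Delivery steps for a simple feature request.
STEPS = ["specify", "build", "test", "deploy"]

# Functional organization: each step belongs to a different functional team.
FUNCTIONAL = {"specify": "product", "build": "engineering",
              "test": "qa", "deploy": "operations"}

# Agile Organization: one cross-functional team owns every step for its service.
AGILE = {step: "checkout-team" for step in STEPS}

print(boundaries_crossed(STEPS, FUNCTIONAL))  # 3
print(boundaries_crossed(STEPS, AGILE))       # 0
```

The count says nothing about how wide each boundary is; it only shows that the cross-functional team removes the hand-offs where, according to the model, affective conflict arises.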
The primary benefit of an Agile Organization is the increased innovation produced by the team. In a SaaS product offering, this increased innovation is often measured in terms of faster time to market with features, greater quality of the product, and higher availability. Such benefits are often realized by teams that embrace the Agile Organization structure. As expected, the increased innovation is driven by the improvement in factors such as conflict, empowerment, and organizational boundaries.
When teams align themselves according to services, are autonomous, and have cross-functional composition, the result is a significant decrease in affective conflict. The team members have shared goals and no longer need to argue about who is responsible or who should perform certain tasks. The team wins or loses together. Everyone on the team is responsible for ensuring the service they provide meets the business goals, which include high-quality, highly available functionality.
While the Agile Organization does a great job at improving a team’s innovation, there are downsides to this organizational structure. We’ll discuss these drawbacks next.
The primary complaint that we hear about an Agile Organization doesn’t come from engineers, product managers, or any team members; rather, it is voiced by management. When you turn an organizational chart on its side, you lose some of the traditional management roles, such as “VP of Engineering.” Of course, you can still have a VP of Engineering position, but in practice that person either will be the engineering chapter leader or will lead cross-functional Agile teams. This elimination of the upper-level management role sometimes bothers managers or directors who are used to reporting to someone in this position, or who aspire to hold it themselves.
The other con with the Agile Organization is that for it to work as intended, teams need to be aligned to the user-facing services in the architecture. When Agile teams overlap in terms of ownership or responsibility of the code base, the teams cannot act autonomously. This results in teams that are less innovative as measured by the delivery of high-quality functionality and highly available services. Of course, this does not mean you cannot have Agile teams that provide common core services used by other teams. Often these teams work like an open source model, providing libraries or services to other Agile teams.
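Overlapping ownership is straightforward to detect once ownership is recorded explicitly. The following sketch assumes a simple mapping from team to the components it owns; the component and team names, and the function `overlapping_ownership`, are hypothetical.

```python
from collections import defaultdict


def overlapping_ownership(ownership):
    """Return the components claimed by more than one team.

    `ownership` maps a team name to the components that team owns.
    """
    owners = defaultdict(set)
    for team, components in ownership.items():
        for component in components:
            owners[component].add(team)
    return {component: sorted(teams)
            for component, teams in owners.items() if len(teams) > 1}


# Hypothetical ownership map for the ecommerce example.
OWNERSHIP = {
    "search-team": ["search-api", "search-index"],
    "browse-team": ["catalog-pages", "cart-widget"],
    "checkout-team": ["payment-flow", "cart-widget"],
}

print(overlapping_ownership(OWNERSHIP))
# {'cart-widget': ['browse-team', 'checkout-team']}
```

A shared core service does not count as an overlap in this sense as long as it has a single owning team; other teams consume it rather than co-own it.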
One outcome people often believe is a disadvantage of the Agile Organization is having software developers, DevOps, QA, and possibly even product managers on call for production issues with services. In our opinion, this is actually a benefit of the organizational structure, because it provides a feedback loop to the team. If the team members are constantly being woken up at 2 a.m. to fix a production issue, they will quickly learn that producing high-quality services that are scalable leads to a better night’s sleep. While these lessons might be painful in the short term, they are game changers in the long run in terms of improving a team’s performance.
While no organization structure is perfect, we believe that the Agile Organization is an ideal choice for many companies struggling with poor service delivery, high amounts of conflict, unmotivated employees, and a general lack of innovation.
In this chapter, we highlighted the factors that an organizational structure can influence and showed how they are also key factors in application or Web services scalability. We established a link between the organizational structure and scalability to point out that, just like hiring the right people and getting them in the right roles, building a supporting organizational structure around them is important. We discussed the two determinants of an organization: team size and team structure.
In regard to the team size, size truly matters: A too-small team cannot accomplish enough; a too-large team has lower productivity and poorer morale. Four factors—management experience, team member tenure in the company and in the engineering field, managerial duties, and the needs of the business—must be taken into consideration when determining the optimal team size for your organization. A variety of warning signs should be monitored to determine if your teams are too large or too small. When teams are too large, poor communication, lowered productivity, and poor morale may emerge as symptoms. When teams are too small, disgruntled business partners, micromanaging managers, and overworked team members may be apparent. Growing teams is relatively straightforward, but splitting up teams into smaller teams entails much more. When splitting teams, topics to be considered include how to split the code base, who will be the new manager, which level of involvement individual team members will have, and how the relationship with the business partners will change.
The three team structures discussed in this chapter were functional, matrix, and Agile Organization. The functional structure—the original organizational structure—essentially divides employees based on their primary function, such as engineering or quality assurance. The benefits of a functional structure include homogeneity of management and peers, simplicity of responsibilities, ease of task assignment, and greater adherence to standards. The drawbacks of the functional structure are the lack of a single project owner and poor cross-functional communication. The matrix structure starts out much like the functional structure but adds a second dimension, consisting of a new management structure. It normally includes project managers as the secondary dimension. The strengths of the matrix organization are its resolution of the project ownership and communication problems; its weaknesses include the presence of multiple bosses and distraction from a person’s primary discipline. Finally, the Agile Organization improves teams’ innovation as measured by time to market, quality of features, and availability of services.
• Organizational structure can either hinder or help a team’s ability to produce and support scalable applications.
• Team size and team structure are the two key attributes of an organization.
• Teams that are too small do not provide enough capacity to accomplish the priorities of the business.
• Teams that are too large can cause a loss of productivity and degrade morale.
• The two traditional organizational structures are functional and matrix.
• Functional organizational structures provide benefits such as commonality of management and peers, simplicity of responsibilities, ease of task assignment, and greater adherence to standards.
• Matrix organizational structures provide benefits such as project ownership and improved cross-team communication.
• Agile Organization structures, especially those aligned to services and architecture, provide increased innovation as measured by faster time to market, higher-quality functionality, and higher availability of services.