Appendix A:
MI-CIS-OIT-003

MEMO

Office of Information Technology

Management Instruction for Applying Lean-Agile-DevOps Principles at USCIS

Effective Date: 1 April 2017

Management Instruction: CIS-OIT-003

I. Purpose

This Management Instruction (MI) establishes the United States Citizenship and Immigration Services (USCIS) policies, procedures, requirements, and responsibilities for the use of Lean Thinking, Agile Development, and DevOps capability. It supersedes MI CIS-OIT-001 (Agile Development) and MI CIS-OIT-002 (Team-Managed Deployment Onboarding) and should be considered the current guidance for delivering Information Technology (IT) solutions within USCIS.

Lean, Agile, and DevOps methods enable the delivery of fit-for-purpose IT solutions with very short lead times, as measured from identification of a mission need to the delivery of IT capabilities meeting that need. These methods have been shown to produce IT solutions that:

• Satisfy customers

• Maintain ongoing operational capabilities

• Are high quality, thoroughly tested, and technically excellent

• Rapidly adapt even in an uncertain operating environment

• Continuously improve time-to-mission-value

This MI increases emphasis on DevOps thinking to improve USCIS IT service delivery agility and increase the business value of IT projects. DevOps strategies should be used to deploy software more frequently and reliably, act faster on feedback from system operations, and establish a culture of continuous experimentation and learning. These methods have been shown to:

• Enhance quality, reliability, and security of products and services over the long term

• Decrease business risk by lowering change failure rates and system downtime

• Improve outcomes and experiences for system stakeholders, developers, operations engineers, and end users

• Reduce total investment costs

Lean, Agile, and DevOps methods are consistent with the Department of Homeland Security (DHS) Acquisition Management Directive (MD) 102 and the DHS Systems Engineering Life Cycle (SELC), the Digital Services Playbook, “Modular First” guidance from the DHS Chief Information Officer (CIO), the Federal Chief Information Officer’s 25 Point Implementation Plan, and the Office of Management and Budget (OMB) Modular Contracting Guidance. The “modular and incremental” approach encouraged in these documents requires that the government continuously learn and improve at delivering low-cost, low-risk IT solutions. In order to monitor these outcomes, this MI includes governance designed to provide rich, ongoing visibility into USCIS system development, delivery, and operations.

II. Scope

This MI applies to all employee and contractor teams involved in the planning, development, and deployment of software and systems throughout USCIS.

III. Authorities

The following laws, regulations, orders, policies, directives, and guidance authorize and govern this Management Instruction:

1. DHS MD 102-01 Acquisition Management Directive, and associated Instructions and Guidebooks

2. Section 5202 of the Clinger-Cohen Act of 1996

3. OMB Circulars A-130 and A-11

4. 25 Point Implementation Plan to Reform Federal Information Technology Management (U.S. Chief Information Officer, December 9, 2010)

5. Contracting Guidance to Support Modular Development (OMB, June 14, 2012)

6. Memorandum on Agile Development Framework for DHS, by DHS CIO, Richard A. Spires, issued June 1, 2012

7. Digital Services Playbook (https://playbook.cio.gov)

IV. Policy, Procedures, and Requirements

Except in cases where a waiver is granted by the USCIS CIO, all systems development and maintenance projects at USCIS will follow this Lean-Agile-DevOps MI. Such projects include custom software development, Commercial Off-The-Shelf (COTS) software integration and configuration, business intelligence, and reporting capabilities. Where appropriate, Lean-Agile-DevOps approaches may be used for other IT and non-IT projects. For the purposes of this MI, projects will be considered in compliance if they achieve the outcomes specified in Sections A and B. To achieve these outcomes, teams and programs may elect to use practices from the set of Generally Accepted Agency Practices listed in the Appendix and work with Independent Verification and Validation (IV&V) teams to ensure that they fulfill the requirements of MI CIS-OIT-004 (Agile Independent Verification and Validation).

USCIS Office of Information Technology (OIT) management will ensure that appropriate training, coaching, and tools are available to facilitate the success of all projects. Teams are encouraged to work with OIT support groups to implement this MI in a manner appropriate for their particular context.

A. Lean-Agile and DevOps Approaches Defined

Lean can be characterized as “the art of maximizing work not done” by increasing flow and reducing waste. Leanness is measured for IT projects by the lead time from identification of a need to the time a corresponding capability is delivered. Waste is defined as work that does not add enough value to justify itself, such as handoffs, delays, and unnecessary intermediate work products. Lean IT projects at USCIS continuously improve efficiency and responsiveness to mission needs on behalf of the public.

Agile approaches use an iterative, incremental, and collaborative process to deliver small, frequent software releases. Effective agile methods yield rich information from tight feedback loops, providing customers and delivery teams frequent opportunities to adapt based on changing project conditions. A number of agile methods are in common use at USCIS, including Kanban, Scrum, and Extreme Programming (XP). The values common to agile practices are articulated in the Agile Manifesto, which elevates interaction, working software, customer collaboration, and responding to change. The intent of the agile values is not to prescribe a set of mechanical steps or ceremonies but to guide an empirical, feedback-oriented agile mindset. Teams that follow agile values are likely to benefit from the “guardrails” inherent in the agile approach. Teams are encouraged to use practices from one or more agile methods as appropriate and to incorporate innovations from the agile community.

DevOps approaches rely on seamless collaboration between operations and development engineers to fulfill business needs through the delivery of stable, secure, and reliable services to customers. DevOps methods yield timely feedback at all points in the service lifecycle, improving the ability to reliably deploy software, respond to feedback from production operations, and continuously improve quality. A number of DevOps strategies are commonly used at USCIS, including Continuous Integration, Continuous Deployment, and Continuous Operations.

B. Required Outcomes

USCIS develops IT solutions to support the mission of the agency. In order to achieve the desired impact, we require certain outcomes from software development, deployment, and operations processes. Where required outcomes are difficult to measure directly, measurement and observation can be used to infer them. These observations are guided by asking key questions to assess whether the desired outcome is being achieved.

The key questions presented for each outcome in this MI are not a definitive list. Programs should determine effective outcome measurements in their own context and track the trends of those measurements. Programs are also expected to change the questions and measurements over time to ensure they are checking the most important concerns. In addition to self-assessment, programs should coordinate with USCIS Independent Verification and Validation (IV&V) to assess effectiveness, facilitate transparency and accountability, and provide feedback to teams and management from an independent viewpoint.

Outcome #1: Programs and projects frequently deliver valuable product

Earlier delivery allows earlier accrual of value. Earlier use provides feedback on suitability.

Key Questions

• How frequently is the working system delivered to stakeholders for review?

• How frequently is the working functionality delivered to end users for use?

• What is the cycle time (mean, distribution) from start of work on a feature to delivery?

• What is the lead time from ideation/approval to use?

• How do you verify that the systems you’re developing are solving the intended problem?

• How quickly do you know?

Outcome #2: Value is continuously discovered and aligned to mission

Teams and their business partners continuously discover emerging needs for their products. Delivered capability can and should trigger new discoveries.

Key Questions

• What business outcomes or strategic objectives are supported by the work being done?

• How do you know that you’re working on today’s highest priority items?

• What is customer (stakeholder, user) satisfaction with delivered functionality?

• What actionable insights from end users are addressed over time?

• What is team satisfaction with business engagement and direction?

• How can the value delivered be measured (understanding that sometimes a quantitative measure is not appropriate or feasible)?

Outcome #3: Work flows in small batches and is validated

Small-batch deployments significantly reduce the risks associated with deployment. Low-risk deployments promote the flow of new capabilities to production.

Key Questions

• How is daily progress toward goals made visible? Is it a reliable progress indicator or does it hide surprises?

• Is work in progress finished before new work is started?

• How completely is incremental work validated before it is considered done?

Outcome #4: Quality is built in

Work processes address quality as a matter of course rather than as remediation. Avoiding problems provides more benefit than solving problems.

Key Questions

• What is the demand for remedial work?

• What is the incident rate of escaped defects?

• Are your tests automated and structured to provide the quickest feedback (unit tests)?

• Are you testing at all layers of the application with appropriate investment (test pyramid)?

• How easily do the system architecture and design allow for modification and extension?

• What precautions have been taken to reduce consequences when there is a system failure?

• What measures are in place to monitor the intrinsic quality of the code?

• How frequently do commits fail in the build/test/deploy pipeline?

• Does the system meet appropriate performance thresholds? As the system is modified, what are the trends in performance measurements?

Outcome #5: The organization continuously learns and improves

Improvements come from increased knowledge and skill. Performing the work provides deeper insights into improved methods.

Key Questions

• How freely can teams innovate and improve daily work?

• How inclusive is the collection of improvement ideas?

• How safe is it to try experiments that may not lead to expected results?

Outcome #6: Teams collaborate across groups and roles to improve flow and remove delays

The desired result is more than the sum of individual roles. Overlap is needed to prevent gaps between business and technical roles. Handoffs result in lost information and delays. Much important knowledge is tacit, and can best be shared by working together.

Key Questions

• How much code is reused across teams?

• What diverse roles explore the details of the requirements and what are the indicators of satisfaction of the requirements?

• What indications are there of responsibilities being shared across groups? How much time is spent waiting for another team’s work?

Outcome #7: Security, accessibility and other compliance constraints are embedded and verifiable

Systems must not only work correctly for their intended use but also resist unintended abuse. In addition, there are mandates in law and executive direction that must be followed. Notable among these are disability accessibility and privacy protection. There are also constraints about the language used to communicate with the public.

Key Questions

• How are security, accessibility, and organizational constraints communicated throughout the project community? What indications are there that these constraints are well understood?

• Are security, accessibility and privacy requirements treated the same as functional requirements? How are they addressed in the requirements process? Are they prioritized as highly?

• How are security, accessibility and privacy addressed in system design and code structure?

• To what extent are security testing and controls integrated into daily work? How much is automated? How early does it detect issues?

• How is compliance with all security requirements verified in an ongoing manner and documented with auditable evidence?

• How is Section 508 compliance verified as the system is developed, and documented with auditable evidence?

• What controls are in place to notice undesirable changes or other actions made by an individual? How do we confirm and provide evidence that controls are operating effectively?

Outcome #8: Consistent and repeatable processes are used across build, deploy, and test

Consistency is required to maintain quality across delivery. Teams who have an understood and repeatable process can gauge the efficacy of the improvements made.

Key Questions

• How many manual steps are there in the current build, deploy, and test process, and what is the team doing to reduce that number?

• Is there a common code repository/branch that is built, tested, and deployed on every commit?

• How long does code exist on other branches or the developer’s machine before merging to the common code?

• What degree of confidence does the suite of automated tests provide?

– How quickly and easily can build failures be resolved?

Outcome #9: The entire system is deployable at any time to any environment

Unfinished work in progress provides no benefit and may block the efforts of others. The system should be maintained in a working state even as modifications are being made.

Key Questions

• Can the same automated script deploy to every environment?

• Are database changes and rollbacks automated with version-controlled scripts?

• To what extent is the setup and configuration of environments automated with version-controlled scripts?

• To what extent is the build/deployment pipeline automated with version-controlled scripts?

• How long does it take to stand up a complete test environment with production or production-like data?

Outcome #10: The system has high reliability, availability, and serviceability

Attention must be focused on the robustness of the system in the face of errors, the ability to be used as development proceeds, and the ability to quickly detect and correct latent problems.

Key Questions

• Can various parts of the system be built and deployed independently?

• To what degree is the system meeting the reliability, availability, and serviceability needs of the mission?

• How long does it take to detect, ameliorate, and correct operational problems?

• Are the operational characteristics of the system being validated in production through monitoring, reporting, and alerting? How?

• Is the system designed in such a way as to be cost-effective in operation?

V. Generally Accepted Agency Practices

Each program or project chooses a baseline set of practices that support the Lean-Agile-DevOps outcomes listed in this document. The chosen practices should be documented in the Team Process Agreement (TPA) and improved over time. The program or project may solicit an independent assessment of its practices following the USCIS IV&V Policy. Material improvements should be documented by updating the TPA before the next Release Planning Review (RPR), and the program or project should be prepared to justify its TPA practices to the RPR Authority (USCIS CIO or designee) if questioned about them in the RPR.

MI CIS-OIT-003 Appendix A lists typical agency practices, derived from Agile and DevOps methods commonly followed in the software development industry. Nothing in this document should be construed as prohibiting even better practices; rather, the intent is to guard against insufficient discipline or governance. Practices may be reviewed by the CIO or designee at any time, particularly in the RPR or on the advice of Quality Assurance, and the program or project should be able to justify the chosen practices.

VI. Governance

The purpose of this governance is to ensure the government’s interest in delivering appropriate IT solutions on behalf of the public. Governance responsibilities include:

• Ensuring changes to IT systems pose appropriately low risk to mission fulfillment

• Portfolio management and alignment with the overall strategic direction of mission

• Providing transparency to project stakeholders and opportunities for involvement by those impacted by system changes, including end users

• Verifying projects are carried out with appropriate procurement, contracting, and hiring practices in order to meet fiduciary constraints

• Continuously improving governance and oversight mechanisms to ensure that they accomplish project goals in a lean manner

This MI represents a tailoring of the SELC included in the annexes to DHS acquisition guidance presented in DHS MD 102. Appendix B provides the tailoring plan that demonstrates this alignment with MD 102. By following this MI, USCIS Lean-Agile-DevOps projects will maintain compliance with MD 102. Programs on the DHS Major Acquisitions Oversight List will, in addition, need to fulfill the requirements of the MD 102 Acquisition Lifecycle Framework (ALF).

A separate MI, CIS-OIT-004, describes the USCIS Agile Independent Verification and Validation (IV&V) approach that will be used to evaluate adherence to this MI and will inform governance activities.

During the “Obtain” phase of the program acquisition lifecycle, system development activities will proceed through a number of increments, or release cycles. For each increment, the following gate reviews will be held:

Lean Release Planning Review (Lean RPR)

Lean Release Planning is the means by which USCIS agile projects establish time, cost, and a notional plan for delivering new capabilities. Lean Release Planning artifacts should include minimum documentation necessary to effectively communicate release plans and should be published in a location accessible to all stakeholders. Once a minimum set of artifacts is established at the outset of a project, artifacts should be updated incrementally throughout the release cycle to reflect current reality.

The RPR Meeting is a gathering of stakeholders to review release plans and align resources to support them. The RPR Authority (USCIS CIO or designee) will assess a project’s readiness to proceed with a time-boxed release cycle of no more than six months. A business decision will be made as to whether the investment in the release cycle is justified by the expected results (capabilities to be produced). The project may not proceed to release activities (development, testing, etc.) until it has secured RPR approval.

The RPR Authority will assess the project’s likelihood of achieving the outcomes required by this MI (sections A and B). The RPR Authority will review appropriateness of resourcing and skill levels, agile team processes, technical practices, the team’s understanding of capabilities to be developed, oversight and transparency mechanisms, and dependencies on other projects and infrastructure. The project will demonstrate its readiness through thoughtful discussion with the RPR Authority, by providing evidence that stakeholders and delivery team members concur with release plans, and by producing a set of Lean RPR artifacts.

Core RPR Artifacts:

• Capabilities and Constraints (CAC)

• Project Oversight Plan (POP)

• Test Plan

• Team Process Agreement (TPA)

• Release Characteristics List

Other artifacts, such as Section 508 Compliance Determination Forms (CDF), may be required depending on the specific project, which will be established by agreement with the RPR Authority.

Team-Managed Deployment (TMD) Onboarding

TMD Onboarding is an IV&V process to validate a system’s capability to operate with high reliability, availability, and serviceability using robust automated build, test, and deployment practices. Systems should be onboarded to TMD when they fully satisfy outcomes 7, 8, 9, and 10 in this MI. Following TMD onboarding, RPR approval constitutes authorization for teams to deploy directly to production for up to six months. To minimize risk, teams are encouraged to deploy as often as multiple times per day, and must deploy to production at least every two weeks.

TMD requires ongoing communication and collaboration of development engineers, operations engineers, and business stakeholders. The following team agreements are required to facilitate effective teaming across the project community.

1. Product Owner Acceptance – The Product Owner retains full authority and responsibility for approving features deployed both through feature toggles and by direct code push to production. Teams are strongly encouraged to make this Product Owner approval a step in the continuous delivery pipeline.

2. Communications Agreements – Teams make agreements with key stakeholders regarding notifications before, during, and after deployment. Stakeholders include the user community, operations support engineers, help desk personnel, the Information System Security Officer (ISSO), Quality Assurance, and other impacted groups. Teams are encouraged to provide notifications via an Operations Monitoring Dashboard.

3. Monitoring – Teams prepare an Operations Monitoring Plan or Dashboard showing the practices, tools, and measures that will monitor applications in production. The plan will include an operations review schedule and escalation procedure when monitoring thresholds are breached. In lieu of a document, an Operations Monitoring Dashboard is the preferred long-term approach.

4. Documentation – Teams regularly and appropriately update the document set in accordance with their Project Oversight Plan (POP). Artifacts requiring regular updates may include a Pipeline Design Document, System Design Document or Wiki (SDD/W), Interface Control Agreements (ICAs), and Section 508 and Security Documentation. Teams are encouraged to use agile documentation approaches such as self-documenting code and tests expressed in a business-friendly language. Agreements regarding such approaches should be noted in the POP.

5. Periodic Audits – Teams make agreements for periodic audits of Section 508 compliance, security compliance, and other auditing oversight deemed necessary during the RPR.

USCIS OIT Applied Technology Division (ATD) will support the team in this effort by providing an independent assessment of pipeline suitability for TMD Onboarding. TMD is encouraged for all USCIS teams but granted on a contingent basis, provided the system remains in compliance.

Release Readiness Review for TMD Systems (TMD-RRR)

RRRs for TMD systems will be held periodically to approve the release of major new functionality to users through a deployment or feature toggle. The criteria and/or schedule for holding TMD-RRRs for a particular system will be determined according to risk, using the risk model described in the USCIS IV&V Policy, and will be documented in the system’s TPA. RRRs may also be held on demand by the CIO and on the advice of Quality Assurance, based on risk.

Legacy Release Readiness Review (RRR) and Electronic Release Readiness Review (eRRR)

Systems without TMD approval must hold a Release Readiness Review prior to each production deployment unless a waiver is granted by the Delivery Assurance Branch. An RRR may be conducted as a meeting or, per agreement with the RRR Authority, as a sequence of electronic approvals. In order to assess whether the current increment is ready to be deployed, the RRR Authority will assess whether the deployment was adequately tested, reviewed by the product owner and users, and is compliant with enterprise architecture, coding standards, Section 508, and security requirements. The RRR Authority also verifies that release activities were coordinated with business stakeholders and that the business is prepared for the impact of the release. Finally, the RRR Authority assesses the deployment and rollback plans and ensures the deployment package is ready to be submitted to applicable change control boards (CCBs). If the RRR Authority approves the release, it is then submitted to Change Control and deployed.

Core Deployment Artifacts (TMD-RRR, RRR, and eRRR):

• System Design Document or Wiki (SDD/W)

• Automated and Manual Build and Installation Scripts

• Automated and Manual Test Scripts

• Automated and Manual Deployment Scripts

• ICCB or Change Control Board Package

• Security Plan (SP)

• Security Assessment Report (SAR)

Other artifacts, such as Section 508 Compliance Determination Forms (CDF), may be required depending on the specific project, to be established by agreement with the RRR Authority.

Post Implementation Review (PIR) / Release Cycle End

During the PIR, the PIR Authority (USCIS CIO or designee) and the team will analyze the project’s successes and failures during the release cycle to identify improvements for the next release cycle. The review will include the Product Owner’s assessment of the business value generated during the release cycle, software quality measurements, Section 508 compliance, security compliance, and Plan of Action and Milestones (POA&M) resolution. The PIR also constitutes the formal end of a release for Internal Use Software (IUS) purposes. The primary focus of the PIR, though, is to celebrate the value that was delivered and identify continuous improvement opportunities. Teams should work with USCIS IV&V teams to provide an independent assessment of key measurements and outcomes of the release. Teams are encouraged to hold the PIR in conjunction with the RPR meeting for the next release cycle.

Additional Procedures

Lean-Agile-DevOps projects must conform to the USCIS IV&V policy in order to:

• Provide transparency and accountability to the public

• Inform management and oversight bodies with an independent view of what is working or not working in program execution, based on data and analysis

• Provide feedback to program executors to help them improve their processes

VII. Questions, Comments, and Suggestions

Please address any questions, comments, or suggestions to: USCIS-QA-TEAM@uscis.dhs.gov

VIII. Approval

Signed:__________________________ Date:__________________________

OFFICE OF INFORMATION TECHNOLOGY

Addendum to MI CIS-OIT-003 Management Instruction for Applying Lean-Agile-DevOps Principles

Effective Date: 1 April 2017

Management Instruction CIS-OIT-003 Appendix

Appendix A. Generally Accepted Agency Practices

In pursuit of the Required Outcomes called out in CIS OIT Management Instruction CIS-OIT-003, the following generally accepted practices may be used to achieve successful software development. The Key Questions for each Required Outcome help identify and measure areas of improvement. Table 1 depicts the outcomes supported by each of the practices.

Teams planning to practice Team-Managed Deployment (TMD) are expected to have a higher level of Agile discipline than these minimal guidelines require.

The team-chosen enabling practices commonly include the following:

Delivery Cadence

The development teams deliver incremental improvements to the agency on a regular and frequent basis.

These deliveries of working functionality allow the work completed so far to be experienced, providing an unambiguous indicator of progress and a potential discovery of previously unknown needs.

• Deliver to production no less frequently than quarterly

• For TMD, deliver to production no less frequently than the development cadence, and potentially multiple times per day

Delivery Environment

Development deliveries must provide value to the agency, either by allowing stakeholders to experience the current system capabilities and limitations, or by providing functionality to end users for actual use.

• At minimum, to an internal environment where stakeholders can examine and evaluate the system

• Customarily to an environment that mimics production

• For TMD, to production use

Iterative, Incremental Development

Development should proceed in small slices of functionality. As development proceeds, existing functionality should be revisited to add new behavior or modify existing behavior (iterative). New or modified functionality should extend existing working functionality, leaving the whole in a working state (incremental).

• Development cadence of no longer than 4 weeks

• For TMD, development cadence no longer than 2 weeks

• Short enough to effectively steer the project

• Small increments of functionality are validated as they are developed

• Projects shall use time boxes or limited work in progress (WIP) policies to enforce short cadences for planning, completing, demonstrating, and deploying working tested features

• For TMD, accumulated functionality is continually validated, mostly with automated checks, to enable development flow without regressions

Embedded Product Ownership

The direction of development, what functionality should be developed and in what order, should be embedded with the development team, authorized and available to make decisions as needed without delay.

• Agency needs are represented to the development team by a single, clear voice dedicated to the development effort

• Product Owner has full authority to make timely decisions regarding development, prioritization, and acceptance of development

• Product Owner has full authority to make decisions about when functionality is deployed either by turning on a feature toggle or by direct code push to production

• Close collaboration between the dedicated representative and actual stakeholders and users

• Teams are encouraged to include Product Owner approval as part of the continuous delivery pipeline

• For TMD, frequent feedback of the developing system from the actual stakeholders and users, informing future development, priorities, and fitness for purpose

Representation of Requirements

The documentation of requirements for development should be tuned to the needs of the development process and regarded as ephemeral. Any need to document beyond the development process should be regarded as separate and designed to meet that need.

• Explicit conditions of satisfaction that may be validated

• Acceptance criteria describing the intent

• Acceptance scenarios illustrating essential cases

• Use of low-overhead, low-fidelity assets such as user stories, augmented as needed with elaborations such as paper prototypes or sample reports to convey the essential behavior

• Independent pieces capable of being sequenced in almost any order

• Small enough to easily fit within the delivery cadence
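As an illustration of the bullets above, the sketch below expresses a hypothetical user story’s explicit conditions of satisfaction as executable acceptance tests in Python (using pytest). The story, the case_status() helper, and its statuses are invented for illustration; they are not drawn from this MI or any USCIS system.

import pytest

# Hypothetical system under test: look up the public-facing status of a case.
def case_status(receipt_number: str) -> str:
    statuses = {"ABC1234567890": "Case Was Received"}
    if receipt_number not in statuses:
        raise KeyError(f"unknown receipt number: {receipt_number}")
    return statuses[receipt_number]

# Story: "As an applicant, I can check my case status by receipt number."
def test_known_receipt_number_returns_current_status():
    # Condition of satisfaction: a valid receipt number yields its current status.
    assert case_status("ABC1234567890") == "Case Was Received"

def test_unknown_receipt_number_is_rejected():
    # Acceptance scenario for an essential edge case: invalid input fails loudly.
    with pytest.raises(KeyError):
        case_status("XXX0000000000")

Because the conditions of satisfaction are executable, “done” is unambiguous and the tests double as low-overhead documentation of the requirement.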

Automated Testing

In keeping with “test early, test often” principles, test criteria defined early in the life of a user story drive the creation of automated test code that is stored in the version control system along with all other code. Automated tests should cover appropriate concerns such as unit behavior, functionality, and system-to-system interfaces. While complete automated testing is desired, security and Section 508 accessibility testing will be automated when supporting tools are available. Risk-based approaches should be used to determine which automated tests are included in regression suites.

• The explicit conditions of satisfaction determined in the requirements are automated as acceptance tests

• Functionality is typically tested over the smallest scope possible, and includes edge conditions

– The Agile Testing Pyramid may be used to visualize this

• For TMD, high level of reliable automated testing

• For TMD, performance measures are tracked over time by the development teams
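One way to apply the risk-based selection described above is to tag tests by suite membership. The sketch below uses pytest markers; the marker names and tests are hypothetical illustrations, not a USCIS standard, and the markers would be registered in pytest.ini to avoid warnings.

import pytest

@pytest.mark.smoke
def test_service_health_endpoint_shape():
    # Cheap, high-value check suitable for every pipeline run.
    response = {"status": "ok"}  # stand-in for a real HTTP call
    assert response["status"] == "ok"

@pytest.mark.regression
def test_full_workflow_end_to_end():
    # Slower, broader check reserved for the pre-release regression suite.
    steps_completed = ["submit", "review", "approve"]  # stand-in for real steps
    assert steps_completed[-1] == "approve"

# Run only the fast suite on every commit:    pytest -m smoke
# Run the broader regression suite on demand: pytest -m regression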

Fail-Safe

Attention should be paid to execution that may not proceed as desired and to the consequences this will have. Negative consequences should be minimized and recovery procedures should be considered.

• System design shall anticipate environmental and implementation failures and mitigate the consequences

– This may be monitored by the consequences and time-to-fix for production incidents

• For TMD, tests are treated as being as valuable as the code, and gaps or failures are treated as first-class issues

• For TMD, perform as many infrastructure tasks as possible programmatically

• For TMD, it should be feasible to revert to the previous version, including database schema or data changes and environment

In close coordination with operations engineers, development teams implement and test methods to monitor, minimize, and correct unanticipated issues associated with deployments. Preferably, these methods are automated. These methods may include blue/green deployments, feature toggles, rollback scripts, and “fail forward” approaches that enable rapid replacement of faulty elements. Recovery methods should be executed quickly to minimize impact to data, system performance, and other critical aspects of production applications.
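The sketch below illustrates one such mitigation, a feature toggle with automatic fallback, in Python. The toggle store, function names, and behavior are assumptions made for illustration only.

import logging

logger = logging.getLogger(__name__)
FEATURE_TOGGLES = {"new_address_validator": True}  # hypothetical toggle store

def validate_address_legacy(address: str) -> bool:
    return bool(address.strip())  # proven, conservative behavior

def validate_address_new(address: str) -> bool:
    raise NotImplementedError("new validator not yet hardened")  # placeholder

def validate_address(address: str) -> bool:
    if FEATURE_TOGGLES.get("new_address_validator"):
        try:
            return validate_address_new(address)
        except Exception:
            # Fail safe: record the failure for the operations review, then
            # fall back so users still get the proven behavior.
            logger.exception("new validator failed; reverting to legacy path")
    return validate_address_legacy(address)

print(validate_address("20 Massachusetts Ave NW"))  # True, via the fallback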

Extrinsic Quality

Care should be taken to keep the external quality of the system, as seen by the users, sufficiently high at all times to ensure correct operation and ease of use.

• Every feature is specified with one or more essential tests representing the intended functionality

• External expertise is engaged for extended verifications (e.g., security, accessibility) on a regular basis

• Testing activities happen within development cadence

• Testing capabilities are embedded within the teams

Intrinsic Quality

Care should be taken to keep the internal quality of the system, as seen by the developers, sufficiently high at all times to promote ease of development, understanding, and modification.

• System implementation shall not impede the addition or modification of functionality

– This may be measured in arrears by counting the number of modules that must be modified for a change

• As units of code are created, they are simultaneously tested for proper operation, resilience to unexpected inputs, and boundary conditions

• Unit tests document the code behavior intended by the programmer and verify that the code exhibits this behavior
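A brief sketch of the last two points: unit tests that document intended behavior, including boundary conditions and resilience to unexpected inputs. The paginate() helper is hypothetical.

import pytest

def paginate(items: list, page_size: int) -> list:
    """Split items into pages of at most page_size elements."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

def test_splits_items_into_full_and_partial_pages():
    assert paginate([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]

def test_empty_input_yields_no_pages():
    # Boundary condition: nothing to page.
    assert paginate([], 3) == []

def test_rejects_nonpositive_page_size():
    # Resilience to unexpected input: fail loudly rather than return junk.
    with pytest.raises(ValueError):
        paginate([1, 2], 0)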

Emergent Design

The design of the system should be envisioned and realized over time. The more we work on the system, the better we understand the needs of the mission and the needs of the implementation context. At any given time, the system design must support current functionality without being overdesigned to support future functionality.

Anticipated future needs should be implemented only when they are actually needed, but care should be taken to keep such needs in mind so that they may be feasibly implemented when the time comes. Such an approach not only maximizes the realized benefit of the design, but leaves the agency best prepared for future changes in mission needs.

• Avoid building unused “hooks” anticipating future needs

• Keep the design simple so that future needs are easy to accommodate as they arise

Refactoring

As the needs of the design shift, it’s often necessary to modify existing code without changing its functionality.

This process is called Refactoring, a term attributed to William Opdyke and Ralph Johnson after their September 1990 article on the topic. The best-known reference to this technique is Martin Fowler’s book, Refactoring: Improving the Design of Existing Code.

By reshaping code without changing its functionality, we can correct deficiencies we discover in our design or make the design amenable to new demands we place on it.

• Separate changes in code structure from changes in code functionality

• Use well-known refactoring techniques that are known to preserve functionality

• Make restructuring changes as a series of small changes, keeping the code functional at each step
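The sketch below shows one such small, behavior-preserving step, an “extract function” refactoring, in Python; the report-formatting code is invented for illustration.

# Before: formatting logic is buried inline and hard to reuse.
def summary_line_before(name: str, count: int) -> str:
    return name.strip().title() + ": " + str(count) + " case(s)"

# Step 1: extract the formatting into a named, independently testable function.
def format_label(name: str) -> str:
    return name.strip().title()

# Step 2: re-express the original in terms of the extracted function.
# Functionality is unchanged, which the existing tests should confirm.
def summary_line(name: str, count: int) -> str:
    return f"{format_label(name)}: {count} case(s)"

assert summary_line(" pending review ", 3) == summary_line_before(" pending review ", 3)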

Intentional Architecture

Make design decisions intentionally, rather than through expedience. Anticipate technical risks and design the architecture to meet them as they are addressed.

• Keep an eye on the long-term goals and technical issues as the design emerges

• Communicate the issues and current thinking on architectural approaches to all members of the development team

• Listen to any questions or objections concerning the suitability of the design

Managing Technical Debt

Ward Cunningham invented the term Technical Debt to describe the difference between how we currently understand the problem domain and how it is represented in our code. This difference naturally creates difficulties as we expand our coverage of the problem domain.

For example, we might model a domain construct as a hierarchical tree, but later find that some nodes are referred to by more than one node. Our tree implementation cannot model that. Refactoring the code to a directed acyclic graph implementation will model our current understanding of the problem domain more directly, as the sketch below illustrates.
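A minimal Python sketch of that tree-to-DAG refactoring follows; the node names are invented for illustration.

class Node:
    def __init__(self, name: str):
        self.name = name
        self.children: list["Node"] = []

    def add_child(self, child: "Node") -> None:
        self.children.append(child)

shared = Node("identity verification")  # a step needed by two workflows
naturalization = Node("naturalization")
green_card = Node("green card renewal")

# Under a strict tree, `shared` could hang beneath only one parent, forcing a
# duplicate copy elsewhere. As a directed acyclic graph, both parents simply
# refer to the same node, matching our improved understanding of the domain.
naturalization.add_child(shared)
green_card.add_child(shared)

assert naturalization.children[0] is green_card.children[0]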

Since then, others have come to use the phrase Technical Debt for any perceived deficiency in the code design.

Whether using a strict or loose definition, it’s important to manage these deficiencies so that the code does not become difficult to maintain and extend.

• Keep duplication of functionality at a minimum

• Write code that expresses the concepts in the mind of the programmer, such that they are obvious to the next person to touch the code

• In object-oriented code, follow Robert C. “Uncle Bob” Martin’s SOLID design principles

• In all code, maximize cohesiveness of any module or grouping, and minimize its coupling to other modules or groupings

Version Control

Teams frequently commit working code to a USCIS-owned repository using an automated mechanism. In this context, code implies all system source code, configuration files, automated test scripts, build scripts, deployment scripts, or other computer files needed to build the system or the supporting Continuous Delivery pipeline.

• All code is version-controlled

• Developers and teams should integrate their code frequently

– Code is merged into the common branch more frequently than the development cadence

– Code from the common branch is frequently merged into each developer’s working copy, preferably multiple times a day

• Minimal time between introducing a change and other teams accounting for that change

– For TMD, multiple times a day

• Documented procedures for build, test and deploy are version-controlled with the code

• It shall be feasible to retrieve and build any previous version, preferably by name, date or tag

• It shall be feasible to see the history of changes and to compare any two versions

Scripted Builds

Build processes should be well-defined so that they are repeatable as a matter of course.

• Build processes are scripted to allow anyone to build any portion of the system in a repeatable manner

• Build scripts should be version-controlled with the code

• Build scripts should contain segregated build steps for compiling, unit testing, producing deployable artifacts, and other desirable units of work
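A minimal sketch of such a build script, with segregated steps, might look like the following Python program. The step commands are placeholders; a real project would substitute its own compilers, test runners, and packaging tools.

import subprocess
import sys

def run(step_name: str, command: list[str]) -> None:
    print(f"== {step_name} ==")
    if subprocess.run(command).returncode != 0:
        sys.exit(f"build failed during: {step_name}")  # fail fast, name the step

STEPS = {
    "compile":   ["python", "-m", "compileall", "src"],
    "unit-test": ["python", "-m", "pytest", "tests/unit"],
    "package":   ["python", "-m", "build"],  # assumes the 'build' package is installed
}

if __name__ == "__main__":
    # Anyone can build any portion of the system: `python build.py unit-test`
    requested = sys.argv[1:] or list(STEPS)
    for name in requested:
        run(name, STEPS[name])

Because the script is version-controlled with the code, the build is repeatable by any team member or build server, not just the person who last ran it.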

Automated Builds

To the extent feasible, build processes should not rely on manual intervention for execution. Human intervention should be reserved for decision making.

• Builds are performed via automated, script-driven retrieval of source code from a repository monitored by a dedicated build server

• Builds should run on code check-in, on a set frequency, on demand, or any combination of these

• Builds should run a sequence of build scripts to compile, unit test, and produce deployable artifacts. Builds may automatically deploy artifacts and test them in situ

• Builds should complete within a short duration

• The build server should produce appropriate build notifications and always present build status
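As a rough illustration of this behavior (not a prescription for any particular build server), the sketch below polls a repository, runs the scripted build on each new commit, and keeps the latest status presentable at all times. The latest_commit() and run_build() hooks are placeholders for real tooling.

import time

build_status = {"commit": None, "result": "unknown"}  # always-presentable status

def latest_commit() -> str:
    ...  # placeholder: query the version control repository

def run_build(commit: str) -> str:
    ...  # placeholder: invoke the scripted build; return "passed" or "failed"

def poll_forever(interval_seconds: int = 60) -> None:
    while True:
        commit = latest_commit()
        if commit and commit != build_status["commit"]:
            build_status.update(commit=commit, result=run_build(commit))
            print(f"build {build_status['result']} for {commit}")  # notification hook
        time.sleep(interval_seconds)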

Scripted Deployment

Deployment procedures should be well-defined so that they are repeatable as a matter of course.

• The same documented procedures are used to deploy to any environment, including Production

• For TMD, these procedures should be scripted and version-controlled

Deployment with minimal downtime for users

• Small releases

• Decoupled services

• Zero downtime (e.g., blue-green) deployments (see the sketch after this list)

• Database migration scripts

• Forward and backward compatibility of components

• Backout capability
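The sketch below illustrates the zero-downtime item from this list: deploy the new version to the idle environment, smoke-test it, and only then switch traffic. The environment colors and helper functions are illustrative placeholders.

ACTIVE = {"color": "blue"}  # stand-in for real router or load-balancer state

def deploy_to(color: str, version: str) -> None:
    print(f"deploying {version} to the {color} environment")  # placeholder

def smoke_test_passes(color: str) -> bool:
    return True  # placeholder for automated smoke tests

def point_router_at(color: str) -> None:
    ACTIVE["color"] = color
    print(f"router now serving {color}")

def blue_green_release(version: str) -> None:
    idle = "green" if ACTIVE["color"] == "blue" else "blue"
    deploy_to(idle, version)
    if smoke_test_passes(idle):
        point_router_at(idle)  # cut over with zero user-visible downtime
    else:
        print("smoke tests failed; the previous version keeps serving traffic")

blue_green_release("1.4.2")

Because the previously active environment is left intact, backout amounts to pointing the router back at it.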

Automated Deployment

To the extent possible, deployment scripts should not rely on manual intervention for execution.

• Teams maintain an automated process (or set of processes) that executes a list of deployment steps via script or via a deployment tool

• Automated deployments should be rapid, reliable, testable, and repeatable

• Steps include running acceptance tests, pushing code to downstream environments, and automated smoke testing

• Communication artifacts, such as tickets and release notes, should also be automatically generated

• Deployment configuration scripts should be stored as code and placed under configuration management
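Pulled together, an automated deployment of this kind can be sketched as a scripted list of steps. Everything below, step functions included, is a placeholder illustration rather than a prescribed USCIS pipeline.

from datetime import date

def run_acceptance_tests() -> None:
    print("acceptance tests passed")  # placeholder

def push_to_environment(env: str) -> None:
    print(f"artifacts pushed to {env}")  # placeholder

def smoke_test(env: str) -> None:
    print(f"smoke tests passed in {env}")  # placeholder

def generate_release_notes(env: str, version: str) -> str:
    # Communication artifacts generated automatically, as encouraged above.
    return f"{date.today()}: version {version} deployed to {env}"

def deploy(env: str, version: str) -> None:
    run_acceptance_tests()
    push_to_environment(env)
    smoke_test(env)
    print(generate_release_notes(env, version))

deploy("staging", "2.0.1")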

Approved Pipeline

The components and procedures to build, test and deploy software should be reliable and trusted.

• Teams implement a Continuous Delivery pipeline approved by USCIS.

– Pipeline components may already be in use at USCIS or, per agreement, may be emerging tools in the market that are new to USCIS

• For TMD, procedures for build, test and deploy are automated

System Monitoring

Systems should be monitored in production to detect problems in a timely fashion for quick action, and to provide the business with information about normal use.

• Teams shall have procedures and tools in place to monitor the performance and health of the system in production

• Key elements should be displayed in a dashboard viewable at any time

• Automated systems may monitor that operations are within defined thresholds

– Appropriate personnel should be alerted when thresholds are breached

– Incident management and escalation procedures should be defined
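A minimal monitoring sketch consistent with these points: observed measures are compared against defined thresholds, and an alert fires when one is breached. The metric names, thresholds, and alert mechanism are illustrative assumptions.

THRESHOLDS = {"error_rate_pct": 1.0, "p95_latency_ms": 500.0}  # defined thresholds

def alert(metric: str, value: float, limit: float) -> None:
    # Placeholder: in practice this would notify the appropriate personnel per
    # the escalation procedure in the Operations Monitoring Plan.
    print(f"ALERT: {metric}={value} breaches threshold {limit}")

def check(observations: dict[str, float]) -> None:
    for metric, limit in THRESHOLDS.items():
        value = observations.get(metric)
        if value is not None and value > limit:
            alert(metric, value, limit)

check({"error_rate_pct": 2.3, "p95_latency_ms": 410.0})  # alerts on error rate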

Release Planning

Planning for future releases should provide guidance to external stakeholders while providing flexibility for the appropriate definition and delivery of system details.

• Adaptive Rolling Wave Planning to maintain a clear vision of immediate capability delivery in the context of a longer range view

Visibility of progress

The progress being made toward program goals should be easily visible and reflect the current reality.

• Visibility into each team’s progress toward program goals (e.g., burn-up charts)

• Practices in place for communication and collaboration across teams (e.g., Scrum of Scrums, Portfolio Alignment Wall)

Peer Reviews

Avoid single points of failure by collaborating with others, filling in knowledge gaps, catching oversights, and considering a diverse set of options.

• Peers should review each other’s code, tests, and other development artifacts

• Reviews should attempt to identify system risks not caught by tests and automated analysis

• Reviews should share information and development styles across the development team

Integrated Experimentation & Learning

In recognition that the beginning of a program is the point in time at which the least is known about it, experimentation and learning should be conducted to maximize improvement as development proceeds.

• Team retrospectives at development cadence with tangible results

• Periodic program retrospectives over longer intervals and with broader participation

• Capacity is allocated for experiments and improvements as a normal part of development

Culture of learning

In recognition that the majority of the time and effort in a program is spent learning what and how to do things, institutionalize learning as a major part of the program execution.

• Outside the development process, institute periodic sharing of technical and process learning

– Communities of Practice, Guilds, Brown Bags

Deployment History and Consistency

Place the highest value on meeting needs through the life of the program, demonstrating trustworthiness.

• Teams should demonstrate a record of successful deployments

– Success measures include avoidance of emergency conditions and post-release issues

– Should a problem occur as an aberration, teams should demonstrate an ability to eliminate the root cause, ensuring a one-time issue does not become a pattern of dysfunction