CHAPTER 14

Application Development and Deployment

In this chapter, you will

•  Examine secure development lifecycle models

•  Explore secure coding concepts

•  Learn to summarize secure application development and deployment concepts

Secure application development and deployment are foundational elements of cybersecurity. Poor application development practices lead to a greater number and severity of vulnerabilities, increasing the security problems for implementers.

Certification Objective   This chapter covers CompTIA Security+ exam objective 3.6, Summarize secure application development and deployment concepts.

Development Lifecycle Models

The production of software is the result of a process. There are a multitude of tasks, from gathering requirements and planning through design, coding, testing, and support. These tasks are performed by a team of people according to a process model. Several different process models can be employed to facilitate performing the proper development steps in the proper order. Two of these, the waterfall model and the agile model, are discussed in the following sections.

Waterfall vs. Agile

The waterfall model is a development model based on simple manufacturing design. The work process begins with the requirements analysis phase and progresses through a series of four more phases, with each phase being completed before progressing to the next phase—without overlap. This is a linear, sequential process, and the model discourages backing up and repeating earlier stages (after all, you can’t reverse the flow of a waterfall). Depicted in Figure 14-1, this is a simple model where the requirements phase precedes the design phase, which precedes the implementation phase, and so on through verification and maintenance. Should a new requirement “be discovered” after the requirements phase has ended, for example, it can be added, but the work does not go back to that phase. This makes the model very nonadaptive and difficult to use unless the developers implement a rigid method to ensure that each phase is truly completed before advancing to the next phase of the work. This can add to development time and cost. For these and other reasons, the waterfall model, although conceptually simple, is considered by most experts as nonworkable in practice.

Figure 14-1   Waterfall development model

The waterfall methodology is particularly poorly suited for complex processes and systems where many of the requirements and design elements will be unclear until later stages of development. It is useful for small, bite-sized pieces, and in this manner is incorporated within other models such as the spiral, incremental, and Agile methods. One of the major weaknesses of the waterfall model is that it is difficult to incorporate customer changes late in the cycle, making the development process inflexible.

The agile model is not a single development methodology, but a whole group of related methods. Designed to increase innovation and efficiency of small programming teams, Agile methods rely on quick turns involving small increases in functionality. The use of repetitive, small development cycles can enable different developer behaviors, which in turn can result in more efficient development. There are many different methods and variations, but some of the major forms of Agile development are Scrum and Extreme Programming (XP). XP is built around the people side of the process, while Scrum is centered on the process perspective. For more information on the foundations of the agile method, see the Agile Manifesto (http://agilemanifesto.org/).

Scrum

The Scrum programming methodology is built around a 30-day release cycle. It is highly dependent upon a prioritized list of high-level requirements, and program changes are managed on a 24-hour and 30-day basis. The concept is to keep the software virtually always ready for release. The master list of all tasks is called the product backlog. The 30-day work list is referred to as a sprint, and the chart that tracks what is accomplished daily is called the burn-down chart.

From a security perspective, nothing in the Scrum model prevents the application of secure programming practices. To include security requirements into the development process, they must appear on the product and sprint backlogs. This can be accomplished during the design phase of the project. As additions to the backlogs can occur at any time, the security team can make a set of commonly used user stories that support required security elements. The second method of incorporating security functionality is through developer training. Developers should be trained on security-related elements of programming, such as validating user input, using only approved libraries, and so forth.

The advantage of Scrum is that it enables quick turns of incremental changes to a software base. This makes change management easier. There are limitations in the amount of planning, but in a mature Agile environment, the security user stories can be already built and understood. The only challenge is ensuring that the security elements on the product and sprint backlogs get processed in a timely manner, but this is a simple management task. Security tasks tend to be less exciting than features, so keeping them in the process stack takes management effort.

XP

Extreme Programming is a structured process that is built around user stories. These stories are used to architect requirements in an iterative process that uses acceptance testing to create incremental advances. The XP model is built around the people side of the software development process and works best in smaller development efforts. Like other Agile methods, the idea is to have many small, incremental changes on a regular time schedule. XP stresses team-level communication, and as such, is highly amenable to the inclusion of security methods, if the development team includes one or more security-conscious developers.

EXAM TIP    Understanding the differences between Agile and waterfall development, including when each is advantageous, is testable.

Secure DevOps

DevOps is a combination of development and operations, and a blending of tasks performed by a company’s application development and systems operations teams. DevOps emphasizes communication and collaboration between product management, software development, and operations professionals in order to facilitate continuous development, continuous integration, continuous delivery, and continuous monitoring processes. DevOps can be considered the anti-waterfall model, for rather than going from phase to phase, in DevOps, as small changes are ready to advance, they advance. This leads to many small, incremental changes, but also equates to less time between updates and less time to fix or change things. Secure DevOps is the addition of security steps to the DevOps process. Just as you can add security steps to the waterfall model, or any other software development model, you can add them to DevOps as well, promoting a secure DevOps outcome.

Security Automation

One of the key elements of DevOps is automation; DevOps relies upon automation for much of its efficiencies. Security automation can bring the same efficiency gains to security that automation brings to DevOps. Automating routine and extensive security processes allows fewer resources to cover more of the environment in a more effective and efficient manner. Automation removes manual labor that is costly to employ, especially the labor of skilled cybersecurity personnel. Rather than replacing personnel with scripts, automation frees them to spend their time on value-added work such as analysis. Take the tasks associated with patching systems: one has to identify which patches belong on which systems, apply the patches, and then periodically verify that the systems are working and remain patched. All of these steps can be highly automated, making a small group capable of patching and monitoring patch levels on a large base of systems.
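
The following is a minimal sketch of the verification step just described: it compares an inventory of installed package versions against required patch levels and flags anything out of date. The host names, package names, and version numbers are hypothetical placeholders; in practice the inventory would be pulled from a configuration management database or queried from the hosts themselves.

```python
# Minimal sketch of automated patch-level verification. The inventory data,
# host names, and required versions are hypothetical placeholders.

REQUIRED_VERSIONS = {"openssl": "3.0.13", "nginx": "1.24.0"}

# In practice this would come from a CMDB or be queried from each host;
# here it is hard-coded for illustration.
inventory = {
    "web01": {"openssl": "3.0.13", "nginx": "1.22.1"},
    "web02": {"openssl": "3.0.11", "nginx": "1.24.0"},
}

def version_tuple(v):
    """Convert a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def find_unpatched(inventory, required):
    """Return (host, package, installed, required) for every out-of-date package."""
    findings = []
    for host, packages in inventory.items():
        for package, required_version in required.items():
            installed = packages.get(package, "0")
            if version_tuple(installed) < version_tuple(required_version):
                findings.append((host, package, installed, required_version))
    return findings

for host, package, installed, required in find_unpatched(inventory, REQUIRED_VERSIONS):
    print(f"{host}: {package} {installed} is below required {required}")
```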

Continuous Integration

Continuous integration is the DevOps manner of continually updating and improving the production code base. By using high levels of automation and safety nets of automated backout routines, continuous integration allows the DevOps team to test and update even very minor changes without a lot of overhead. Instead of running a few large updates, with many integrated and potentially cross-purpose update elements squeezed into a single big package, the DevOps team runs a series of smaller, single-purpose integrations throughout the process. This means that when testing, the team can isolate the changes to a small, manageable number, without the complication of multiple potential interactions. This can make DevOps more secure by reducing interaction errors and other errors that are difficult to detect and time consuming to track down.

Baselining

Baselining is the process of determining a standard set of functionality and performance. This is a metrics-driven process, where later changes can be compared to the baseline to gauge their impact on performance and other variables. If a change affects the baseline elements in a positive fashion, a new baseline can be established. If the new values are of lesser quality, then a decision must be made whether to reject the changes or to accept them and adjust the baseline. It is through baselining that performance and feature creep are countered by the management team. If a new feature negatively impacts performance, then the new feature might be withheld. Baselining is important to DevOps and security in general as it provides a reference point when making changes. Without the reference point, it is hard to show that changes are improvements. Development teams should baseline systems at the time of development and whenever major changes occur.
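
The following sketch shows the core of a baseline comparison: candidate metrics are checked against stored baseline values with a small tolerance, and any regression is flagged for a decision. The metric names, values, and 5 percent tolerance are hypothetical examples, not prescribed figures.

```python
# Minimal sketch of baseline comparison. The metric names, baseline values,
# and tolerance are hypothetical; real baselines come from measured data.

baseline = {"requests_per_sec": 1200.0, "p95_latency_ms": 180.0, "error_rate": 0.002}
candidate = {"requests_per_sec": 1150.0, "p95_latency_ms": 210.0, "error_rate": 0.002}

HIGHER_IS_BETTER = {"requests_per_sec"}   # for the other metrics, lower is better
TOLERANCE = 0.05                          # allow a 5 percent regression before flagging

def regressions(baseline, candidate):
    """Yield metrics where the candidate is worse than the baseline beyond tolerance."""
    for metric, base_value in baseline.items():
        new_value = candidate[metric]
        if metric in HIGHER_IS_BETTER:
            worse = new_value < base_value * (1 - TOLERANCE)
        else:
            worse = new_value > base_value * (1 + TOLERANCE)
        if worse:
            yield metric, base_value, new_value

for metric, base_value, new_value in regressions(baseline, candidate):
    print(f"Regression in {metric}: baseline {base_value}, candidate {new_value}")
```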

Immutable Systems

An immutable system is a system that, once deployed, is never modified, patched, or upgraded. If a patch or update is required, the system is merely replaced with a new system that is patched and updated. In a typical mutable, or changeable, system that is patched and updated in place, it is extremely difficult to know conclusively whether changes to the system are authorized and whether they have been correctly applied. Linux makes this determination especially difficult. On a Linux system, the binaries and libraries are scattered over many directories: /boot, /bin, /usr/bin, /lib, /usr/lib, /opt/bin, /usr/local/bin, and many more. Configuration files are similarly scattered over /etc, /opt/etc, /usr/local/etc, /usr/lib, and so on. These directories have some files that should never be modified and others that are regularly updated. When the system update services run, they often create temporary files in these directories as well. Consequently, it is very difficult to lock down all these directories and perform authorized system and software updates at the same time. Immutable systems resolve these issues.

Infrastructure as Code

Infrastructure as code is the use of code to build systems, rather than manually configuring them via normal configuration mechanisms. It is a way of using automation to build out systems in a reproducible, efficient manner, and it is a key attribute enabling best practices in DevOps. Developers become more involved in defining system configuration, and the Ops team gets more involved in the actual development process. The objective is to avoid having developers write applications and toss them over a wall to the implementers, the Ops team, and expect them to make the applications work in the environment. As systems have become larger, more complex, and more interrelated, interconnecting developer input and production input has created an environment of infrastructure as code, a version of Infrastructure as a Service.
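
The sketch below illustrates the underlying idea in plain Python: desired state is declared as data that can live in version control, and code computes the changes needed to reach it. Real environments use purpose-built tools such as Terraform or Ansible; the servers and attributes shown are hypothetical.

```python
# Minimal sketch of the infrastructure-as-code idea: desired state is declared
# as data, kept in version control, and applied by code. The resources below
# are hypothetical examples.

desired_state = [
    {"name": "web01", "packages": ["nginx"], "open_ports": [443]},
    {"name": "web02", "packages": ["nginx"], "open_ports": [443]},
]

current_state = {
    "web01": {"packages": ["nginx"], "open_ports": [443, 8080]},
}

def plan(desired_state, current_state):
    """Compute the changes needed to move the current state to the desired state."""
    changes = []
    for server in desired_state:
        actual = current_state.get(server["name"])
        desired_attrs = {k: v for k, v in server.items() if k != "name"}
        if actual is None:
            changes.append(("create", server["name"]))
        elif actual != desired_attrs:
            changes.append(("reconfigure", server["name"]))
    return changes

# Reviewing the plan before applying it mirrors how IaC tools separate
# "plan" from "apply" so changes can be audited like any other code change.
print(plan(desired_state, current_state))
```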

EXAM TIP    Understand how DevOps interacts with and can be supported by a secure development lifecycle. Also understand the major methods, such as immutable systems and continuous integration, and where they can be employed effectively to secure DevOps.

Version Control and Change Management

Programs are developed, released, and used, and then changes are desired, whether to alter functionality, fix errors, or improve performance. This leads to multiple versions of programs. Version control is as simple as tracking which version of a program is being worked on, whether in development, testing, or production. Version control systems tend to use the primary number to indicate major releases and the numbers after the decimal point to indicate minor changes.

Having the availability of multiple versions brings into focus the issue of change management, which addresses how an organization manages which versions are currently being used, and how it coordinates changes as they are released by a manufacturer.

Tracking version numbers and bug fixes, including what is being fixed and the why and how behind the changes, is important documentation for the internal development team. Advanced teams keep track not only of what went wrong, but also of the root cause analysis: how did the defect get past testing? This documentation is important to ensure that later code changes do not reintroduce old vulnerabilities into the code base.

In traditional software publishing, a new version required a new install and fairly significant testing, as the level of change in a new version could be drastic and introduce issues of compatibility, functionality, and even correctness. DevOps turned the tables on this equation by introducing the idea that developers and production work together to create, in essence, a series of micro releases, so that any real problems are associated with single changes and not bogged down by interactions between multiple module changes.

Whether you are operating in the traditional world or in the DevOps world, you need a change management process that ensures that all changes in production are authorized, properly tested, and, in case of failure, rolled back. It should also ensure that accurate documentation is produced and kept up to date.

Provisioning and Deprovisioning

Provisioning is the process of assigning to users permissions or authorities to access objects. Users can be provisioned into groups, enabling them to be managed as a group rather than individually. Computer processes or threads can be provisioned to operate at higher levels of authority when executing, and best practice includes removing higher levels of permission when not needed, to reduce the number of threads at elevated privilege. Deprovisioning is the removal of permissions or authorities. In secure coding, the practice is to provision a thread to an elevated execution permission level (e.g., root) only during the time that the administrative permissions are needed. After those steps have passed, the thread can be deprovisioned to a lower access level. This combination shortens the period of time an application is at an increased level of authority, reducing the risk exposure should the program get hijacked or hacked.
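
The following is a minimal sketch of this pattern on a Unix-like system: the process performs its privileged work and then permanently deprovisions root authority by switching to an unprivileged account. It assumes the process is started as root and that an account named appuser exists; both are assumptions made for illustration.

```python
# Minimal sketch of privilege deprovisioning on a Unix-like system, assuming
# the process is started as root and an unprivileged account "appuser" exists.

import os
import pwd

def do_privileged_setup():
    # Placeholder for work that genuinely needs elevated rights,
    # such as binding a port below 1024 or reading a protected key file.
    pass

def drop_privileges(username="appuser"):
    """Permanently deprovision root authority by switching to an unprivileged user."""
    entry = pwd.getpwnam(username)
    os.setgid(entry.pw_gid)   # drop the group first, while we still have the right to
    os.setuid(entry.pw_uid)   # then drop the user; after this, root cannot be regained

if __name__ == "__main__":
    do_privileged_setup()
    drop_privileges()
    # Everything from here on runs with reduced authority, shrinking the
    # window in which a hijacked process could abuse elevated permissions.
```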

Secure Coding Techniques

Application security begins with code that is secure and free of vulnerabilities. Unfortunately, all code has weaknesses and vulnerabilities, so deploying the code with effective defenses that prevent the exploitation of those vulnerabilities helps maintain a desired level of security. Proper handling of configurations, errors and exceptions, and inputs can assist in the creation of a secure application. Testing of the application throughout the software development lifecycle (SDLC) can be used to determine the actual security risk profile of a system.

There are numerous individual elements in a Software Development Life Cycle Methodology (SDLM) that can assist a team in developing secure code. Correct SDLM processes, such as input validation, proper error and exception handling, and cross-site scripting and cross-site request forgery mitigations, can improve the security of code. Process elements such as security testing, fuzzing, and patch management also help to ensure applications meet a desired risk profile.

There are two main enumerations of common software errors: the Top 25 list maintained by MITRE, and the OWASP Top Ten list for web applications. Depending on the type of application being evaluated, these lists provide a solid starting point for security analysis of known error types. MITRE is the repository of the industry-standard list for standard programs, and OWASP for web applications. As the causes of common errors do not change quickly, these lists are not updated every year.

Proper Error Handling

Every application will encounter errors and exceptions, and these need to be handled in a secure manner. One attack methodology includes forcing errors to move an application from normal operation to exception handling. During an exception, it is common practice to record/report the condition, including supporting information such as the data that resulted in the error. This information can be invaluable in diagnosing the cause of the error condition. The challenge is in where this information is captured. The best method is to capture it in a log file, where it can be secured by an ACL. The worst case is when it is echoed to the user. Echoing error condition details to users can provide valuable information to attackers when they cause errors on purpose.

EXAM TIP    All errors/exceptions should be trapped and handled in the generating routine.

Improper exception handling can lead to a wide range of disclosures. Errors associated with SQL statements can disclose data structures and data elements. Remote procedure call (RPC) errors can give up sensitive information such as filenames, paths, and server names. Programmatic errors can give up line numbers that an exception occurred on, the method that was invoked, and information such as stack elements.
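
The sketch below shows the preferred pattern: full exception details, including the stack trace, are written to a log file that can be protected with an ACL, while the user receives only a generic message. The log file name and the simulated database failure are hypothetical.

```python
# Minimal sketch of secure error handling: full details go to a protected log
# file, while the user sees only a generic message. The log path and the
# database call are hypothetical placeholders.

import logging

logging.basicConfig(filename="app_errors.log", level=logging.ERROR)
logger = logging.getLogger(__name__)

def lookup_order(order_id):
    raise RuntimeError("table ORDERS not found on server db01")  # simulated failure

def handle_request(order_id):
    try:
        return lookup_order(order_id)
    except Exception:
        # Record the stack trace and context where an ACL-protected log can hold it.
        logger.exception("order lookup failed for id=%r", order_id)
        # Echo nothing sensitive back to the requester.
        return {"error": "An internal error occurred. Please try again later."}

print(handle_request("12345"))
```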

Proper Input Validation

With the move to web-based applications, the errors have shifted from buffer overflows to input-handling issues. Users have the ability to manipulate input, so it is up to the developer to handle the input appropriately to prevent malicious entries from having an effect. Buffer overflows could be considered a class of improper input, but newer attacks include canonicalization attacks and arithmetic attacks. Probably the most important defensive mechanism that can be employed is input validation. Considering all inputs to be hostile until properly validated can mitigate many attacks based on common vulnerabilities. This is a challenge, as the validation efforts need to occur after all parsers have completed manipulating input streams, a common function in web-based applications using Unicode and other international character sets.

Proper input validation is especially well suited for the following vulnerabilities: buffer overflow, reliance on untrusted inputs in a security decision, cross-site scripting (XSS), cross-site request forgery (XSRF), path traversal, and incorrect calculation of buffer size. Input validation may seem suitable for various injection attacks, but given the complexity of the input and the ramifications from legal but improper input streams, this method falls short for most injection attacks. What can work is a form of recognition and whitelisting approach, where the input is validated and then parsed into a standard structure that is then executed. This restricts the attack surface to not only legal inputs, but also expected inputs.
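
A minimal sketch of the allow-list approach follows: input is rejected unless it matches an expected pattern, and what remains is parsed into a fixed structure before use. The field names and patterns are hypothetical examples.

```python
# Minimal sketch of allow-list input validation: reject anything that does not
# match the expected form, then parse what remains into a fixed structure.
# The field names and patterns are hypothetical examples.

import re

USERNAME_RE = re.compile(r"^[A-Za-z0-9_]{3,32}$")   # only expected characters
QUANTITY_RE = re.compile(r"^[0-9]{1,4}$")            # small positive integers only

def validate_order_input(raw):
    """Return a parsed, trusted structure or raise ValueError on hostile input."""
    username = raw.get("username", "")
    quantity = raw.get("quantity", "")
    if not USERNAME_RE.fullmatch(username):
        raise ValueError("invalid username")
    if not QUANTITY_RE.fullmatch(quantity):
        raise ValueError("invalid quantity")
    return {"username": username, "quantity": int(quantity)}

print(validate_order_input({"username": "alice_01", "quantity": "3"}))
try:
    validate_order_input({"username": "alice'; DROP TABLE users;--", "quantity": "3"})
except ValueError as err:
    print("rejected:", err)
```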

EXAM TIP    Consider all input to be hostile. Input validation is one of the most important secure coding techniques employed, mitigating a wide array of potential vulnerabilities.

Output validation is just as important in many cases as input validation. If querying a database for a username and password match, the expected forms of the output of the match function should be either one match or none. If using record count to indicate the level of match, which is a common practice, then a value other than 0 or 1 would be an error. Defensive coding using output validation would not act on values >1, as these are clearly an error and should be treated as a failure.
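
The following sketch applies that rule: a match count of exactly 0 or 1 is handled normally, and anything else is treated as a failure rather than a successful login. The lookup function is a stand-in for a real database query.

```python
# Minimal sketch of output validation on a credential check. The match count
# returned by lookup_matches() is a stand-in for a real database query result.

def lookup_matches(username, password_hash):
    # Placeholder: pretend the query unexpectedly returned two matching rows.
    return 2

def authenticate(username, password_hash):
    count = lookup_matches(username, password_hash)
    if count == 1:
        return True          # exactly one match: expected success
    if count == 0:
        return False         # no match: expected failure
    # Any other value is an error condition, not a successful login.
    raise RuntimeError(f"unexpected match count {count}; treating as failure")

try:
    authenticate("alice", "placeholder-hash")
except RuntimeError as err:
    print(err)
```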

Normalization

Normalization is an initial step in the input validation process. Specifically, it is the step of creating the canonical form, or simplest form, of a string before processing. Strings can be encoded using Unicode and other encoding methods. This makes byte-by-byte comparisons meaningless when trying to screen user input of strings. Checking to see whether the string is “rose” can be difficult when it arrives as “Rose,” “rose,” or “r%6fse” (all of these represent the same string, just in different forms). The process of normalization converts all of these to rose, which can then be screened as valid input.

Different libraries exist to assist developers in performing this part of input validation. Developers should always normalize their inputs prior to validation steps to remove Unicode and other encoding issues. Per the Unicode standard, “when implementations keep strings in a normalized form, they can be assured that equivalent strings have a unique binary representation.”
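
A minimal sketch using Python's standard library follows: URL escapes are decoded, the string is normalized to a single Unicode form, and case folding is applied so that the different representations of “rose” compare equal. The choice of NFC as the normalization form is an illustrative assumption.

```python
# Minimal sketch of normalizing input before validation: decode URL escapes,
# apply Unicode normalization, and case-fold so "r%6fse", "Rose", and other
# encodings of the same string all compare equal to "rose".

import unicodedata
from urllib.parse import unquote

def canonicalize(value):
    decoded = unquote(value)                            # "%6f" becomes "o"
    normalized = unicodedata.normalize("NFC", decoded)  # single canonical Unicode form
    return normalized.casefold()                        # case-insensitive comparison form

for candidate in ["rose", "Rose", "r%6fse"]:
    print(candidate, "->", canonicalize(candidate), canonicalize(candidate) == "rose")
```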

Stored Procedures

Stored procedures are precompiled methods implemented within a database engine. Stored procedures act as a secure coding mechanism because they offer an isolation of user input from the actual SQL statements being executed. This separation of user input from the SQL statements is the primary defense mechanism against SQL injection attacks. User-supplied input data is common in interactive applications that use databases. This input can allow the user to define the specificity of search, match, and so forth. But what cannot happen is to allow a user to write the actual SQL code that is executed. There are too many things that could go wrong, it is too much power to allow a user to wield directly, and eliminating SQL injection attacks by “fixing” input has never worked.

All major database engines support stored procedures. Stored procedures have a performance advantage over other forms of data access. The downside is that stored procedures are written in another language, SQL, and a database programmer typically is needed to implement the more complex ones.
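
The following sketch shows the calling side, assuming a PEP 249 (DB-API) driver that supports callproc, such as mysql-connector-python; the connection details and the procedure name sp_find_customer are hypothetical. The key point is that the user-supplied value travels only as a bound parameter and is never concatenated into SQL text.

```python
# Minimal sketch of keeping user input out of SQL text by calling a stored
# procedure with bound parameters. Assumes the mysql-connector-python driver;
# the connection details and procedure name are hypothetical.

import mysql.connector  # assumed driver

def find_customer(last_name):
    conn = mysql.connector.connect(host="db01", user="app", password="***",
                                   database="sales")
    try:
        cursor = conn.cursor()
        # The user-supplied value travels only as a parameter, never as SQL text.
        cursor.callproc("sp_find_customer", (last_name,))
        rows = []
        for result in cursor.stored_results():
            rows.extend(result.fetchall())
        return rows
    finally:
        conn.close()
```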

Code Signing

An important factor in ensuring that software is genuine and has not been altered is a method of testing the software integrity. With software being updated across the Web, how can you be sure that the code received is genuine and has not been tampered with? The answer is a process known as code signing, which involves applying a digital signature to code, providing a mechanism where the end user can verify the code integrity. In addition to verifying the integrity of the code, digital signatures provide evidence as to the source of the software. Code signing rests upon the established public key infrastructure (PKI). To use code signing, a developer needs a key pair. For this key, the public key, to be recognized by the end user, it needs to be signed by a recognized certificate authority.
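
The sketch below shows the verification side of the idea using the third-party cryptography package: a detached RSA signature over a downloaded package is checked with the publisher's public key. Real code signing relies on platform-specific formats such as Authenticode or signed JARs and on CA-issued certificates; the file names here are hypothetical.

```python
# Minimal sketch of verifying a detached code signature with the publisher's
# RSA public key, using the third-party "cryptography" package. File names
# are hypothetical placeholders.

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.exceptions import InvalidSignature

def verify_package(package_path, signature_path, public_key_path):
    with open(public_key_path, "rb") as f:
        public_key = serialization.load_pem_public_key(f.read())
    with open(package_path, "rb") as f:
        data = f.read()
    with open(signature_path, "rb") as f:
        signature = f.read()
    try:
        public_key.verify(signature, data, padding.PKCS1v15(), hashes.SHA256())
        return True      # integrity and origin check passed
    except InvalidSignature:
        return False     # code was altered or signed by a different key
```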

Encryption

Encryption is one of the elements where secure coding techniques have some unique guidance: “never roll your own crypto.” This not only means you should not write your own cryptographic algorithms, but also means you should not attempt to implement standard algorithms by yourself. Vetted, proven cryptographic libraries exist for all major languages, and the use of these libraries is considered best practice. The guidance has a variety of interrelated rationales, but the simple explanation is that crypto is almost impossible to invent, and very hard to implement correctly. Thus, to have usable, secure encryption in your application, you need to adopt proven algorithms and utilize proven code bases.
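
The following example illustrates that guidance by using a vetted primitive (Fernet, from the third-party cryptography package) rather than a hand-built cipher. Key generation, storage, and rotation are real design concerns that are out of scope for this short sketch.

```python
# Example of "never roll your own crypto": use a vetted library primitive
# (Fernet, from the third-party "cryptography" package) instead of a
# hand-built cipher. Key storage and management are out of scope here.

from cryptography.fernet import Fernet

key = Fernet.generate_key()        # in practice, store and protect this key
fernet = Fernet(key)

token = fernet.encrypt(b"account=12345;balance=100.00")
print(token)                       # authenticated ciphertext
print(fernet.decrypt(token))       # original plaintext, recoverable only with the key
```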

Obfuscation/Camouflage

Obfuscation or camouflage is the hiding of obvious meaning from observation. While obscurity is not considered adequate security under most circumstances, adding obfuscation or camouflage to a system to make it harder for an attacker to understand and exploit is a good thing. Numbering your e-mail servers email1, email2, email3, . . . tells an attacker what namespace to explore. Removing or hiding these hints makes the work harder and offers another layer of protection.

This works well for data names and other exposed elements that have to be exposed to the outside. Where this does not work well is in the construction of code. Obfuscated code, or code that is hard or even nearly impossible to read, is a ticking time bomb. The day will come that someone will need to read the code to figure out how it works, either to modify it or to fix it if it is not working. If programmers have issues reading and understanding the code, how it functions, and what it is supposed to do, how can they contribute to its maintenance?

Code Reuse/Dead Code

Modern software development includes extensive reuse of components. From component libraries to common functions across multiple components, there is significant opportunity to reduce development time and costs through code reuse. This can also simplify a system through the reuse of known elements. The downside of massive code reuse is that failure of a widely reused code component has a ripple effect across many applications.

During the design phase, the development team should make decisions as to the appropriate level of reuse. For some complex functions, such as cryptography, reuse is the preferred path. In other cases, where the lineage of a component cannot be established, then the risk of use may outweigh the benefit. Additionally, the inclusion of previously used code, sometimes referred to as legacy code, can reduce development efforts and risk.

EXAM TIP    The use of legacy code in current projects does not exempt that code from security reviews. All code should receive the same scrutiny, especially legacy code that may have been developed prior to the adoption of SDLC processes.

Dead code is code that, while it may be executed, produces results that are never used elsewhere in the program. There are compiler options that can remove dead code, called dead code elimination, but you must use these options with care. Assume you have a section of code that you put in specifically to set a secret value to all zeros. The logic is as follows: generate the secret key, use the secret key, set the secret key to zero. You do this last step to remove the key from memory and keep it from being stolen. But along comes the dead code removal routine. It sees that you set secretkey to 0 but never read it again. So the compiler, in optimizing your code, removes your protection step.

Server-Side vs. Client-Side Execution and Validation

Input validation can be performed either on the server side of a client-server architecture or on the client side. In all client-server and peer-to-peer operations, one universal truth applies: never trust input without validation. Systems that are designed and configured without regard to this truth are subject to client-side attacks. Systems can be designed in which the client has the functionality needed to assure input veracity, but there is always the risk that the client can become corrupted, whether by malware, a disgruntled user, or simple misconfiguration. The veracity of client-side execution actions cannot be guaranteed. Server-side execution of code can be secured, making it the preferred location for sensitive operations such as input validation.

The lure of doing validation on the client side is to save the round-trip communication time, especially for input errors such as missing values. Applications commonly have client-side code to validate the input as correct in terms of it being complete and approximately correct. This validation on the client side does not mean that the data is safe to use, only that it appears that the data has been completely filled in. All input validation with respect to completeness, correctness, and security checks must be done on the server side, and must be done before the user input is used in any way.
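
A minimal sketch of server-side re-validation follows, assuming the Flask web framework; the route and field names are hypothetical. Even if client-side script has already checked the form, the server treats the submitted value as untrusted.

```python
# Minimal sketch of server-side re-validation, assuming the Flask framework.
# Even if client-side JavaScript already checked the form, the server treats
# the submitted values as untrusted. Route and field names are hypothetical.

import re
from flask import Flask, request, jsonify

app = Flask(__name__)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # simple allow-list pattern

@app.route("/register", methods=["POST"])
def register():
    email = request.form.get("email", "")
    if not EMAIL_RE.fullmatch(email):
        # Reject on the server even though the client "validated" it already.
        return jsonify({"error": "invalid email"}), 400
    return jsonify({"status": "ok"}), 200
```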

EXAM TIP    Attackers have significant advantages on clients because of access issues. Because one cannot guarantee the security of the environment of client machines and code, all sensitive operations should be performed on server-side environments.

Memory Management

Memory management comprises the actions used to control and coordinate computer memory, assigning memory to variables and reclaiming it when it is no longer being used. Errors in memory management can result in a program that has a memory leak, and the leak can grow over time, consuming more and more resources. The routine to clean up memory that has been allocated in a program but is no longer needed is called garbage collection. In the C programming language, where there is no automatic garbage collector, the programmer must allocate and free memory explicitly. One of the advantages of newer programming languages such as Java, C#, Python, and Ruby is that they provide automatic memory management with garbage collection. This may not be as efficient as when specifically coded in C, but it is significantly less error prone.
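
The short example below demonstrates automatic reclamation using a weak reference, and notes a common leak pattern that garbage collection cannot fix: a cache that only ever grows.

```python
# Small demonstration of automatic memory management: once the last reference
# to an object goes away, the garbage collector can reclaim it. A module-level
# cache that is never pruned is a common way to leak memory even in a
# garbage-collected language.

import gc
import weakref

class Buffer:
    def __init__(self, size):
        self.data = bytearray(size)

buf = Buffer(1_000_000)
probe = weakref.ref(buf)        # observe the object without keeping it alive
print(probe() is not None)      # True: still referenced by "buf"

del buf                         # drop the only strong reference
gc.collect()                    # usually unnecessary in CPython, shown for clarity
print(probe() is None)          # True: the memory has been reclaimed

leaky_cache = {}                # entries added here but never removed will
                                # accumulate for the life of the process
```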

Use of Third-Party Libraries and SDKs

Programming today is, to a great extent, an exercise in using third-party libraries and software development kits (SDKs). This is because once code has been debugged and proven to work, rewriting it is generally not a valuable use of time. Also, some fairly complex routines, such as encryption, have vetted, proven library sets that remove a lot of risk from programming these functions. Using these proven resources can reduce errors and vulnerabilities in code, making this a positive move for secure development. Using third-party elements does bring baggage: it is code you have not developed and for which you do not necessarily have all the dependency details. If the development team manages dependencies correctly, the benefits greatly outweigh the risks.

Data Exposure

Data exposure is the loss of control over data from a system during operations. Data must be protected during storage (data at rest), during communication (data in transit), and at times during use. It is up to the programming team to chart the flow of data through a system and ensure that it is protected from exposure throughout the process. Exposed data can be lost to unauthorized parties (a failure of confidentiality) or, equally dangerous, can be changed by an unauthorized party (a failure of integrity). Protection of the data will typically be done using various forms of cryptography, which is covered in Chapter 26.

EXAM TIP    The list of elements under secure coding techniques is long and specific. It is important to understand the differences among the techniques so that you can recognize which one best fits the context of the question.

Code Quality and Testing

When coding operations commence, application developers can use tools and techniques to assist them in the assessment of the security level of the code under development. They can analyze code either statically or dynamically to find weaknesses and vulnerabilities. Manual code reviews by the development team can provide benefits both to the code and the team. Code quality does not end with development, as the code needs to be delivered and installed both intact and correctly on the target system.

Code analysis encompasses the processes used to inspect code for weaknesses and vulnerabilities. Code analysis can be divided into two forms: static and dynamic. Static analysis involves examination of the code without execution. Dynamic analysis involves the execution of the code as part of the testing. Both static and dynamic analyses are typically performed with tools, which are much better at the detailed analysis steps needed for any but the smallest code samples.

Code testing is the verification that the code meets the functional requirements as laid out in the requirements process. While code analysis makes certain the code works properly, doing what it is supposed to do and only what it is supposed to do, code testing makes certain the code meets the business requirements.

Code analysis can be performed at virtually any level of development, from unit level to subsystem to system to complete application. The higher the level, the greater the test space and more complex the analysis. When the analysis is done by teams of humans reading the code, typically at the smaller unit level, it is referred to as a code review. Code analysis should be done at every level of development, because the sooner that weaknesses and vulnerabilities are discovered, the easier they are to fix. Issues found in design are cheaper to fix than those found in coding, which are cheaper to fix than those found in final testing, and all of these are cheaper to fix than errors discovered after the software has been deployed.

Static Code Analyzers

Static code analysis is when the code is examined without being executed. This analysis can be performed on both source code and object code bases. The term “source code” is typically used to designate the high-level language code, although technically, source code is the original code base in any form, from high-level language to machine code. Static analysis can be performed by humans or tools, although humans are limited to the high-level language, while tools can be used against virtually any form of code base.

Static code analysis is frequently performed using automated tools. These tools are given a variety of names, but are commonly called static code analyzers or source code analyzers. Sometimes, extra phrases, such as “binary scanners” or “byte code scanners,” are used to differentiate the tools. Static tools use a variety of mechanisms to search for weaknesses and vulnerabilities. Automated tools can provide advantages when checking syntax, approved function/library calls, and examining rules and semantics associated with logic and calls. They can catch elements a human could overlook.
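
As an example, the sketch below wires a static analyzer into an automated check. It assumes the open-source Bandit scanner for Python is installed and that the code under review lives in a hypothetical src/ directory.

```python
# Minimal sketch of running a static analyzer as an automated check, assuming
# the open-source Bandit scanner is installed and the code lives under src/.

import subprocess
import sys

result = subprocess.run(
    ["bandit", "-r", "src/", "-f", "txt"],   # recursive scan, plain-text report
    capture_output=True, text=True,
)
print(result.stdout)

# Bandit exits non-zero when it finds issues, which lets a CI job fail the build.
sys.exit(result.returncode)
```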

Dynamic Analysis (e.g., Fuzzing)

Dynamic analysis is performed while the software is executed, either on a target system or an emulated system. The system is fed specific test inputs designed to produce specific forms of behaviors. Dynamic analysis can be particularly important on systems such as embedded systems, where a high degree of operational autonomy is expected. As a case in point, the failure to perform adequate testing of software on the Ariane rocket program led to the loss of an Ariane 5 booster during takeoff. Subsequent analysis showed that if proper software testing had been performed, the error conditions could have been detected and corrected without the loss of the flight vehicle. Many times software can be tested in use without the rest of the system, and for use cases where failure costs are high, extensive testing before actual use is standard practice.

Dynamic analysis requires specialized automation to perform specific testing. Among the tools available are dynamic test suites designed to monitor operations for programs that have high degrees of parallel functions, thread-checking routines to ensure multicore processors and software are managing threads correctly, and programs designed to detect race conditions and memory addressing errors.

Fuzzing (or fuzz testing) is a brute-force method of addressing input validation issues and vulnerabilities. The basis for fuzzing a program is the application of large numbers of inputs to determine which ones cause faults and which ones might be vulnerable to exploitation. Fuzz testing can be applied anywhere data is exchanged to verify that input validation is being performed properly. Network protocols can be fuzzed, file protocols can be fuzzed, and web protocols can be fuzzed. The vast majority of browser errors are found via fuzzing.

Fuzz testing works well in white, black, or gray box testing, as it can be performed without knowledge of the specifics of the application under test. Fuzz testing works by sending a multitude of input signals and seeing how the program handles them. Specifically, malformed inputs can be used to vary parser operation and to check for memory leaks, buffer overflows, and a wide range of input validation issues. Since input validation errors are one of the top issues in software vulnerabilities, fuzzing is the best method of testing against these issues, such as cross-site scripting and injection vulnerabilities.

There are several ways to classify fuzz testing. It can be classified as smart testing or dumb testing, indicating the type of logic used in creating the input values. Smart testing uses knowledge of what could go wrong, and malforms the inputs using this knowledge. Dumb testing just uses random inputs.
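
The following is a minimal dumb-fuzzing sketch: random byte strings are fed to a small parser, expected rejections are ignored, and anything else is recorded as a potential fault. The parser and its deliberate bug are stand-ins for real code under test.

```python
# Minimal sketch of dumb fuzzing: throw random inputs at a parser and record
# any that crash it. parse_record() is a stand-in for the code under test.

import random

def parse_record(data: bytes):
    """Toy parser with a bug: it assumes at least one byte is present."""
    if data[0] == 0x23:                       # IndexError on empty input: the bug
        return None                           # treat '#' records as comments
    name, _, value = data.partition(b"=")
    return name.decode("ascii"), int(value)

def fuzz(iterations=1000, seed=1):
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        length = rng.randint(0, 40)
        blob = bytes(rng.randrange(256) for _ in range(length))
        try:
            parse_record(blob)
        except (ValueError, UnicodeDecodeError):
            pass                              # expected rejections of bad input
        except Exception as exc:              # anything else is a potential fault
            crashes.append((blob, repr(exc)))
    return crashes

for blob, error in fuzz()[:5]:
    print(blob, error)
```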

EXAM TIP    Fuzz testing is a staple of SDLC-based testing, finding a wide range of errors with a single test method.

Fuzz testing also can be classified as generation-based or mutation-based. Generation-based fuzz testing uses the specifications of input streams to determine the data streams that are to be used in testing. Mutation-based fuzz testing takes known good traffic and mutates it in specific ways to create new input streams for testing. Each of these has its advantages, and the typical fuzzing environment uses both together.
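
A minimal sketch of the mutation-based approach follows: a known-good input is copied and a few random bytes are flipped to create each new test case. The sample HTTP request is a hypothetical seed input.

```python
# Minimal sketch of mutation-based input generation: start from known-good
# input and flip a few random bytes to create each test case.

import random

def mutate(sample: bytes, rng: random.Random, flips: int = 3) -> bytes:
    data = bytearray(sample)
    for _ in range(flips):
        position = rng.randrange(len(data))
        data[position] = rng.randrange(256)
    return bytes(data)

rng = random.Random(7)
known_good = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
for _ in range(3):
    print(mutate(known_good, rng))
```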

Stress Testing

The typical objective in performance testing is not to find specific bugs, but rather to determine bottlenecks and performance factors for the systems under test. These tests are frequently referred to as load testing and stress testing. Load testing involves running the system under a controlled speed environment. Stress testing takes the system past this operating point to see how it responds to overload conditions. One of the reasons stress testing is performed on software under development is to determine the service levels that can be expected from the software in a production environment. Typically, these are expressed in the terms of a service level agreement (SLA).
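
The sketch below shows a simple load test: a thread pool issues many concurrent requests against a hypothetical endpoint and reports latency and failures. Raising the worker and request counts until errors or latency climb turns the same harness into a stress test.

```python
# Minimal load-test sketch: issue many concurrent requests with a thread pool
# and report latency. The URL is hypothetical; raising WORKERS and REQUESTS
# until errors or latency climb turns this load test into a stress test.

import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:8080/health"   # hypothetical endpoint
WORKERS = 20
REQUESTS = 200

def one_request(_):
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(URL, timeout=5) as resp:
            resp.read()
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start

with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    results = list(pool.map(one_request, range(REQUESTS)))

latencies = sorted(t for ok, t in results if ok)
failures = sum(1 for ok, _ in results if not ok)
if latencies:
    print(f"median latency: {latencies[len(latencies) // 2]:.3f}s")
print(f"failures: {failures}/{REQUESTS}")
```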

Sandboxing

Sandboxing refers to the execution of computer code in an environment designed to isolate the code from direct contact with the target system. Sandboxes are used to execute untrusted code, code from guests, and unverified programs. Sandboxes work like a virtual machine (VM) and can mediate a wide range of system interactions, from memory access to network access, and access to other programs, the file system, and devices. The level of protection offered by a sandbox depends upon the level of isolation and mediation offered.

Model Verification

Ensuring the code does what the code is supposed to do, verification, is more complex than just running the program and looking for runtime errors. The program results for a given set of inputs need to match the expected results per the system model. For instance, if applying a simple mathematical operation, is the calculation correct? This is simple to verify on a case-by-case basis, but when a program has many interdependent calculations, verifying that the result matches the desired design model can be a fairly complex task.

Validation and verification are the terms used to describe this testing. Validation is the process of checking whether the program specification captures the requirements from the customer. Verification is the process of checking that the software developed meets the model specification. Performing model verification testing is important, as this is the assurance that the code as developed meets the design requirements.
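
The following sketch shows one way to automate model verification: expected outputs taken from the specification are compared against the implementation's results in a unit test. The sales-tax function and the expected figures are hypothetical examples.

```python
# Minimal verification sketch: check that the implementation's results match
# the values the design model predicts. The sales-tax function and expected
# figures are hypothetical examples.

import unittest

def total_with_tax(subtotal: float, rate: float) -> float:
    """Implementation under test."""
    return round(subtotal * (1 + rate), 2)

class TestAgainstModel(unittest.TestCase):
    # Expected outputs taken from the specification or design model.
    CASES = [((100.00, 0.075), 107.50),
             ((19.99, 0.075), 21.49),
             ((0.00, 0.075), 0.00)]

    def test_matches_model(self):
        for (subtotal, rate), expected in self.CASES:
            self.assertEqual(total_with_tax(subtotal, rate), expected)

if __name__ == "__main__":
    unittest.main()
```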

EXAM TIP    Understand the different quality and testing elements so that you can apply the correct one to the context of a question.

Compiled vs. Runtime Code

Compiled code is code that is written in one language, then run through a compiler and transformed into executable code that can be run on a system. Compilers can do many things to optimize code and create smaller, faster-running programs on the actual hardware. But compilers have problems with dynamic code capable of changing at runtime. Interpreters create runtime code that can be executed via an interpreter engine, like a Java virtual machine (JVM), on a computer system. Although interpreted code is slower in execution, there are times when interpreters excel. To run a program with a compiler, the compiler first has to compile the source program into the target program, and then the target program must be loaded and executed. These steps must all occur and can take time. With an interpreter, the interpreter converts the high-level code into machine code on the fly, removing the separate compile step. So, while an interpreter may be slower at running the code, if a lot of changes are happening that force recompiles, the overall cycle can be faster.

In today’s world, we have both compilers and interpreters for most languages, so that the correct tool can be used for the correct situation. We also have systems such as just-in-time compilers and bytecode interpreters that blur the traditional categorizations of compilers and interpreters.

Chapter Review

In this chapter, you became acquainted with secure application development and deployment concepts. The chapter opened with the waterfall and Agile development models. From there it moved into a discussion of secure DevOps. The topics under secure DevOps included security automation, continuous integration, baselining, immutable systems, and infrastructure as code. The chapter finished this segment with version control, change management, provisioning, and deprovisioning.

The chapter then looked at secure coding techniques, beginning with proper error handling and proper input validation, followed by a discussion of normalization and stored procedures. Code signing, encryption, and obfuscation/camouflage were covered next. The secure coding section wrapped up with a look at code reuse/dead code, server-side versus client-side execution, memory management, use of third-party libraries, and data exposure.

The chapter then explored code quality and testing, including the topics of static code analyzers, dynamic analysis (fuzzing), stress testing, sandboxing, and model verification. The chapter concluded with an examination of compiled code versus runtime code.

Questions

To help you prepare further for the CompTIA Security+ exam, and to test your level of preparedness, answer the following questions and then check your answers against the list of correct answers at the end of the chapter.

1. Which of the following methodologies progresses through a series of phases, with each phase being completed before progressing to the next phase?

A. Scrum

B. Waterfall

C. Agile

D. Extreme Programming (XP)

2. Which of the following methodologies is a structured process that is built around user stories that are used to architect requirements in an iterative process that uses acceptance testing to create incremental advances?

A. Agile

B. Scrum

C. Extreme Programming (XP)

D. Waterfall

3. Which of the following are elements of software development that will help to improve the security of code? (Choose all that apply.)

A. Input validation

B. Proper error and exception handling

C. Cross-site scripting mitigations

D. Patch management

4. Where should all errors/exceptions be trapped and handled?

A. In the main program or routine that called the routine that generated the error/exception

B. In the generating routine itself

C. In a special routine designed to handle all errors/exceptions

D. In a separate routine designed to handle each specific error/exception

5. Which of the following is a system that, once deployed, is never modified, patched, or upgraded?

A. Baseline

B. Immutable system

C. Frozen system

D. Fixed configuration

6. What is the term used to describe removing users’ permissions or authorities to objects?

A. Provisioning

B. Version control

C. Change management

D. Deprovisioning

7. The process describing how an organization manages which versions are currently being used, and how it coordinates updates or new versions as they are released by a manufacturer, is known as which of the following?

A. Version control

B. Provisioning

C. Change management

D. Deprovisioning

8. Which of the following is an initial step in the input validation process that creates the canonical form, or simplest form, of a string before processing?

A. Implementing stored procedures

B. Code signing

C. Code reuse

D. Normalization

9. Which of the following is true about what is known as dead code?

A. Dead code is code that is never executed and thus can be removed from the program without a negative impact.

B. Dead code is code that is never executed but should remain in the program because removing it may have unintended consequences.

C. Dead code is code that while it may be executed, the results that it produces are never used elsewhere in the program. There are compiler options that can remove dead code, which is called dead code elimination, but these must be used with care because dead code elimination may have unintended consequences.

D. Dead code is code that while it may be executed, the results that it produces are never used elsewhere in the program. It should be removed through automated or manual means to improve the program.

10. What is the term used to describe the loss of control over data from a system during operations?

A. Sandboxing

B. Data exposure

C. Data breach

D. Runtime release

11. What term is used to refer to testing a system under a controlled speed environment?

A. Load testing

B. Stress testing

C. Sandboxing

D. Static code analysis

12. Fuzz testing works best in which of the following testing environments?

A. White box testing

B. Gray box testing

C. Black box testing

D. Fuzz testing works equally well in all of the above.

13. Code analysis can be performed at which of the following levels of development? (Choose all that apply.)

A. Unit level

B. Subsystem level

C. System level

D. Complete application

14. Which code analysis method is performed while the software is executed, either on a target system or an emulated system?

A. Static analysis

B. Runtime analysis

C. Sandbox analysis

D. Dynamic analysis

15. Which of the following is true concerning verification? (Choose all that apply.)

A. Ensuring the code does what the code is supposed to do, verification, is more complex than just running the program and looking for runtime errors.

B. Verification also checks whether the program specification captures the requirements from the customer.

C. Verification is simple on a case-by-case basis, but when a program has many interdependent calculations, verifying that the results match the desired design model can be a fairly complex task.

D. Verification is the process of checking that the software developed meets the model specification.

Answers

1. B. The waterfall model is a development model based on simple manufacturing design. The work process begins with the requirements analysis phase and progresses through a series of four more phases, with each phase being completed before progressing to the next phase. The Scrum programming methodology is built around a 30-day release cycle. The Agile model is not a single development methodology, but a whole group of related methods. Designed to increase innovation and efficiency of small programming teams, Agile methods rely on quick turns involving small increases in functionality. Extreme Programming is a structured process that is built around user stories. These stories are used to architect requirements in an iterative process that uses acceptance testing to create incremental advances.

2. C. Extreme programming (XP) is a structured process that is built around user stories. These stories are used to architect requirements in an iterative process that uses acceptance testing to create incremental advances. Agile methods are not a single development methodology, but a whole group of related methods. Designed to increase innovation and efficiency of small programming teams, Agile methods rely on quick turns involving small increases in functionality. The waterfall model is a development model based on simple manufacturing design. The work process begins with the requirements analysis phase and progresses through a series of four more phases, with each phase being completed before progressing to the next phase. The Scrum programming methodology is built around a 30-day release cycle.

3. A, B, and C. All are elements of software development that will help to improve the security of code. While patch management is an important aspect of security, it occurs after code development and delivery and is considered a process element and not a part of the software development lifecycle.

4. B. All errors/exceptions should be trapped and handled in the generating routine.

5. B. An immutable system is a system that, once deployed, is never modified, patched, or upgraded. If a patch or update is required, the system is merely replaced with a new system that is patched and updated. Baselining is the process of determining a standard set of functionality and performance. This is a metrics-driven process, where later changes can be compared to the baseline to gauge their impact on performance and other variables. If a change improves the baseline elements in a positive fashion, a new baseline can be established. The other terms are not commonly used in industry.

6. D. Deprovisioning is the removal of users’ permissions or authorities to access objects. Provisioning is the process of assigning to users permissions or authorities to access objects. Version control is as simple as tracking which version of a program is being worked on, whether in development, testing, or production. Change management addresses how an organization manages which versions are currently being used, and how it coordinates changes as they are released by a manufacturer.

7. C. Change management addresses how an organization manages which versions are currently being used, and how it coordinates changes as they are released by a manufacturer. Version control is as simple as tracking which version of a program is being worked on, whether in development, testing, or production. Provisioning is the process of assigning permissions or authorities to objects for users. Deprovisioning is the removal of permissions or authorities to objects for users.

8. D. Normalization is an initial step in the input validation process. Specifically, it is the step of creating the canonical form, or simplest form, of a string before processing. Stored procedures are precompiled methods implemented within a database engine. Stored procedures act as a secure coding mechanism because they offer an isolation of user input from the actual SQL statements being executed. Code signing involves applying a digital signature to code, providing a mechanism where the end user can verify the code integrity. Code reuse is reusing code from one application to another.

9. C. Dead code is code that while it may be executed, the results that it obtains are never used elsewhere in the program. There are compiler options that can remove dead code, called dead code elimination, but these options must be used with care because dead code elimination may have unintended consequences.

10. B. Data exposure is the loss of control over data from a system during operations. Sandboxing refers to the execution of computer code in an environment designed to isolate the code from direct contact with the target system. A data breach occurs when an unauthorized user gains access to your system and its data. Runtime release is not a term used in the industry.

11. A. Load testing involves running the system under a controlled speed environment. Stress testing takes the system past this operating point to see how it responds to overload conditions. Sandboxing refers to the execution of computer code in an environment designed to isolate the code from direct contact with the target system. Static code analysis is when the code is examined without being executed.

12. D. Fuzz testing works well in white, black, or gray box testing, as it can be performed without knowledge of the specifics of the application under test.

13. A, B, C, and D. Code analysis can be performed at virtually any level of development, from unit level to subsystem to system to complete application.

14. D. Dynamic analysis is performed while the software is executed, either on a target system or an emulated system. Static code analysis is when the code is examined without being executed. Sandboxing refers to the execution of computer code in an environment designed to isolate the code from direct contact with the target system. Runtime analysis is descriptive of the type of analysis but is not the term used in industry.

15. A, C, and D. Ensuring the code does what the code is supposed to do, verification, is more complex than just running the program and looking for runtime errors. The program results for a given set of inputs need to match the expected results per the system model. For instance, if applying a simple mathematical operation, is the calculation correct? This is simple to verify on a case-by-case basis, but when a program has many interdependent calculations, verifying that the result matches the desired design model can be a fairly complex task. Verification is the process of checking that the software developed meets the model specification. Validation is the process of checking whether the program specification captures the requirements from the customer.