16
Security Testing
In this chapter you will
• Explore the different types of security tests
• Learn about using scanning and penetration testing to find vulnerabilities
• Examine fuzz testing for vulnerabilities
• Examine security models used to implement security in systems
• Explore the types of adversaries associated with software security
When testing for vulnerabilities, a variety of techniques can be used to examine the software under development. From generalized forms of testing, such as scanning and fuzzing, to more specific methods, such as penetration testing and cryptographic testing, different tools and methods can provide insights as to the locations and levels of security vulnerabilities in the software.
Scanning
Scanning is automated enumeration of specific characteristics of an application or network. These characteristics can be of many different forms, from operating characteristics to weaknesses or vulnerabilities. Network scans can be performed for the sole purpose of learning what network devices are available and responsive. Systems can be scanned to determine the specific operating system (OS) in place, a process known as OS fingerprinting. Vulnerability scanners can scan applications to determine if specific vulnerabilities are present.
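To make the idea of enumeration concrete, the following is a minimal sketch of a TCP port scan, which checks which ports on a host accept a connection. The function name `scan_ports` and its parameters are illustrative assumptions and do not come from any particular scanning product; production scanners do far more (service identification, OS fingerprinting, vulnerability checks).

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

A call such as `scan_ports("127.0.0.1", range(8000, 8010))` would list the responsive ports in that range. Scans should be run only against systems the tester is authorized to examine.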
Scanning can be used in software development to characterize an application on a target platform. It can provide the development team with a wealth of information as to how a system will behave when deployed into production. There are numerous security standards, including the Payment Card Industry Data Security Standard (PCI DSS), that have provisions requiring the use of scanners to identify weaknesses and vulnerabilities in enterprise platforms. The development team should take note that enterprises will be scanning the application as installed in the enterprise. Gaining an understanding of the footprint and security implications of an application before shipping will help the team to identify potential issues before they are discovered by customers.
Scanners have been developed to search for a variety of specific conditions. There are scanners that can search code bases for patterns that are indicative of elements of the OWASP Top 10 and the CWE/SANS Top 25 lists. There are scanners tuned to produce reports for PCI and Sarbanes-Oxley (SOX) compliance. A common mitigation for several regulatory compliance programs is a specific set of scans against a specified set of vulnerabilities.
Attack Surface Analyzer
Microsoft has developed and released a tool called the attack surface analyzer, which is designed to measure the security impact of an application on a Windows environment. Acting as a sophisticated scanner, the tool can detect the changes that occur to the underlying Windows OS when an application is installed. Designed to specifically look for and alert on issues that have been shown to cause security weaknesses, the attack surface analyzer enables a development team or an end user to
• View changes in the Windows attack surface resulting from the installation of the application
• Assess the aggregate attack surface change associated with the application in the enterprise environment
• Evaluate the risk to the platform where the application is proposed to exist
• Provide incident response teams detailed information associated with a Windows platform
One of the advantages of the attack surface analyzer is that it operates independently of the application that is under test. The attack surface analyzer scans the Windows OS environment and provides actionable information on the security implications of an application when installed on a Windows platform. For this reason, it is an ideal scanner for final security testing as part of the secure development lifecycle (SDL) for applications targeted to Windows environments.
Penetration Testing
Penetration testing, sometimes called pen testing, is an active form of examining the system for weaknesses and vulnerabilities. While scanning activities are passive in nature, penetration testing is more active. Vulnerability scanners operate in a sweep, looking for vulnerabilities using limited intelligence; penetration testing harnesses the power of human intellect to make a more targeted examination. Penetration testers attack a system using information gathered from it and expert knowledge in how weaknesses can exist in systems. Penetration testing is designed to mimic the attacker’s ethos and methodology, with the objective of finding issues before an adversary does. It is a highly structured and systematic method of exploring a system and finding and attacking weaknesses.
Penetration testing is a very valuable part of the SDL process. It can dissect a program and determine if the planned mitigations are effective or not. Pen testing can discover vulnerabilities that were not thought of or mitigated by the development team. It can be done with either a white- or black-box testing mode.
Penetration Testing
Penetration testing is a structured test methodology. The following are the basic steps employed in the process:
1. Reconnaissance (discovery and enumeration)
2. Attack and exploitation
3. Removal of evidence
4. Reporting
The penetration testing process begins with specific objectives being set out for the tester to explore. For software under development, these could be input validation vulnerabilities, configuration vulnerabilities, and vulnerabilities introduced to the host platform during deployment. Based on the objectives, a test plan is created and executed to verify that the software is free of known vulnerabilities. As the testers probe the software, they take notes of the errors and responses, using this information to shape subsequent tests.
Penetration testing is a slow and methodical process, with each step and results being validated. The records of the tests should demonstrate a reproducible situation where the potential vulnerabilities are disclosed. This information can give the development team a clear picture of what was found so that the true root causes can be identified and fixed.
Fuzzing
Fuzz testing is a brute-force method of addressing input validation issues and vulnerabilities. The basis for fuzzing a program is the application of large numbers of inputs to determine which ones cause faults and which ones might be vulnerable to exploitation. Fuzz testing can be applied anywhere data is exchanged to verify that input validation is being performed properly. Network protocols, file formats, and web protocols can all be fuzzed. A large share of browser vulnerabilities are found via fuzzing.
Fuzz testing works well in white-, black-, or gray-box testing, as it can be independent of the specifics of the application under test. Fuzz testing works by sending a multitude of inputs and observing how the program handles them. Specifically, malformed inputs can be used to exercise parser operation and to check for memory leaks, buffer overflows, and a wide range of input validation issues. Since input validation errors are among the top causes of software vulnerabilities, fuzzing is one of the most effective methods of testing for them, including cross-site scripting and injection vulnerabilities.
There are several ways to classify fuzz testing. One set of categories is smart and dumb, indicating the type of logic used in creating the input values. Smart testing uses knowledge of what could go wrong and creates malformed inputs with this knowledge. Dumb testing just uses random inputs. Another set of terms used to describe fuzzers is generation-based and mutation-based.
EXAM TIP Fuzz testing is a staple of SDL-based testing, finding a wide range of errors with a single test method.
Generation-based fuzz testing uses the specifications of input streams to determine the data streams that are to be used in testing. Mutation-based fuzzers take known good traffic and mutate it in specific ways to create new input streams for testing. Each of these has its advantages, and a typical fuzzing environment uses both together.
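The mutation-based approach can be sketched in a few lines. The example below is an illustration under stated assumptions, not a production fuzzer: `parse_record` is a hypothetical target with a deliberately planted defect, and `mutate` and `fuzz` are names invented for this sketch. It starts from one known-good input, overwrites a random byte on each iteration, and records every input that causes a fault other than a clean rejection.

```python
import random


def parse_record(data: bytes):
    """Hypothetical target: first byte is the payload length, payload follows."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    # Planted defect: the declared length is trusted, so indexing the last
    # payload byte can run past the end of the buffer.
    last = data[length]
    return data[1:1 + length], last


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Mutation step: overwrite one random byte of a known-good input."""
    data = bytearray(seed)
    data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Feed mutated inputs to `target` and collect the ones that fault."""
    rng = random.Random(rng_seed)  # fixed seed so failing cases are reproducible
    failures = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # treated as a clean rejection of bad input
        except Exception as exc:  # any other exception counts as a fault
            failures.append((candidate, type(exc).__name__))
    return failures
```

Calling `fuzz(parse_record, b"\x03abc")` returns the mutated inputs whose length byte exceeds the data actually present. A generation-based fuzzer would instead build inputs from a description of the record format rather than from a captured sample.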
Simulation Testing
Simulation testing involves testing the application in an environment that mirrors the associated production environment. Examining issues such as configuration issues and how they affect the program outcome is important. Data issues that can result in programmatic instability can also be investigated in the simulated environment.
Setting up and starting an application can be time consuming and expensive. When developing a new application, considering the challenges associated with the instantiation of the system can be important with respect to customer acceptance. Simple applications may have simple setups, but complex applications can have significant setup issues. Simulation testing can go a long way toward discovering issues associated with the instantiation of an application and its operation in the production environment.
Simulation testing can provide that last testing line of defense to ensure the system is properly functioning prior to deployment. This is an opportunity to verify that the interface with the OS is correct and that roles are properly configured to support access and authorization. It also checks that firewall rules (or other enforcement points) between tiers/environments are properly documented, configured, and tested to ensure that attack surface/exposure is managed. Other benefits of simulation testing include validating that the system itself can stand up to the rigors of production performance—for example, using load testing to “beat up” the application to ensure availability is sustainable and that the controls don’t “break” when the load reaches a particular threshold.
Testing for Failure
Not all errors in code result in failure. Not all vulnerabilities are exploitable. During the testing cycle, it is important to identify errors and defects, even those that do not cause a failure. Although a specific error, say one in dead code that is never executed, may not cause a failure in the current version, this same error may become active in a later version and result in a failure. Leaving an error such as this alone or leaving it for future regression testing is a practice that can cause errors to get into production code.
Although most testing is for failure, it is equally important to test for conditions that result in incorrect values, even if they do not result in failure. Incorrect values have resulted in the loss of more than one spacecraft in flight; even though the error did not cause the program to fail, it did result in system failure. A common test for failure conditions is load testing, where the software is tested for capacity issues. Understanding how the software functions under heavy load conditions can reveal memory issues and other scale-related issues. These elements can cause failure in the field, and thus extensive testing for these types of known software issues is best conducted early in the development process where issues can be addressed prior to release.
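The difference between testing for failure and testing for incorrect values can be shown with a small sketch. The unit-conversion functions and check helpers below are hypothetical and exist only for illustration; the point is that a defective routine can run without raising any error and still return the wrong number.

```python
import math

LBF_S_TO_N_S = 4.4482216152605  # newton-seconds per pound-force second


def impulse_to_si(value_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return value_lbf_s * LBF_S_TO_N_S


def impulse_unconverted(value_lbf_s: float) -> float:
    """Defective variant: forgets the conversion but never raises."""
    return value_lbf_s


def check_no_crash(func, sample) -> bool:
    """Failure-only test: passes as long as nothing raises."""
    func(sample)
    return True


def check_value(func, sample, expected, rel_tol=1e-9) -> bool:
    """Value test: passes only if the result matches the expected value."""
    return math.isclose(func(sample), expected, rel_tol=rel_tol)
```

Here `check_no_crash(impulse_unconverted, 10.0)` passes, while `check_value(impulse_unconverted, 10.0, 44.482216152605)` does not, which is the class of defect a failure-only test misses.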
Cryptographic Validation
Having secure cryptography is easy: Use approved algorithms and implement them correctly and securely. The former is relatively easy: pick the algorithm from a list. The latter is significantly more difficult. Protecting the keys and the seed values and ensuring that proper operational conditions are met have proven to be challenging in many cases. Other cryptographic issues include proper random number generation and key transmission.
Cryptographic errors come from several common causes. One typical mistake is choosing to develop your own cryptographic algorithm. Developing a secure cryptographic algorithm is far from an easy task, and even when done by experts, weaknesses can occur that make them unusable. Cryptographic algorithms become trusted after years of scrutiny and attacks, and any new algorithms would take years to join the trusted set. If you instead decide to rest on secrecy, be warned that secret or proprietary algorithms have never provided the desired level of protection. One of the axioms of cryptography is that security through obscurity has never worked in the long run.
Deciding to use a trusted algorithm is a proper start, but there still are several major errors that can occur. The first is an error in instantiating the algorithm. An easy way to avoid this type of error is to use a library function that has already been properly tested. Sources of these library functions abound and provide an economical solution to this functionality’s needs. Given an algorithm, and a proper instantiation, the next item needed is the random number to generate a random key.
The generation of a real random number is not a trivial task. Computers are machines that are renowned for reproducing the same output when given the same input, so generating a string of pure, nonreproducible random numbers is a challenge. There are functions for producing random numbers built into the libraries of most programming languages, but these are pseudo-random number generators, and although the distribution of their output appears random, they generate a reproducible sequence. Given the same seed, a second run of the function will produce the same sequence of “random” numbers. Determining the seed and random sequence and using this knowledge to “break” a cryptographic function has been used more than once to bypass security. This method was used to subvert an early version of Netscape’s Secure Sockets Layer (SSL) implementation. Similarly, an error in the Debian version of OpenSSL resulted in poor seed generation, which in turn resulted in a small set of possible random values.
EXAM TIP Cryptographically random numbers are essential in cryptosystems and are best produced through cryptographic libraries.
Using a number that is cryptographically random and suitable for an encryption function resolves the random seed problem, and again, the use of trusted library functions designed and tested for generating such numbers is the proper methodology. Trusted cryptographic libraries typically include a cryptographic random number generator.
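As an illustration of the difference, the sketch below contrasts a seeded general-purpose generator with a cryptographic one, using Python's standard `random` and `secrets` modules. The helper names `predictable_bytes` and `generate_key` are assumptions made for this example.

```python
import random
import secrets


def predictable_bytes(seed: int, n: int) -> bytes:
    """Bytes from a seeded general-purpose generator; unsuitable for keys."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n))


def generate_key(n: int = 32) -> bytes:
    """Key material drawn from the operating system's cryptographic generator."""
    return secrets.token_bytes(n)
```

Two calls to `predictable_bytes` with the same seed return identical output, which is exactly what an attacker who recovers the seed can exploit. `generate_key` takes no seed from the caller and is the kind of library function the text recommends.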
Poor key management has failed many a cryptographic implementation. In one famous exploit, cryptographic keys were obtained from an executable and used to break DVD encryption, leading to the DeCSS program. Tools have been developed that can search code for “random” keys and extract them from the code or running process. The bottom line is simple: Do not hard-code secret keys in your code. They can, and will, be discovered. Keys should be generated and then passed by reference, minimizing the travel of copies across a network or application. Storing them in memory in a noncontiguous fashion is also important to prevent external detection.
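One common way to keep a key out of source code is to supply it at run time from outside the program. The sketch below reads a base64-encoded key from an environment variable; the variable name `APP_SECRET_KEY`, the function name `load_key`, and the 32-byte minimum are assumptions for illustration. A dedicated key management service or hardware module would be a stronger choice where one is available.

```python
import base64
import os


def load_key(env_var: str = "APP_SECRET_KEY") -> bytes:
    """Load a base64-encoded key from the environment instead of source code."""
    encoded = os.environ.get(env_var)
    if encoded is None:
        # Fail closed: never fall back to a built-in default key.
        raise RuntimeError(f"{env_var} is not set")
    key = base64.b64decode(encoded, validate=True)
    if len(key) < 32:
        raise ValueError("key is shorter than 32 bytes")
    return key
```

The function refuses to run without a key rather than substituting a default, because a default key embedded in the code would reintroduce the hard-coding problem.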
FIPS 140-2
FIPS 140-2 is a prescribed standard, part of the Federal Information Processing Standards series that relates to the implementation of cryptographic functions. FIPS 140-2 deals with issues such as the selection of approved algorithms, such as AES, RSA, and DSA. FIPS 140-2 also deals with the environment where the cryptographic functions are used, as well as the means of implementation.
EXAM TIP FIPS 140-2 specifies requirements, specifications, and testing of cryptographic systems for the U.S. federal government.
Regression Testing
Software is a product that continually changes and improves over time. Multiple versions of software can have different and recurring vulnerabilities. Anytime that software is changed, whether by configuration, patching, or new modules, the software needs to be tested to ensure that the changes have not had an adverse impact on other aspects of the software. Regression testing is a minor element early in a product’s lifecycle, but as a product gets older and has advanced through multiple versions, including multiple customizations, etc., the variance between versions can make regression testing a slow, painful process.
Regression testing is one of the most time-consuming issues associated with patches for software. Patches may not take long to create; in fact, in some cases, the party discovering the issue may provide guidance on how to patch. But before this solution can be trusted across multiple versions of the software, regression testing needs to occur. When software is “fixed,” several things can happen. First, the fix may cause a fault in some other part of the software. Second, the fix may undo some other mitigation at the point of the fix. Third, the fix may repair a special case, such as entering a letter instead of a number, but miss the general case of entering any non-numeric value. The list of potential issues can go on, but the point is that when a change is made, the stability of the software must be checked.
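The special-case versus general-case problem lends itself to a small regression suite. In the sketch below, `parse_quantity` is a hypothetical input handler that has been patched, and the test lists are invented for illustration. The suite keeps the originally reported input, adds the wider class of non-numeric inputs, and rechecks known-good inputs so that the fix does not break existing behavior.

```python
def parse_quantity(text: str) -> int:
    """Hypothetical input handler that was patched after a bug report."""
    text = text.strip()
    # General fix: reject anything that is not purely ASCII digits, rather
    # than special-casing the single letter from the original report.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a non-negative integer: {text!r}")
    return int(text)


# The reported input ("a") plus the general class of non-numeric inputs.
REJECTED = ["a", "", " ", "1a", "-", "1.5", "1e3"]
# Known-good inputs that must keep working after the patch.
ACCEPTED = {"0": 0, "42": 42, " 7 ": 7}


def run_regression():
    """Return a list of failure descriptions; empty means all cases pass."""
    failures = []
    for text in REJECTED:
        try:
            parse_quantity(text)
            failures.append(f"accepted bad input {text!r}")
        except ValueError:
            pass
    for text, expected in ACCEPTED.items():
        if parse_quantity(text) != expected:
            failures.append(f"wrong value for {text!r}")
    return failures
```

Rerunning `run_regression()` after every later change gives an early signal if a new patch reopens the original defect.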
Regression testing is not as simple as completely retesting everything—this would be too costly and inefficient. Depending upon the scope and nature of the change, an appropriate regression test plan needs to be crafted. Simple changes to a unit may only require a level of testing be applied to the unit, making regression testing fairly simple. In other cases, regression testing can have a far-reaching impact across multiple modules and use cases. A key aspect of the patching process is determining the correct level, breadth, and scope of regression testing that is required to cover the patch.
Specialized reports, such as delta analysis and historical trending reports, can assist in regression testing efforts. These are canned report types that are present in a variety of application security test tools. When regular scan and reporting cycles are in place, remediation meetings can use these reports to help the security tester analyze the vulnerabilities associated with each release and work with teams to fix them, whether comparing release 1 with release 2 or tracking the application over its release lifetime (release 1 to 2 to 3 and so on).
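At its core, a delta analysis is a set comparison between the findings of two scans. The following sketch assumes each finding can be reduced to a stable identifier; the function name `delta_report` and the sample identifiers are hypothetical and do not reflect the report format of any specific tool.

```python
def delta_report(previous: set, current: set) -> dict:
    """Compare two sets of finding identifiers from consecutive scans."""
    return {
        "fixed": sorted(previous - current),        # present before, gone now
        "new": sorted(current - previous),          # introduced in this release
        "outstanding": sorted(previous & current),  # still open
    }


# Hypothetical finding identifiers, for illustration only.
release_1 = {"XSS-login", "SQLi-search", "weak-cipher"}
release_2 = {"SQLi-search", "open-redirect"}
report = delta_report(release_1, release_2)
```

Repeating the comparison for each successive pair of releases yields the raw data for a historical trending report.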
Impact Assessment and Corrective Action
Bugs found during software development are scored based on impact. During the course of development, numerous bugs are recorded in the bug tracking system. As part of the bug clearing or corrective action process, a prioritization step determines which bugs get fixed and when. Not all bugs are exploitable, and among those that are exploitable, some have a greater impact on the system. In an ideal world, all bugs would be resolved at every stage of the development process. In the real world, however, some errors are too hard (or expensive) to fix and the risk associated with them does not support the level of effort required to fix them in the current development cycle. If a bug requires a major redesign, the cost can be high. If the bug is critical to the success or failure of the system, then resolving it becomes necessary. If it is inconsequential, then resolution may be postponed until the next major update and redesign opportunity.
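The prioritization step can be sketched as a sort over the bug list. The `Bug` fields, the 1-to-5 impact scale, and the tie-breaking order below are assumptions made for illustration; real bug tracking systems use their own scoring schemes. Impact is the primary key, with exploitability and fix cost used only to order bugs of equal impact.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    ident: str
    impact: int        # 1 (inconsequential) to 5 (critical); hypothetical scale
    exploitable: bool
    fix_cost: int      # rough effort estimate in hypothetical units


def prioritize(bugs):
    """Order bugs by impact first; exploitability and lower cost break ties."""
    return sorted(bugs, key=lambda b: (-b.impact, not b.exploitable, b.fix_cost))
```

With this ordering, a critical bug that needs a costly redesign still ranks ahead of a cheap, low-impact one, matching the principle that impact rather than ease of repair drives corrective action.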
Chapter Review
In this chapter, different types of security tests were presented. Scanning was presented as a means of characterizing and identifying vulnerabilities. While scanning tends to be broad in scope, the next technique, penetration testing, tends to be very specific in its methods of finding vulnerabilities. The next method, fuzzing, is specific in its target, but very general in its method of testing, finding a wide range of problems. Simulation testing is where the application is tested in a simulated production environment to find operational errors.
Testing for failures is important, but so is testing for errors that cause incorrect values but not failure. Cryptographic systems can be complex and difficult to implement properly. Testing the areas of failure associated with cryptographic systems was covered.
Testing various versions of software is referred to as regression testing. The chapter closed by examining the impact of a bug and how this is used in prioritizing corrective actions.
Quick Tips
• Scanning is automated enumeration of specific characteristics of an application or network.
• Penetration testing is an active form of examining the system for weaknesses and vulnerabilities.
• Fuzz testing is a brute-force method of addressing input validation issues and vulnerabilities.
• Simulation testing involves testing the application in an environment that mirrors the associated production environment.
• Although most testing is for failure, it is equally important to test for conditions that result in incorrect values, even if they do not result in failure.
• Only approved cryptographic algorithms should be used; creating your own cryptography is a bad practice.
• Testing various versions of software is referred to as regression testing.
• Bugs are measured in terms of their impact on the system, and this impact can be used to prioritize corrective action efforts.
Questions
To further help you prepare for the CSSLP exam, and to provide you with a feel for your level of preparedness, answer the following questions and then check your answers against the list of correct answers found at the end of the chapter.
1. Which criteria should be used to determine the priority for corrective action of a bug in an application?
A. Ease of correction effort (fix easy ones fast)
B. Cost to repair bug
C. Potential system impact of the bug
D. Size of the bug in lines of code affected
2. Testing different versions of an application to verify patches don’t break something is referred to as:
A. Penetration testing
B. Simulation testing
C. Fuzz testing
D. Regression testing
3. Testing an application in an environment that mirrors the production environment is referred to as:
A. Simulation testing
B. Fuzz testing
C. Scanning
D. Penetration testing
4. Verification of cryptographic function includes all of the following except:
A. Use of secret encryption methods
B. Key distribution
C. Cryptographic algorithm
D. Proper random number generation
5. An automated enumeration of specific characteristics of an application or network is referred to as:
A. Penetration testing
B. Scanning
C. Fuzz testing
D. Simulation testing
6. To examine a system for input validation errors, the most comprehensive test is:
A. Scanning
B. Penetration testing
C. Regression testing
D. Fuzz testing
7. Fuzz testing data can be characterized by:
A. Mutation or generation
B. Input validation, file, or network parsing
C. Size and character set
D. Type of fault being tested for
8. Penetration testing includes the following steps:
A. Reconnaissance, testing, reporting
B. Reconnaissance, exploitation, recovery
C. Attacking, testing, recovery
D. Reconnaissance, attacking, removal of evidence, reporting
9. Characterizing an application in the environment it is designed to operate in can be done with:
A. Scanning
B. Failure testing
C. Simulation testing
D. Pen testing
10. FIPS 140-2 is a federal standard associated with:
A. Software quality assurance
B. Security testing
C. Cryptographic implementation in software
D. Software security standards
11. One of the challenges associated with regression testing is:
A. Determining an appropriate test portfolio
B. Determining when to employ it
C. The skill level needed to perform it
D. Designing an operational platform for testing
12. To test a system for problems such as configuration errors or data issues, one would use:
A. Scanning
B. Configuration testing
C. Testing for failure
D. Simulation testing
13. Regression testing is employed across:
A. Different modules based on fuzz testing results
B. Different versions after patching has been employed
C. Cryptographic modules to test stability
D. The production environment
14. The following are causes of cryptographic failure except:
A. Choice of algorithm
B. Random number generation
C. Key length
D. Secret data management
15. Sending multitudes of inputs to a program during testing is called:
A. Regression testing
B. Scanning
C. Simulation testing
D. Fuzz testing
Answers
1. C. The potential impact is the metric used to prioritize bug fixes.
2. D. Regression testing is used to ensure patches don’t break different versions of an application.
3. A. Simulation testing involves mimicking the production environment.
4. A. Secret encryption methods are not part of cryptographic validation.
5. B. Scanning can be used to automate the enumeration of system elements.
6. D. Fuzz testing is used to test for input validation errors, memory leaks, and buffer overflows.
7. A. Fuzz testing datasets are built by either mutation or generation methods.
8. D. The steps for penetration testing are reconnaissance, attacking, removal of evidence, and reporting.
9. A. Scanning can be used to characterize an application in a production environment.
10. C. FIPS 140-2 is a cryptographic standard associated with the U.S. government.
11. A. Determining the scope and type of tests required based on a patch is one of the challenges of regression testing.
12. D. Simulation testing is used to verify configuration and data issues in the production environment.
13. B. Regression testing is performed after patching and occurs across different versions of an application.
14. C. Key length is linked to algorithm choice, and is not a common failure mode.
15. D. Fuzz testing involves sending large quantities of inputs to a system and seeing how it responds.