Adding the Payload

The payload used by this spider is an extension of the library used in Chapter 8 to download all the images found on a web page. This time, however, we’ll download all the images referenced by the entire website. The code that adds the payload to the spider is shown in Example 17-8. You can tack this code directly onto the end of the script for the earlier spider.

Example 17-8. Adding a payload to the simple spider

// Add the payload to the simple spider
// Include download and directory creation lib
include("LIB_download_images.php");

// Download images from pages referenced in $spider_array
for($penetration_level=1; $penetration_level<=$MAX_PENETRATION; $penetration_level++)
    {
    // Index by this loop's own level counter; $previous_level is never
    // updated here and would still hold its last value from the spider
    $page_count = count($spider_array[$penetration_level]);
    for($xx=0; $xx<$page_count; $xx++)
        {
        download_images_for_page($spider_array[$penetration_level][$xx]);
        }
    }

Functionally, adding the payload takes two steps: including the image download library, and running a nested loop that calls the image harvester for every web page referenced at every penetration level.
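To see the nested loop in isolation, here is a minimal sketch that runs without the spider or the download library. It rests on several assumptions: the download_images_for_page() defined here is a hypothetical stand-in that only records the URLs it receives, the contents of $spider_array are made-up sample data, and the layout (an array of link lists indexed by penetration level, starting at 0) is assumed rather than taken from the earlier script.

```php
<?php
// Hypothetical stand-in for the library function, so the sketch runs alone.
// The real function would fetch the page and save the images it references.
$visited = array();
function download_images_for_page($url)
    {
    global $visited;
    $visited[] = $url;
    }

// Made-up sample data: one list of page links per penetration level
$spider_array = array(
    0 => array("http://www.example.com/a.html", "http://www.example.com/b.html"),
    1 => array("http://www.example.com/c.html"),
    2 => array()
    );

// Outer loop walks the penetration levels; inner loop walks each level's pages
foreach($spider_array as $penetration_level => $links)
    {
    foreach($links as $link)
        {
        download_images_for_page($link);
        }
    }

echo count($visited)." pages passed to the payload\n";
```

This sketch uses foreach instead of counted for loops so that it visits whatever levels the array actually holds, including empty ones, without depending on how the levels are numbered. Whether that is preferable to the counted loops in Example 17-8 depends on how the spider fills $spider_array.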