The payload used by this spider is an extension of the library used in Chapter 8 to download all the images found on a web page. This time, however, we’ll download all the images referenced by the entire website. The code that adds the payload to the spider is shown in Example 17-8. You can tack this code directly onto the end of the script for the earlier spider.
Example 17-8. Adding a payload to the simple spider
# Add the payload to the simple spider

// Include download and directory creation lib
include("LIB_download_images.php");

// Download images from pages referenced in $spider_array
for($penetration_level=1; $penetration_level<=$MAX_PENETRATION; $penetration_level++)
    {
    // Level whose pages are processed on this pass; set it here so the
    // loop does not reuse a stale value left over from earlier in the script
    $previous_level = $penetration_level - 1;
    for($xx=0; $xx<count($spider_array[$previous_level]); $xx++)
        {
        download_images_for_page($spider_array[$previous_level][$xx]);
        }
    }
Functionally, adding the payload involves including the image download library and running a nested loop: the outer loop steps through the penetration levels, and the inner loop activates the image harvester for every web page referenced at each level.
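To see how the loop walks the array, it can help to run it against sample data with the download step stubbed out. The following is a minimal sketch, not the library's code: download_images_for_page_stub(), the $visited array, and the example.com URLs are hypothetical, and the sketch assumes that $spider_array is indexed first by penetration level and then by page, and that $previous_level is meant to be the level just below $penetration_level.

```php
<?php
// Hypothetical stand-in for download_images_for_page(); it only records
// the URL it receives, so the loop can run without network access.
$visited = array();
function download_images_for_page_stub($url)
{
    global $visited;
    $visited[] = $url;
}

// Sample data (assumed shape): $spider_array[level] = list of page URLs
$MAX_PENETRATION = 2;
$spider_array = array(
    0 => array("http://www.example.com/"),
    1 => array("http://www.example.com/a", "http://www.example.com/b"),
    2 => array("http://www.example.com/a/1"),
);

// Outer loop: one pass per penetration level
for ($penetration_level = 1; $penetration_level <= $MAX_PENETRATION; $penetration_level++) {
    // Assumption: each pass handles the pages stored one level down
    $previous_level = $penetration_level - 1;

    // Inner loop: one payload call per page stored at that level
    for ($xx = 0; $xx < count($spider_array[$previous_level]); $xx++) {
        download_images_for_page_stub($spider_array[$previous_level][$xx]);
    }
}

// List the pages that were handed to the payload
print_r($visited);
```

Under these assumptions the payload receives the pages stored at levels 0 and 1, but not those stored at level 2. Whether the deepest level should also be processed depends on how the earlier spider fills $spider_array, so check that script before relying on this indexing.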