Searching with Signatures

Most signatures can be represented as regular expressions, which can then be used to search mail files or directories of web pages. The Unix grep command can scan both, as long as you take care to escape any characters that have special meaning to it. grep is a very efficient way to identify files that contain a match, and it can report the lines and line numbers where the matches are found. But in the case of email files, what you really need is a way to extract the individual messages that match, and grep cannot do this for you.
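As a rough illustration of the grep approach, the commands below build a small throwaway directory of sample pages and search it. The directory, the file names, and the sample signature string are placeholders chosen for this sketch, and the -r option, although widely available, is not part of every grep implementation.

```shell
# Build a tiny sample tree so the commands can be tried safely
# (directory and file names are placeholders for this sketch)
workdir=$(mktemp -d)
mkdir "$workdir/pages"
printf '%s\n' '<a href="index.php?MfcISAPICommand=SignInFPP">Sign in</a>' \
    > "$workdir/pages/login.html"
printf '%s\n' '<p>Nothing of interest</p>' > "$workdir/pages/about.html"

# -r recurses into the directory, -l lists only the names of matching files
matching_files=$(grep -rl 'MfcISAPICommand' "$workdir/pages")
printf '%s\n' "$matching_files"

# -n prefixes each matching line with its line number
grep -n 'MfcISAPICommand' "$workdir/pages/login.html"

# A dot is special in a regular expression, so escape it with a
# backslash, or use -F to treat the whole pattern as a literal string
escaped_hits=$(grep -c 'index\.php?MfcISAPICommand' "$workdir/pages/login.html")
literal_hits=$(grep -cF 'index.php?MfcISAPICommand' "$workdir/pages/login.html")

# Remove the sample tree
rm -rf "$workdir"
```

The -F form is often the simpler choice when the signature is a fixed string such as a fragment of a URL, since nothing in it needs to be escaped.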

Most email client programs allow you to search the content of messages, but these can be laborious to use and may not offer the flexibility that you need. The Perl script shown in Example 10-1 steps through each message in a mail file in standard MBOX format and outputs those that contain one or more matches to a user-specified pattern.

Example 10-1. extract_match_string.pl

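#!/usr/bin/perl -w
# Print every message in an MBOX file that matches a pattern
use strict;

if(@ARGV == 0 or @ARGV > 2) {
    die "Usage: $0 <pattern> [<mail file>]\n";
} elsif(@ARGV == 1) {
    $ARGV[1] = '-';    # no file given, so read standard input
}
my $pattern = $ARGV[0];
my $flag = 0;          # non-zero once the current message has matched
my $separator = 0;     # 1 if the previous line was blank
my $text = '';         # lines of the current message

# In the two-argument form of open, '-' means standard input
open INPUT, "< $ARGV[1]" or die "$0: Unable to open file $ARGV[1]: $!\n";
while(<INPUT>) {
    # A new message starts with a 'From ' line that follows a blank
    # line and ends with a four-digit year
    if(/^From\s.*\d{4}$/ and $separator == 1) {
        $separator = 0;
        if($flag) { # print previous message if it matched
            print $text;
            $flag = 0;
        }
        $text = '';
    } elsif(/^\s*$/) {
        $separator = 1;
    } else {
        $separator = 0;
        if(/$pattern/) {
            $flag++;
        }
    }
    $text .= $_;
}
if($flag) { # print the last message if it matched
    print $text;
}
close INPUT;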

The output is itself a mail file that can be processed further. By examining the output messages, you can confirm that the signature pattern really is specific to this type of message. Depending on the results, you might want to refine the signature and repeat the process.

For example, I chose the string MfcISAPICommand, taken from a URL contained in an email message for a fake Washington Mutual bank site. Running the script with this pattern on my Junk mail folder yielded six messages that matched. Looking at the URLs contained therein, it was clear that this signature linked three Washington Mutual phishing attempts with another three that had eBay as their target. This simple search has resulted in two apparently different scams being linked:

    http://200.93.65.167/.Wamu/index.php?MfcISAPICommand=SignInFPP
    &UsingSSL=1&email=&userid=
    http://210.3.2.101/.wamusk/index.php?MfcISAPICommand=SignInFPP
    &UsingSSL=1&email=&userid=
    http://200.207.131.33:81/mutualsk/index.php?MfcISAPICommand=SignInFPP
    &UsingSSL=1&email=&userid=

    http://213.22.143.6/.eBay/eBayISAPI.php?MfcISAPICommand=SignInFPP
    &UsingSSL=1&email=&userid=
    http://61.211.238.165/aw-cgi/eBayISAPI.php?MfcISAPICommand=SignInFPP
    &UsingSSL=1&email=&userid=
    http://ns.zonasiete.org/.eBay/eBayISAPI.php?MfcISAPICommand=SignInFPP
    &UsingSSL=1&email=&userid=
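
One way to make the link between the two groups explicit is to reduce each URL to its script name and to its query string and compare the results. The sketch below does this with standard Unix tools. It assumes that the wrapped lines in the listing above have been rejoined into single URLs, and the variable names are only illustrative.

```shell
# The six URLs from the listing above, one per line, with the
# wrapped lines rejoined (an assumption of this sketch)
urls='http://200.93.65.167/.Wamu/index.php?MfcISAPICommand=SignInFPP&UsingSSL=1&email=&userid=
http://210.3.2.101/.wamusk/index.php?MfcISAPICommand=SignInFPP&UsingSSL=1&email=&userid=
http://200.207.131.33:81/mutualsk/index.php?MfcISAPICommand=SignInFPP&UsingSSL=1&email=&userid=
http://213.22.143.6/.eBay/eBayISAPI.php?MfcISAPICommand=SignInFPP&UsingSSL=1&email=&userid=
http://61.211.238.165/aw-cgi/eBayISAPI.php?MfcISAPICommand=SignInFPP&UsingSSL=1&email=&userid=
http://ns.zonasiete.org/.eBay/eBayISAPI.php?MfcISAPICommand=SignInFPP&UsingSSL=1&email=&userid='

# Strip the query string and the leading path to leave the script
# name, then count how many URLs share each name
summary=$(printf '%s\n' "$urls" |
    sed -e 's/?.*$//' -e 's|^.*/||' |
    sort | uniq -c | sed 's/^ *//')
printf '%s\n' "$summary"

# Keep only the query strings and count how many distinct ones there are
distinct_queries=$(printf '%s\n' "$urls" |
    sed 's/^[^?]*?//' | sort -u | wc -l | tr -d ' ')
printf 'distinct query strings: %s\n' "$distinct_queries"
```

The script names differ between the two targets, but all six URLs carry the same query string, which is what ties the two sets of messages together.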