Hack #116. Spellcheck All Your Listings

Implement passive, configurable spellchecking to create correctly spelled listings in less time.

The success of any auction is largely due to how readily it can be found in eBay searches. As described in Chapter 2, eBay searches show only exact matches (with very few exceptions), which means, among other things, that spelling most definitely counts.

Turbo Lister and eBay's Sell Your Item form have spellcheck features, both of which use the old-school, manual approach that forces you to interrupt your work to review each individual mistake. This hack streamlines the process by summarizing the spelling errors in all your listings in one place.

The following script requires the HTML::TreeBuilder, HTML::FormatText, and Lingua::Ispell Perl modules, as well as the external ispell program.

Here's the script:

    #!/usr/bin/perl
    require 'ebay.pl';

    require HTML::TreeBuilder;
    require HTML::FormatText;
    use Lingua::Ispell qw( spellcheck );
    Lingua::Ispell::allow_compounds(1);

    $out1 = "";
    $outall = "";
    $numchecked = 0;
    $numfound = 0;

    # Retrieve auctions that started within the last 24 hours
    $today = &formatdate(time);
    $yesterday = &formatdate(time - 86400);

    my $page_number = 1;
    PAGE:
    while (1) {
        my $rsp = call_api({ Verb => 'GetSellerList',
                      DetailLevel => 0,
                           UserId => $user_id,
                    StartTimeFrom => $yesterday,
                      StartTimeTo => $today,
                       PageNumber => $page_number
        });

        if ($rsp->{Errors}) {
            print_error($rsp);
            last PAGE;
        }
        foreach (@{$rsp->{SellerList}{Item}}) {
            my %i = %$_;
            $id = @i{qw/Id/};

            # Skip any item that already has a file in $localdir
            if (! -e "$localdir/$id") {
                my $rsp = call_api({ Verb => 'GetItem',
                              DetailLevel => 2,
                                       Id => $id
                });

                if ($rsp->{Errors}) {
                    print_error($rsp)
                } else {
                    my %i = %{$rsp->{Item}[0]};
                    my ($title, $description) = @i{qw/Title Description/};

                    # Combine title and description, then strip the HTML
                    $spellthis = $title . " " . $description;
                    $tree = HTML::TreeBuilder->new_from_content($spellthis);
                    $formatter = HTML::FormatText->new();
                    $spellthat = $formatter->format($tree);
                    $tree = $tree->delete;

                    # Record each word that ispell flags
                    for my $r ( spellcheck( $spellthat ) ) {
                        if ( $r->{'type'} eq 'miss' ) {
                            $out1 = $out1."'$r->{'term'}'";
                            $out1 = $out1." - near misses: @{$r->{'misses'}}\n";
                            $numfound++;
                        }
                        elsif ( $r->{'type'} eq 'guess' ) {
                            $out1 = $out1."'$r->{'term'}'";
                            $out1 = $out1." - guesses: @{$r->{'guesses'}}\n";
                            $numfound++;
                        }
                        elsif ( $r->{'type'} eq 'none' ) {
                            $out1 = $out1."'$r->{'term'}'";
                            $out1 = $out1." - no match.\n";
                            $numfound++;
                        }
                    }

                    # Add this auction's errors to the overall report
                    $numchecked++;
                    if ($out1 ne "") {
                        $outall = $outall."Errors in #$id '$title':\n";
                        $outall = $outall."$out1\n\n";
                        $out1 = "";
                    }
                }
            }
        }
        last PAGE unless $rsp->{SellerList}{HasMoreItems};
        $page_number++;
    }

    print "$numfound spelling errors found in $numchecked auctions:\n\n";
    print "$outall\n";

This script is based on the one in "Automatically Keep Track of Items You've Sold" [Hack #112], but it has a few important additions and changes.

First, instead of listing recently completed auctions, the GetSellerList API call retrieves auctions that started in the last 24 hours. This works well if you run the script by hand once a day or schedule it [Hack #21] to run every 24 hours (say, at 3:00 P.M.).
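The 24-hour window can be tried out on its own with the short sketch below. The names format_gmt and listing_window are hypothetical, and the timestamp layout is an assumption; the real script relies on formatdate from ebay.pl, whose output format is not shown here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical stand-in for formatdate() from ebay.pl; the timestamp
# layout used here (GMT, "YYYY-MM-DD HH:MM:SS") is an assumption.
sub format_gmt {
    my ($epoch) = @_;
    return strftime("%Y-%m-%d %H:%M:%S", gmtime($epoch));
}

# Return the start and end of a window that ends at $now and reaches
# back $hours hours.
sub listing_window {
    my ($now, $hours) = @_;
    my $from = $now - $hours * 3600;
    return (format_gmt($from), format_gmt($now));
}

my ($from, $to) = listing_window(time, 24);
print "StartTimeFrom: $from\n";
print "StartTimeTo:   $to\n";
```

If you schedule the script at a different interval, keeping the window the same length as that interval should mean each new listing is picked up by one run.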

Second, since you want the auction descriptions, you need to use the GetItem API call for each auction you spellcheck. This means that spellchecking a dozen auctions requires 13 API calls: one call to retrieve the list, and one for each auction.
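The arithmetic is easy to sketch if you want to estimate API usage before running the script. The helper name api_calls_needed is made up for illustration, and it assumes one GetSellerList call per page of results, as in the script's paging loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Estimate the API calls for one run: one GetSellerList call per page
# of results, plus one GetItem call per auction.
sub api_calls_needed {
    my ($auctions, $pages) = @_;
    $pages = 1 unless defined $pages;    # assume a single page by default
    return $pages + $auctions;
}

print api_calls_needed(12), "\n";    # a dozen auctions on one page: prints 13
```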

The code responsible for the spellcheck begins where the title and description are concatenated into a single variable, $spellthis, so that only one spellcheck is necessary for each auction. Next, the HTML::TreeBuilder and HTML::FormatText modules convert any HTML-formatted text to plain text, which is stored in $spellthat.
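To see what the HTML-to-text step produces without calling the eBay API, you can run the same two modules on a sample string. This is a minimal sketch: the helper name html_to_text, the margin settings, and the sample listing text are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Convert an HTML fragment to plain text, much as the script does
# before handing the text to the spellchecker.
sub html_to_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
    my $text = $formatter->format($tree);
    $tree->delete;    # free the parse tree
    return $text;
}

# Sample listing text (made up for illustration)
my $title = "Vintage Camra";
my $description =
    "<p>A <b>beautifull</b> 35mm camera in <i>excellent</i> condition.</p>";

print html_to_text("$title $description");
```

Stripping the markup first keeps tag and attribute names from being reported as misspellings.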

Finally, the Lingua::Ispell module uses the external ispell program to perform a spellcheck on $spellthat (the cleaned-up version of $spellthis). As errors are found, suggestions are recorded into the $out1 variable, which is merged with $outall and displayed when the spellcheck is complete.
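The reporting logic can be exercised without ispell installed by feeding it hand-made results. The sketch below is one possible restructuring, not the script's own code: it assumes results shaped like those Lingua::Ispell returns (a hash with type and term, plus misses or guesses where applicable), and the names %describe and report_lines are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map each result type that signals a problem to a routine that
# describes it.
my %describe = (
    miss  => sub { "near misses: @{ $_[0]{misses} }" },
    guess => sub { "guesses: @{ $_[0]{guesses} }" },
    none  => sub { "no match." },
);

# Build report lines from a list of spellcheck results; results of any
# other type (such as 'ok') are skipped.
sub report_lines {
    my @lines;
    for my $r (@_) {
        my $fmt = $describe{ $r->{type} } or next;
        push @lines, "'$r->{term}' - " . $fmt->($r);
    }
    return @lines;
}

# Hand-made sample results (not real ispell output)
my @results = (
    { type => 'ok',   term => 'camera' },
    { type => 'miss', term => 'beautifull',
      misses => [ 'beautiful', 'beautifully' ] },
    { type => 'none', term => 'xqzt' },
);

print "$_\n" for report_lines(@results);
```

A lookup table like this makes it simple to ignore a result type: delete its entry and those words no longer appear in the report.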

Here are a few things you might want to do with this script: