In the previous chapters, I introduced the main tools and techniques of Internet forensics that you will use all the time in your own explorations. But I am a firm believer that you can never have too many tools, so this chapter presents a miscellany of techniques that you may want to keep on hand for that special occasion.
These are the one-of-a-kind tools that, in the real world, you would find rattling around in the bottom of your toolbox among the orphaned nuts and bolts and the blunt drill bits. They are the sort of thing that you don’t need very often, but when the occasion arises, they are just right for the job.
Knowing where in the world someone is located is very valuable information. In Chapter 2, I talked about how you can infer the location of a computer from its IP address and the whois record for its domain name. I also explained how many of those records contain bogus contact information that is placed there to deceive.
To recap those points, you can use the whois command with an IP address to find out the network block that contains a specific machine. This should specify the country and may be able to define the region or even the city in which it is located. Using dig -x on the IP address may return a different hostname than you started with, especially if it hosts multiple web servers. The canonical name that DNS returns for the host may contain clues about its location.
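If you want to script the reverse lookup rather than run dig -x by hand, the following Python sketch shows one way to do it. The function names are my own, and the lookup simply asks whatever resolver your system is configured to use, so treat it as an illustration rather than a replacement for dig.

```python
import ipaddress
import socket


def reverse_name(ip):
    """Return the PTR query name for an address.

    This is the name that a reverse lookup such as `dig -x` asks DNS about.
    """
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Return the hostname registered for ip, or None if the lookup fails.

    Requires network access and a working resolver.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # No reverse mapping record, or the resolver could not be reached.
        return None
```

Calling reverse_name("68.251.56.245") returns the string 245.56.251.68.in-addr.arpa, which is the record name that the reverse query targets. reverse_lookup returns None when no reverse mapping exists, which is a common outcome for the kind of hosts discussed here.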
If the host lies within a country-specific domain, such as .uk or .fr, then you can tell right away with which country the server is associated. But be aware that smaller countries with interesting domain suffixes will sell domains to anyone, allowing them to locate those servers anywhere in the world. One example is the island nation of Tonga in the South Pacific, which manages domains with the .to suffix.
Inferring location at a higher level of resolution requires a certain amount of local knowledge. Take these three hostnames, for example:
In the first of these, working right to left, you can infer that this host is in the United States from the us component of the name. OH is the abbreviation for Ohio, so the oh part suggests a location in that state, and columbus1 implies that this machine is located in the city of Columbus. The other two examples require a little more intuition. Placing ameritech in the United States is pretty obvious and can be easily verified. chcgil and sbndin might be hard to decode if you had a single example only. But having the pair helps reveal that chcgil means Chicago, Illinois, and sbndin means South Bend, Indiana.
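The decoding described above can be sketched in a few lines of Python. The two lookup tables hold only the codes mentioned in this section, and the rule that a label is a city code followed by a two-letter state code is my assumption about the naming convention, not something the network operator documents. A real tool would need much larger tables and a good deal of local knowledge.

```python
# Only the codes discussed in the text; extend these for real use.
STATES = {"oh": "Ohio", "il": "Illinois", "in": "Indiana"}
CITY_CODES = {"chcg": "Chicago", "sbnd": "South Bend"}


def decode_label(label):
    """Split a label such as 'chcgil' into (city, state), or return None.

    Assumes the last two letters are a state code and the rest a city code.
    """
    label = label.lower()
    city_code, state_code = label[:-2], label[-2:]
    if city_code in CITY_CODES and state_code in STATES:
        return CITY_CODES[city_code], STATES[state_code]
    return None


def location_hints(hostname):
    """Collect location clues from each dot-separated label of a hostname.

    Caveat: a bare label like 'in' is treated as a state here, although
    it is also a country code, so the hints need human review.
    """
    hints = []
    for label in hostname.lower().split("."):
        if label in STATES:
            hints.append(STATES[label])
        else:
            decoded = decode_label(label)
            if decoded:
                hints.append(", ".join(decoded))
    return hints
```

For example, decode_label("chcgil") returns the pair ("Chicago", "Illinois"), and a label that matches neither table returns None, which is the signal that you need to fall back on intuition.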
But bear in mind that reverse DNS lookups will only work if the machine has been given a hostname and a reverse mapping record has been added to the DNS tables. The reason many fraud-related sites use a numeric IP address in their URLs is to make it difficult for anyone to locate their server.
Even if the name of a computer is uninformative, you may be able to infer location from the names of the routers that link your system to theirs. traceroute will list those names as it builds its path. These routers are often given informative names by the companies that operate network backbones because this can help them debug system problems. The following block of traceroute output shows the path from my system in Seattle to one in Chicago:
% traceroute 68.251.56.245
[...]
11 ex1-p9-0.eqsjca.sbcglobal.net (151.164.191.201)
12 bb1-p6-0.crsfca.sbcglobal.net (151.164.41.9)
13 core1-p5-0.crsfca.sbcglobal.net (151.164.243.1)
14 core1-p5-0.crskut.sbcglobal.net (151.164.42.11)
15 core1-p2-0.crdnco.sbcglobal.net (151.164.243.242)
16 core1-p5-0.crkcmo.sbcglobal.net (151.164.42.23)
17 core2-p5-0.crchil.sbcglobal.net (151.164.191.199)
It is easy to figure out the naming convention of these routers if you know a little about U.S. geography. They reveal that packets are being routed through San Jose (sj) in California, then through San Francisco (sf), Salt Lake City (sk) in Utah, Denver (dn) in Colorado, Kansas City (kc) in Missouri, and finally to Chicago (ch) in Illinois. Now, if I had to analyze a route within, say, Japan or China, then I might not be quite as successful. But the technique can be useful in a surprising number of cases.
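Once you have worked out a convention like this one, you can apply it mechanically. The Python sketch below assumes that each router label is a two-letter prefix, a two-letter city code, and a two-letter state code, which is my reading of the names in the listing rather than a documented scheme. The city table contains only the codes decoded above, and the sample text is taken from the same traceroute output.

```python
import re

# City codes decoded by hand from the router names in the listing.
CITY_CODES = {
    "sj": "San Jose",
    "sf": "San Francisco",
    "sk": "Salt Lake City",
    "dn": "Denver",
    "kc": "Kansas City",
    "ch": "Chicago",
}

# interface.PPCCSS.sbcglobal.net: prefix, city code, state code.
ROUTER_NAME = re.compile(
    r"[\w-]+\.([a-z]{2})([a-z]{2})([a-z]{2})\.sbcglobal\.net"
)

SAMPLE = """\
ex1-p9-0.eqsjca.sbcglobal.net (151.164.191.201)
bb1-p6-0.crsfca.sbcglobal.net (151.164.41.9)
core1-p5-0.crsfca.sbcglobal.net (151.164.243.1)
core1-p5-0.crskut.sbcglobal.net (151.164.42.11)
core1-p2-0.crdnco.sbcglobal.net (151.164.243.242)
core1-p5-0.crkcmo.sbcglobal.net (151.164.42.23)
core2-p5-0.crchil.sbcglobal.net (151.164.191.199)
"""


def router_cities(traceroute_output):
    """Return the sequence of 'City, ST' stops, merging repeated hops."""
    cities = []
    for line in traceroute_output.splitlines():
        match = ROUTER_NAME.search(line)
        if not match:
            continue
        _prefix, city_code, state_code = match.groups()
        # Unknown codes are kept, flagged with '?', for manual decoding.
        city = CITY_CODES.get(city_code, "?" + city_code)
        entry = f"{city}, {state_code.upper()}"
        if not cities or cities[-1] != entry:
            cities.append(entry)
    return cities
```

Running router_cities(SAMPLE) yields the six stops from San Jose to Chicago in order. Any other backbone provider will use a different pattern, so the regular expression and the table both need to be rebuilt for each network you study.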
In principle, the address information contained in a whois record for a domain should be accurate. But the registries do not validate contact information, and you can presume that domains associated with any form of scam will contain false information. In other cases, however, you might want to check a block of contact information rather than disregarding it on principle.
There are three scenarios. The address could be correct and completely legitimate. It could be a real address that belongs to some random person with no connection to this domain. Or it could be a completely fictitious address. You can use the resources of the Web to help prove or disprove the third of these options.
For example, here is part of the listing for a domain used to host a fake eBay site. It looks like a legitimate address:
Admin Name........ Robert R.
Admin Address..... 1410 S. 12th St
Admin Address..... Philadelphia
Admin Address..... 19147
Admin Address..... PA
Admin Address..... UNITED STATES
Admin Phone....... +1.609892xxxx
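If you need to check many records of this kind, it helps to pull the fields into a structure first. The following Python sketch assumes the layout shown in this listing, with a field name, a run of dots, and then the value on each line. Other registrars format their records differently, so the pattern is specific to this example.

```python
import re
from collections import defaultdict

# Field name, then two or more dots as a leader, then the value.
FIELD = re.compile(r"^([^.]+?)\.{2,}\s*(.*)$")

SAMPLE = """\
Admin Name........ Robert R.
Admin Address..... 1410 S. 12th St
Admin Address..... Philadelphia
Admin Address..... 19147
Admin Address..... PA
Admin Address..... UNITED STATES
Admin Phone....... +1.609892xxxx
"""


def parse_record(text):
    """Return a dict mapping each field name to its list of values.

    Repeated fields, such as the address lines, keep their order.
    """
    fields = defaultdict(list)
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            key, value = match.groups()
            fields[key.strip()].append(value.strip())
    return dict(fields)
```

With the record in this form, parse_record(SAMPLE)["Admin Address"] gives the five address lines as a list, ready to paste into a mapping site or to compare against the phone number.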
Entering that into any of the major mapping web sites, such as MapQuest (http://www.mapquest.com), shows that this is a real address. Searching for the person’s name in this city using http://people.yahoo.com, or a similar service, shows that a person by this name does live in this city. I have chosen to truncate the last name to protect their privacy. The address returned does not match the one in the record, but looking through Google search results, I can see a clear connection between this person and this very specific neighborhood of Philadelphia. Perhaps the person used to live at the address in this record.
The piece of data that does not fit properly here is the phone number. The first three digits in a U.S. phone number, after the 1, define the area code. There are plenty of online directories of area codes that will give you the approximate location of that number. In this case, 609 covers Trenton and much of central and southern New Jersey. Trenton is very close to Philadelphia but is a distinct city in a different state. This is not a phone number for a traditional land line in Philadelphia, so that looks wrong. The fact that some people use mobile phones exclusively, coupled with the emergence of Internet telephony, means that it is becoming harder to rely on area codes as a measure of location. But these exceptions are still the minority.
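The area code comparison is simple enough to automate. This Python sketch uses a deliberately tiny table with just the two codes relevant to this example (215 for Philadelphia and 609 for New Jersey). In practice you would load a complete directory of area codes, and you should treat a mismatch as a hint, not proof, for the reasons just given.

```python
import re

# Illustrative table only: area code -> state abbreviation.
AREA_CODES = {"215": "PA", "609": "NJ"}


def area_code(phone):
    """Return the three digits after the leading 1, or None if absent."""
    match = re.match(r"\+?1\D*(\d{3})", phone.strip())
    return match.group(1) if match else None


def phone_matches_state(phone, state):
    """Compare a phone number's area code with the state in the record.

    Returns True or False when the code is in the table, and None when
    the code is missing or unknown.
    """
    expected = AREA_CODES.get(area_code(phone))
    if expected is None:
        return None
    return expected == state.upper()
```

For the record above, phone_matches_state("+1.609892xxxx", "PA") returns False, which flags the same inconsistency that I spotted by eye.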
Given the importance of location in commerce, government, and so on, you would think that some enterprising company such as Yahoo! or Google would have built a database that maps IP addresses to cities or regions. Such databases have been built by several companies and research groups, but none of these seems to be very good. The problem is that there is no automated way to generate the location from the IP address. Several efforts have encouraged individuals to register their IP address and physical location in a database, but the amount of data submitted has been disappointing, and none of the services that I have tried produces an accurate result.