The crawler

The crawler component is the heart of the search engine. It operates on a set of links that are either seeded into the system or discovered while crawling a previous set of links. As the preceding component model diagram shows, the crawler is a package that encapsulates several sub-components arranged as a pipeline. Let's examine the role of each of those sub-components in more detail.
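Before looking at the individual sub-components, it may help to see the overall flow as a minimal sketch. It is an illustration under stated assumptions, not the actual crawler implementation: the stage names (`fetch`, `extract_links`), the payload layout, and the in-memory `FAKE_WEB` table that stands in for the network are all hypothetical. It shows only the two ideas described above: links enter the system either as seeds or by being discovered during a crawl, and each link passes through a sequence of stages.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical stand-in for the network: URL -> links found on that page.
FAKE_WEB: Dict[str, List[str]] = {
    "https://example.com/a": ["https://example.com/b", "https://example.com/c"],
    "https://example.com/b": ["https://example.com/c", "https://example.com/d"],
    "https://example.com/c": [],
}

# A stage receives a payload and returns it (possibly modified),
# or None to drop the link from the pipeline.
Stage = Callable[[dict], Optional[dict]]


def fetch(payload: dict) -> Optional[dict]:
    """Hypothetical fetch stage: drop links that cannot be retrieved."""
    if payload["url"] not in FAKE_WEB:
        return None
    payload["outgoing"] = FAKE_WEB[payload["url"]]
    return payload


def extract_links(payload: dict) -> Optional[dict]:
    """Hypothetical extraction stage: record the links found on the page."""
    payload["links"] = list(payload["outgoing"])
    return payload


def run_pipeline(payload: dict, stages: List[Stage]) -> Optional[dict]:
    """Pass the payload through every stage in order, stopping if one drops it."""
    for stage in stages:
        result = stage(payload)
        if result is None:
            return None
        payload = result
    return payload


def crawl(seeds: Iterable[str], stages: List[Stage], max_links: int = 100) -> List[str]:
    """Process seeded links first, then any links discovered along the way."""
    seeds = list(seeds)
    queue = deque(seeds)
    seen = set(seeds)
    processed: List[str] = []
    while queue and len(processed) < max_links:
        url = queue.popleft()
        result = run_pipeline({"url": url}, stages)
        if result is None:
            continue  # the link was dropped by one of the stages
        processed.append(url)
        for link in result["links"]:
            if link not in seen:  # feed newly discovered links back in
                seen.add(link)
                queue.append(link)
    return processed


if __name__ == "__main__":
    print(crawl(["https://example.com/a"], [fetch, extract_links]))
```

Starting from the single seed `https://example.com/a`, the sketch processes pages `a`, `b`, and `c`, and drops `d` because the hypothetical fetch stage cannot retrieve it. A real crawler would run its stages concurrently and persist the link set rather than keep it in memory.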