urllib2 provides an extended, extensible interface to web resources. urllib2's application-level interface is essentially identical to urllib's urlopen( ) function (Section 42.5). Underneath, however, urllib2 explicitly supports proxies, caching, basic and digest authentication, and so forth.
urllib2 uses an Opener, made up of a series of Handlers, to open a URL; if you know you want to use a particular set of features, you tell urllib2 which Handlers to use before you call urlopen( ). urllib2 is extensible largely because if you need to deal with some odd set of interactions, you can write a Handler object to deal with just those interactions and incorporate it into an Opener with existing Handlers. This allows you to deal with complex behavior by just combining very simple sets of code.
For example, to retrieve a web resource that requires basic authentication over a secure socket connection:
>>> import urllib2
>>> authHandler = urllib2.HTTPBasicAuthHandler( )
>>> # Arguments: realm, URL prefix, username, password
>>> authHandler.add_password("private", "https://www.domain.com/private",
...                          "user", "password")
>>> # Build an Opener around the Handler and make it the default for urlopen( )
>>> opener = urllib2.build_opener(authHandler)
>>> urllib2.install_opener(opener)
>>> resource = urllib2.urlopen("https://www.domain.com/private/foo.html")
>>> print resource.read( )
To implement a new Handler, you simply subclass urllib2.BaseHandler and implement the methods appropriate to the behavior you want to handle.
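As a rough illustration of that subclassing step, here is a minimal sketch of a Handler for a made-up "echo" URL scheme. The scheme and the EchoHandler name are hypothetical and exist only for this example. It relies on the fact that an Opener looks for a method named after the URL scheme plus _open on each of its Handlers. The fallback import is an assumption added so the sketch also runs on Python 3, where the same classes live in urllib.request.

```python
import io

try:
    import urllib2                      # Python 2, as used in the text
except ImportError:
    import urllib.request as urllib2    # Python 3 location of the same classes


class EchoHandler(urllib2.BaseHandler):
    """Hypothetical Handler for an invented "echo" URL scheme."""

    def echo_open(self, req):
        # The Opener calls <scheme>_open on its Handlers, so this method
        # is tried for every URL that begins with "echo:".
        text = req.get_full_url()[len("echo://"):]
        # Return a file-like object, which is what callers of urlopen( )
        # expect to be able to read( ) from.
        return io.BytesIO(text.encode("ascii"))


# Combine the new Handler with the default ones and open a URL with it.
opener = urllib2.build_opener(EchoHandler())
resource = opener.open("echo://hello")
body = resource.read()
print(body)
```

No network access is needed here, because the Handler builds its response from the URL itself; a real Handler would typically do its protocol-specific work inside the same method.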
— DJPH