Our query images will come from a web search. Before we implement the search functionality, let's write some helper functions that fetch images via the Requests library and convert them to an OpenCV-compatible format. Because this functionality is highly reusable, we will put it in a module of static utility functions. Let's create a file called RequestsUtils.py and import OpenCV, NumPy, and Requests, along with the standard library's sys module for error logging, as follows:
import numpy
import cv2
import requests
import sys
As a global variable, let's store HEADERS, a dictionary of headers that we will use while making web requests. Some servers reject requests that appear to come from a bot. To improve the chance of our requests being accepted, let's set the 'User-Agent' header to a value that mimics a web browser, as follows:
# Spoof a browser's User-Agent string.
# Otherwise, some sites will reject us as a bot.
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; '
                  'rv:25.0) Gecko/20100101 Firefox/25.0'
}
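To see what the spoofed header changes, the following minimal sketch prepares a request without sending it and compares the header we supply with the default that Requests would otherwise use. The variable names (prepared, sentUserAgent, defaultUserAgent) are illustrative only and are not part of RequestsUtils.py.

```python
import requests

# Same dictionary as in RequestsUtils.py, repeated so the sketch runs alone.
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; '
                  'rv:25.0) Gecko/20100101 Firefox/25.0'
}

# Build the request object but do not send it over the network.
prepared = requests.Request(
    'GET', 'http://nummist.com/images/ceiling.gaze.jpg',
    headers=HEADERS).prepare()

# The header that a server would receive from us.
sentUserAgent = prepared.headers['User-Agent']

# The header that Requests uses when we do not override it.
defaultUserAgent = requests.utils.default_user_agent()

print(sentUserAgent)
print(defaultUserAgent)
```

The default value identifies the client as python-requests, which is the kind of string that a server might use to detect a bot.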
Whenever we receive a response to a web request, we want to check whether the status code is 200 OK. This is only a cursory test of whether the response is valid, but it is a good enough test for our purposes. We will implement this test in the following function, validateResponse, which returns True if the response is deemed valid; otherwise it logs an error message and returns False:
def validateResponse(response):
    statusCode = response.status_code
    if statusCode == 200:
        return True
    url = response.request.url
    # Log to stderr in a way that works in both Python 2 and Python 3.
    sys.stderr.write(
        'Received unexpected status code (%d) when requesting %s\n' %
        (statusCode, url))
    return False
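The two attributes that validateResponse reads, status_code and request.url, can be inspected without touching the network. The sketch below builds Response objects by hand; makeFakeResponse is a hypothetical helper introduced only for this illustration, and the example.com URLs are placeholders.

```python
import requests


def makeFakeResponse(statusCode, url):
    # Hypothetical helper: assemble a Response by hand instead of
    # receiving one from a server.
    response = requests.Response()
    response.status_code = statusCode
    response.request = requests.Request('GET', url).prepare()
    return response


okResponse = makeFakeResponse(200, 'http://example.com/found.jpg')
missingResponse = makeFakeResponse(404, 'http://example.com/missing.jpg')

# requests.codes.ok is a named constant for the status code 200.
okIsValid = okResponse.status_code == requests.codes.ok
missingIsValid = missingResponse.status_code == requests.codes.ok

# The URL that would appear in the logged error message.
failedUrl = missingResponse.request.url
```

Passing objects like these to validateResponse is one way to exercise both of its branches offline.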
With the help of HEADERS and validateResponse, we can try to get an image from a URL and return that image in an OpenCV-compatible format (failing that, we will return None). As an intermediate step, we will read raw data from a web response into a NumPy array using a function called numpy.frombuffer (older code used numpy.fromstring, which NumPy has deprecated for binary data). We will then interpret this data as an image using a function called cv2.imdecode. Here is our implementation of a function called cvImageFromUrl that accepts a URL as an argument:
def cvImageFromUrl(url):
    response = requests.get(url, headers=HEADERS)
    if not validateResponse(response):
        return None
    # View the raw bytes as a 1D array of unsigned 8-bit integers.
    imageData = numpy.frombuffer(response.content, numpy.uint8)
    # Decode the bytes as a 3-channel BGR image. cv2.IMREAD_COLOR
    # replaces cv2.CV_LOAD_IMAGE_COLOR, which was removed in OpenCV 3.
    image = cv2.imdecode(imageData, cv2.IMREAD_COLOR)
    if image is None:
        sys.stderr.write(
            'Failed to decode image from content of %s\n' % url)
    return image
To test these two functions, let's give RequestsUtils.py a main function that downloads an image from the web, converts it to an OpenCV-compatible format, and writes it to the disk using an OpenCV function called imwrite. This is covered in the following implementation:
def main():
    image = cvImageFromUrl(
        'http://nummist.com/images/ceiling.gaze.jpg')
    if image is not None:
        cv2.imwrite('image.png', image)

if __name__ == '__main__':
    main()
To confirm that everything worked, open image.png (which should be in the same directory as RequestsUtils.py) and compare it to the online image, which you can view in a web browser at http://nummist.com/images/ceiling.gaze.jpg.
Although we are putting a simple test of our RequestsUtils module in a main function, a more sophisticated and maintainable approach to writing tests in Python is to use the classes in the unittest module of the standard library. Refer to the official documentation for more information: https://docs.python.org/2/library/unittest.html.
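As a rough idea of what a unittest-based test might look like, here is a minimal sketch that checks validateResponse without any network access. The function is copied into the sketch (using sys.stderr.write so that it runs on both Python 2 and Python 3) so that the example is self-contained; in a real project, the test would import it from RequestsUtils instead. FakeRequest, FakeResponse, ValidateResponseTest, and runTests are hypothetical names introduced only for this example.

```python
import sys
import unittest


def validateResponse(response):
    # Copy of the function from RequestsUtils.py, repeated here so
    # that the sketch runs on its own.
    statusCode = response.status_code
    if statusCode == 200:
        return True
    url = response.request.url
    sys.stderr.write(
        'Received unexpected status code (%d) when requesting %s\n' %
        (statusCode, url))
    return False


class FakeRequest(object):
    """Hypothetical stand-in that exposes only a url attribute."""
    def __init__(self, url):
        self.url = url


class FakeResponse(object):
    """Hypothetical stand-in for a Requests response object."""
    def __init__(self, statusCode, url):
        self.status_code = statusCode
        self.request = FakeRequest(url)


class ValidateResponseTest(unittest.TestCase):

    def testAcceptsStatus200(self):
        response = FakeResponse(200, 'http://example.com/a.jpg')
        self.assertTrue(validateResponse(response))

    def testRejectsStatus404(self):
        response = FakeResponse(404, 'http://example.com/b.jpg')
        self.assertFalse(validateResponse(response))


def runTests():
    # Load and run the test case, returning the result object.
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromTestCase(ValidateResponseTest)
    return unittest.TextTestRunner().run(suite)
```

Calling runTests() runs both test methods and returns a result object whose wasSuccessful method reports whether they all passed. The 404 test also prints the expected error message to stderr.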