Identify a widely used dataset, such as ImageNet, MNIST, or LFW. Research its content and genesis. Who made it, when, and how? Who uses it, and for what? (What biases might this dataset have?) Describe your findings in one or two paragraphs.
Use two different image analysis or classification tools to interpret the same image. Write a paragraph that describes and compares their results.
Use an object recognition library or a classifier to store a list of all the objects it sees. Use it to make a shopping list of everything in your fridge.1
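One way to turn raw detections into a list is sketched below. It assumes the detector returns, for each photo or frame, a list of (label, confidence) pairs; that format, the sample data, and the `min_confidence` threshold are illustrative assumptions, not the output of any particular library.

```python
from collections import Counter

def shopping_list(frames, min_confidence=0.5):
    """Return sorted (label, count) pairs, where count is the largest
    number of that object seen in any single frame."""
    best = Counter()
    for detections in frames:
        # Count confident detections within this one frame.
        in_frame = Counter(
            label for label, conf in detections if conf >= min_confidence
        )
        # Keep the per-frame maximum so the same apple is not counted twice.
        for label, n in in_frame.items():
            best[label] = max(best[label], n)
    return sorted(best.items())

# Hypothetical detector output for two photos of the same fridge.
frames = [
    [("apple", 0.91), ("apple", 0.84), ("milk", 0.77)],
    [("apple", 0.88), ("egg", 0.42), ("milk", 0.80)],
]

for label, count in shopping_list(frames):
    print(f"{count} x {label}")
```

Taking the per-frame maximum rather than a running total is a design choice: it avoids inflating the count when the same item shows up in several frames.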
Combine an object recognition classifier with a text-to-speech library. Write a computer program that narrates what it sees through the webcam.
Touching your face can spread disease. Train a webcam classifier to detect when you touch your face. Write a program that sounds an alarm if you do. (Image: Isaac Blankensmith's ANTI-FACE-TOUCHING MACHINE™, an influential implementation of this concept.)2
Using an image classifier and your computer's camera, train a system that detects your facial expressions and displays corresponding emojis.3
Train an image classifier to determine whether you have raised your left or right hand. Using a “webdriver” (also called a mouse/keyboard automator), write a program in which your classifier controls a classic arcade game, such as Space Invaders, by spoofing presses of the WASD or arrow keys. Examples of webdrivers include the Java Robot class, JavascriptExecutor, and the Selenium browser automation project.
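Classifier output tends to flicker from frame to frame, so it may help to smooth it before sending key presses. The sketch below uses a majority vote over the most recent predictions. The label names, the key names, and the `press` callback are all hypothetical: `press` stands in for whatever call your automation tool provides for sending a key event.

```python
from collections import Counter, deque

# Hypothetical classifier labels mapped to hypothetical key names.
KEY_FOR_LABEL = {"left_hand": "left", "right_hand": "right"}

class KeyController:
    """Press a key only when one label holds a strict majority
    of the most recent predictions."""

    def __init__(self, press, window=5):
        self.press = press                  # callback that sends one key press
        self.recent = deque(maxlen=window)  # sliding window of labels

    def update(self, label):
        self.recent.append(label)
        winner, votes = Counter(self.recent).most_common(1)[0]
        if votes > len(self.recent) // 2 and winner in KEY_FOR_LABEL:
            self.press(KEY_FOR_LABEL[winner])

# Record the presses in a list instead of sending real key events.
pressed = []
controller = KeyController(pressed.append, window=3)
for label in ["left_hand", "left_hand", "right_hand", "right_hand", "none"]:
    controller.update(label)
print(pressed)
```

A larger window gives steadier control at the cost of slower response, which matters in a fast arcade game.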
Use a pose classifier or face tracker to create a program that lets you draw with your nose.4
Train a webcam regressor to produce a number between zero and one, according to the closing and opening of your hand. (Your regressor could use hand pose data from a tracking library, or it could process the camera's pixels directly.)5 Use this number to puppeteer the mouth of a simple cartoon face.
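Before training anything, a hand-built mapping can serve as a baseline to compare a regressor against. The sketch below is not a trained regressor: it assumes a tracking library supplies (x, y) positions for the thumb tip and index fingertip, and that you have measured the distance between them with your hand fully closed and fully open. All names and numbers here are illustrative.

```python
import math

def openness(thumb_tip, index_tip, closed_dist, open_dist):
    """Map the thumb-to-index distance onto the range 0.0 (closed) to 1.0 (open)."""
    d = math.dist(thumb_tip, index_tip)
    t = (d - closed_dist) / (open_dist - closed_dist)
    return max(0.0, min(1.0, t))  # clamp values outside the calibrated range

def mouth_height(value, max_height=80):
    """Scale the 0..1 value to a mouth opening in pixels."""
    return value * max_height

# Hypothetical landmark positions and calibration distances, in pixels.
value = openness((0, 0), (3, 4), closed_dist=1, open_dist=9)
print(value, mouth_height(value))
```

The clamp keeps the cartoon mouth from inverting or overshooting when the hand moves beyond the calibrated range.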
Collect some audio recordings of the ambient sound in your room in the morning, at midday, in the evening, and at night. Train a classifier or regressor with these sounds. Use this system to display an estimate of the time of day.
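One of the simplest classifiers that could be tried here is a nearest-centroid classifier, sketched below. It assumes each recording has already been reduced to a short feature vector; the two features (say, loudness and brightness) and every number in the example are made up for illustration.

```python
import math
from statistics import fmean

def train_centroids(examples):
    """examples maps each label to a list of feature vectors.
    Returns the mean feature vector (centroid) for each label."""
    return {
        label: [fmean(column) for column in zip(*vectors)]
        for label, vectors in examples.items()
    }

def classify(centroids, features):
    """Return the label whose centroid is closest to the features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Hypothetical [loudness, brightness] features for two recordings per period.
examples = {
    "morning": [[0.30, 0.60], [0.34, 0.64]],
    "midday":  [[0.70, 0.50], [0.74, 0.54]],
    "evening": [[0.50, 0.30], [0.54, 0.26]],
    "night":   [[0.10, 0.10], [0.06, 0.14]],
}

centroids = train_centroids(examples)
print(classify(centroids, [0.12, 0.15]))
```

With only four coarse labels, the displayed time can be no more precise than a period of the day; a regressor trained on timestamped recordings would be needed for a finer estimate.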
Collect the last ten text messages that you sent. Use a sentiment analysis tool to assess the mood of each message.6
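To get a feel for what a sentiment tool does, a toy lexicon-based scorer can be written in a few lines. The word lists and sample messages below are invented placeholders; real tools use far larger lexicons or trained models, and their results will differ.

```python
import re

# Tiny, hypothetical word lists for illustration only.
POSITIVE = {"great", "love", "thanks", "happy", "yes"}
NEGATIVE = {"sorry", "late", "hate", "sad", "no"}

def mood(message):
    """Label a message by counting positive and negative words."""
    words = re.findall(r"[a-z']+", message.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Placeholder messages; substitute your own.
for message in ["Thanks, love it!", "Sorry, running late", "See you at 5"]:
    print(f"{mood(message):>8}  {message}")
```

A scorer like this ignores negation, sarcasm, and context, which is worth keeping in mind when comparing it against the tool you choose.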
Create or download a collection of about a thousand images representing a narrow category of subject matter (cats, flowers, yearbook photos). Using a Generative Adversarial Network (GAN), synthesize new images that appear to belong to this dataset.7 (Image: “This Foot Does Not Exist” by the MSCHF collective.)
Clustering a Collection of Images
Generate a 2D map that reveals similarities between images in a set. Begin with an image dataset that interests you. Using some sort of image analysis library, such as a convolutional neural network, calculate high-dimensional numeric descriptions for each image. Simplify these description vectors to just two dimensions using a dimensionality reduction algorithm, such as UMAP or t-SNE. Plot your images in the (x,y) locations produced by this algorithm. Discuss the clusters you observe. (Image: a UMAP plot by Christopher Pietsch showing the OpenMoji emoji collection.)
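The reduction step can be sketched without any external libraries. The example below substitutes principal component analysis (PCA), a simpler linear method, for UMAP or t-SNE, and it uses short made-up descriptor vectors in place of real neural network features. The function name `pca_2d` and all the data are illustrative assumptions.

```python
import math
import random

def pca_2d(vectors, iterations=200, seed=0):
    """Project vectors onto their top two principal components,
    found by power iteration on the covariance matrix."""
    n, dim = len(vectors), len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - means[j] for j in range(dim)] for v in vectors]
    cov = [[sum(r[a] * r[b] for r in centered) / n for b in range(dim)]
           for a in range(dim)]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        w = [rng.uniform(-1, 1) for _ in range(dim)]
        for _ in range(iterations):
            # Multiply by the covariance matrix.
            w = [sum(cov[a][b] * w[b] for b in range(dim)) for a in range(dim)]
            # Remove any part lying along components already found.
            for c in components:
                dot = sum(x * y for x, y in zip(w, c))
                w = [x - dot * y for x, y in zip(w, c)]
            # Normalize; fall back to 1.0 if the vector has collapsed to zero.
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            w = [x / norm for x in w]
        components.append(w)
    return [tuple(sum(x * y for x, y in zip(r, c)) for c in components)
            for r in centered]

# Hypothetical 4-dimensional descriptors for six images.
descriptors = {
    "cat_1": [0.9, 0.8, 0.1, 0.0],
    "cat_2": [0.8, 0.9, 0.2, 0.1],
    "cat_3": [0.9, 0.9, 0.0, 0.1],
    "car_1": [0.1, 0.0, 0.9, 0.8],
    "car_2": [0.0, 0.2, 0.8, 0.9],
    "car_3": [0.1, 0.1, 0.9, 0.9],
}

points = dict(zip(descriptors, pca_2d(list(descriptors.values()))))
for name, (x, y) in points.items():
    print(f"{name}: ({x:+.2f}, {y:+.2f})")
```

The resulting (x, y) pairs can be passed to any plotting tool to position thumbnails. PCA preserves only linear structure, so clusters in a real dataset may look quite different under UMAP or t-SNE.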