Any new data can be passed to the model to get the results. This process of getting the classification results or features from an image is termed as inference. Training and inference usually happen on different computers and at different times. We will learn about storing the model, running the inference, and using TensorFlow Serving as the server with good latency and throughput.