A lot of developers will hear about machine learning, deep learning, or neural networks at some point in their careers. You may have already heard about these topics. If you have, you know that machine learning is a complex field that requires particular domain knowledge. However, machine learning is becoming more prominent and popular by the day, and it is used to improve many different types of applications.
For instance, machine learning can be used to predict what type of content a particular user might like to see in a music app, based on music that they already have in their library, or to automatically tag faces in photos to connect them to people in the user's contact list. It can even be used to predict costs for specific products or services based on past data. While this might sound like magic, the flow for creating machine learning experiences like these can be split roughly into two phases:
- Training a model
- Using inference to obtain a result from the model
Large amounts of high-quality data must be collected to perform the first step. If you're going to train a model that should recognize cats, you will need a large number of cat pictures. You must also collect images that do not contain cats. Each image must then be appropriately tagged to indicate whether or not it includes a cat.
If your dataset only contains images of cats that face towards the camera, the chances are that your model will not be able to recognize cats from a sideways point of view. If your dataset does contain cats from many different sides, but you only collected images for a single breed or with a solid white background, your model might still have a tough time recognizing all cats. Obtaining quality training data is not easy, yet it's essential.
During the training phase of a model, it is imperative that you provide a set of inputs that are of the highest quality possible. The smallest mistake could render your entire dataset worthless. The effort involved in collecting data is one reason that training a model is a tedious task. Another is that training typically takes a lot of time; certain complex models could take a couple of hours to crunch all the data and train themselves.
Trained models come in several types, and each type is suitable for a different kind of task. For instance, if you are working on a model that can classify certain email messages as spam, your model might be a so-called support vector machine. If you're training a model that recognizes cats in pictures, you are likely training a neural network.
Each model comes with its own pros and cons, and each model is created and used differently. Understanding all these different models, their implications, and how to train them is extremely hard, and you could likely write a book on each kind of model.
In part, this is why Core ML is so great. Core ML enables you to make use of pre-trained models in your own apps. On top of this, Core ML standardizes the interface that you use in your own code. This means that you can use complex models without needing to understand how they work internally. Let's learn more about Core ML, shall we?
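To give a feel for that standardized interface, here is a minimal sketch of running inference with a Core ML image classifier through the Vision framework. The `CatClassifier` class is a hypothetical example: Xcode generates a class like it automatically when you add a `.mlmodel` file to your project, whatever kind of model (neural network, support vector machine, and so on) that file contains.

```swift
import CoreML
import Vision

// `CatClassifier` is a hypothetical generated model class; Xcode creates
// one like it for every .mlmodel file you add to a project.
func detectCat(in image: CGImage) {
    guard let coreMLModel = try? CatClassifier(configuration: MLModelConfiguration()).model,
          let visionModel = try? VNCoreMLModel(for: coreMLModel) else {
        return
    }

    // Vision takes care of scaling and converting the input image
    // to whatever format the underlying model expects.
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Image classifiers produce VNClassificationObservation results,
        // sorted by confidence.
        guard let best = request.results?.first as? VNClassificationObservation else {
            return
        }
        print("\(best.identifier): \(best.confidence)")
    }

    let handler = VNImageRequestHandler(cgImage: image)
    try? handler.perform([request])
}
```

Notice that nothing in this code depends on the model's type: swapping in a completely different `.mlmodel` file would leave the calling code unchanged, which is exactly the standardization Core ML provides.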