Much of the literature on artificial intelligence (AI), machine learning, and deep learning is full of horrifyingly complicated-looking mathematics and dense academic jargon. It’s a bit scary to dive into if you’re a relatively solid software engineering type who wants to get their feet wet in the world of AI and implement some AI-driven features in their apps.
Don’t worry. AI is easy now (well, easier), and we’re here to show you how to implement it in your Swift apps.
This book introduces the concepts and processes you need to implement AI in a variety of practical ways with Swift.
Before we get into all that, some groundwork is required: primarily, this book is about AI and Swift (hopefully you saw the large friendly letters on the front cover). Swift is an amazing programming language, and is all you need to build iOS, macOS, tvOS, watchOS, or even web apps.
There is a plot twist, though: the world of AI has been in a symbiotic relationship with another language for a very long time—Python. So, although this book focuses on Swift, it is not the only programming language that we use in it. We wanted to get that out of the way up front: we do use Python, sometimes.
It’s impossible to avoid Python when doing machine learning and artificial intelligence. The meat of the book is in Swift, though, and we explain why whenever we’re using Python. We’re going to remind you about this a few times.
Something that we need to establish right up front (as we said in the preface) is that we expect you to know how to program already.
This book isn’t for totally new programmers to get into programming with Swift, and learn to do AI things. We’ve designed this book for people who know how to program. You don’t need to be an expert programmer, but you need to kind of know what’s going on.
Even though the book is called Practical Artificial Intelligence with Swift, it’s impossible to create effective AI features without understanding some of the theoretical underpinnings of what you’re working on. This means that we do go into theory sometimes. It’s useful, we promise, and we’ve mostly restricted it to Part III of the book. You’ll want to work through Part II first, though, to get the nuts and bolts of practical implementation.
As the title states, we do indeed perform most of the work in this book using Swift. But Swift isn’t the only way to do machine learning. Python is very popular in the machine-learning community for training models. Lua has also seen a fair amount of use. Overall, the machine-learning tools you choose will guide the language you pick. CoreML and CreateML (more on those later) are the main ways we perform our machine learning, and they both assume Swift.
We are making the assumption here that you are mostly comfortable with programming in Swift and creating basic iOS applications. As AI and machine learning is such a big topic, if we were to also cover Swift and iOS, the book would be so big it would look ridiculous on your shelf. As we go through the book, we still explain what’s going on at each step, so don’t worry that we are going to be throwing you into the deep end.
If you are still feeling your way through Swift and iOS, feel you need a refresher, or are just in the mood for another Swift book, check out Learning Swift, in which we go through Swift and iOS starting from nothing to building an entire app.
We’re pretty biased, but we think it’s a really good book. But don’t take our word for it: read it and see.
Coming from a practical perspective means that we’re trying to be practical (and pragmatic) in all things, including choice of language. If Python can do the job, we show you how to use that, as well.
Practical in the context of this book means: we get down to getting the feature that is driven by the AI implemented, and we don’t care about how it works, why it works, or what makes it work. We just care about making it work. There are enough books on AI, machine learning, deep learning theory, and how things work; we don’t need to write another one for you.
As you probably know, Swift tends to use some very long method calls. On your super-duper high-resolution screen with its bucket loads of pixels and miraculous ability to scroll horizontally, you will rarely find yourself having to break up method calls and variable declarations across multiple lines.
Sadly, although paper (and its digital cousins) is a very impressive piece of technology, it has some pretty hard limits regarding horizontal space. Because of this, we often break up our example code over multiple lines.
To be clear, this isn’t us saying your code needs to be done like this; it is just how we are doing it so that it fits into the book you now hold in your hands. So, if a line of code looks a bit weird and you are wondering why we broke it up over loads of lines, now you know why.
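For example, here’s what that looks like in practice. This snippet is purely illustrative—the function and its parameters are ours, invented for this demonstration:

// A hypothetical function with a longish signature, purely for illustration.
func configureSpaceship(name: String, fuelReserves: Double, crewCapacity: Int) -> String {
    return "\(name): \(fuelReserves) gallons of fuel, crew of \(crewCapacity)"
}

// On a nice wide screen, we might happily write a call on one line…
let red = configureSpaceship(name: "Red Dwarf", fuelReserves: 100.0, crewCapacity: 4)

// …but in the book, the same call will often be broken up, purely so that it fits on the page:
let blue = configureSpaceship(
    name: "Blue Midget",
    fuelReserves: 50.0,
    crewCapacity: 2
)

Both forms are identical as far as Swift is concerned.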
We aren’t really going to go into code style in this book. It is a big topic worthy of its own book (such as Swift Style, Second Edition (Pragmatic)), and it’s mostly subjective and a very easy way to start arguments, so we aren’t going to bother. Everyone has an approach that works for them; stick with yours rather than trying to emulate someone else’s.
You do you!
If at any point you aren’t feeling comfortable with typing in code based on a book, we’ve put up a repository with all the code examples here on GitHub for you to use at any time. And, we have additional resources on our website.
Examples shown are written in Swift 5.x and screenshots depict Xcode 11-ish, running on macOS Catalina. They might not behave the same on versions before or much after that. Swift is stable now, though, so things should work even if the user interface on Xcode looks a bit different.
Everything should work for quite a long time, even a while after this book’s publication date.
We also extensively use Swift extensions in our code examples. Swift extensions allow you to add new functionality to existing classes and enumerations. It’s easier to show than to explain.
If you’re a grizzled Objective-C programmer, you might remember categories. Extensions are kind of like categories, except extensions don’t have names.
Let’s say we have a class called Spaceship:
class Spaceship {
    var fuelReserves: Double = 0

    func setFuel(gallons: Double) {
        fuelReserves = gallons
    }
}
Hopefully this code is self-explanatory, but just so that we’re on the same page, we’ll explain it anyway: it defines a class named Spaceship, and the class has one variable called fuelReserves (it’s of type Double, but this doesn’t really matter right now).
The Spaceship class also has a function called setFuel(), which takes a Double named gallons as a parameter (we’re not sure what a gallon is because we’re Australian, but our editors tell us this will make sense). The function sets the fuelReserves to gallons. That’s it.
Let’s say that somewhere else in our code we want to use the Spaceship class, but add a function that lets us print our fuel reserves in a friendly manner. We can use an extension for that:
extension Spaceship {
    func printFuel() {
        print("There are \(fuelReserves) gallons of fuel left.")
    }
}
You can now call the printFuel() function on an instance of the Spaceship class, as if it were always there:
let starbug = Spaceship()
starbug.setFuel(gallons: 100.00)
starbug.printFuel()
We also can use extensions to conform to protocols:
extension Spaceship: StarshipProtocol {
    // implementation of StarshipProtocol requirements goes here
}
In this code example, we make the Spaceship class conform to the StarshipProtocol. We’re going to use extensions a lot, so we wanted to make sure it makes sense.
If you need more information on extensions, check out the Swift documentation. Likewise, check the documentation if you want a refresher on protocols.
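Incidentally, StarshipProtocol isn’t defined anywhere in this chapter, so here’s a purely hypothetical sketch of what that protocol and the conforming extension might look like, just to make the example concrete (the property and method names are our own invention):

protocol StarshipProtocol {
    var crewCapacity: Int { get }
    func engage()
}

extension Spaceship: StarshipProtocol {
    var crewCapacity: Int {
        return 3
    }

    func engage() {
        print("Engaging, with \(fuelReserves) gallons of fuel in reserve.")
    }
}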
When we talk about AI and Swift, we often get people who come up to us and basically say, “Don’t you mean Python?” (sometimes they don’t actually say it, but you can see in their eyes that they’re wondering it).
We also get people who say, “artificial intelligence? Don’t you mean machine learning?” We’re mostly using the terms interchangeably because it doesn’t really matter these days, and it’s not an argument we want to entertain. Call it whatever you like; we’re happy with it. We touch on this discussion in a moment, too.
Here’s why we like Swift for AI.
Swift, the Apple-originated open source systems and data science programming language project, was designed to create a powerful, simple, easy-to-learn language that incorporated best practices from academia and industry into one multiparadigm, safe, interoperable language that is suitable for anything from low-level programming to high-level scripting. It succeeded, to varying degrees.
Swift is an in-demand, fast-growing, popular language, with estimates of more than a million programmers currently using it. The community is diverse and vibrant, and the tooling is mature. Swift is dominant in the development of Apple’s iOS, macOS, tvOS, and watchOS, but is gaining ground on the server and in the cloud.
Swift has many great attributes, which make it a great choice both for learning how to develop software and for real-world development of all kinds. These attributes include the following:
Swift iterated fast and loose early on, and it made many language-breaking changes through its early life. As a result, Swift is now consistent, elegant, and fairly stable syntax-wise. Swift is also part of the C-derived and inspired family of programming languages.
Swift has a shallow learning curve and, for the most part, is built around the design goal of progressive disclosure of complexity and high locality of reasoning. This means that it’s easy to teach and can be quick to learn.
Swift code is clear, easy to write, and has minimal boilerplate. It’s designed to be quick to write and, more important, easy to read and debug.
Swift is safe by default. The design of the language makes it difficult to create memory safety issues, and it’s easy to catch and correct logic bugs.
Swift is fast, and has sensible memory use. This is likely due to its origins as a mobile programming language.
Swift compiles to native machine code. There’s no annoying garbage collector or bloated runtime. A machine-learning model can be compiled to its own object file and a header.
In addition to all this great stuff, beginning around mid-2018 the language team announced and demonstrated a greater focus on language stability: a factor that was a deal-breaker for some earlier in the language’s lifetime. This means less refactoring and refamiliarization for everyone involved, which is a good thing.
Although many non-Swift developers regard it as a language that exists solely for the Apple ecosystem, and specifically for application development therein, this is no longer the case. Swift has evolved into a powerful and robust modern language with an extensive feature set that we (the authors) believe is broadly applicable for machine-learning implementation and education.
Apple has released two key tools in recent years to encourage such pursuits on the platforms:
CoreML: A framework and corresponding model format that enables highly portable and performant use of trained models. Now in its second release, its features focus on computer vision and natural language applications, but you can adapt it for others.
CreateML: A framework and an app designed to create and evaluate CoreML models. It makes training machine-learning models a simple and visual process. Released in 2018, and improved in 2019, it presented a new option for those who had previously been adapting models from other formats for use with CoreML.
We look at CoreML and CreateML a whole lot more in Chapter 2.
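To give you a taste now, here’s a minimal CreateML sketch (run on macOS, typically in a Playground) that trains a text classifier from a CSV file and writes out a CoreML model. The file paths and column names are entirely hypothetical; we go through the real workflow properly in Chapter 2.

import CreateML
import Foundation

// Hypothetical training data: a CSV file with "text" and "label" columns.
let trainingData = try MLDataTable(
    contentsOf: URL(fileURLWithPath: "/path/to/training.csv"))

// Train a text classifier from the table.
let classifier = try MLTextClassifier(
    trainingData: trainingData,
    textColumn: "text",
    labelColumn: "label")

// Write out a CoreML model file, ready to drop into an app.
try classifier.write(to: URL(fileURLWithPath: "/path/to/TextClassifier.mlmodel"))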
The push for smarter applications does not stop there. With recent offerings such as Siri shortcuts—a feature set that exposes the personal assistant’s customization abilities to the user—Apple has participated in the broad cultural movement toward personalized and intelligent devices down to a consumer level.
So users want smart things, and we now have a cool language, suitable platforms, and some great tools to make them with. Still, many will not be drawn from camps of other languages they already know on the basis of that alone. Well, what if they could use some of the tools they already know? You can do that, too.
For the Python-inclined, Apple offers Turi Create. Turi Create was the go-to before CreateML came onto the scene: a framework for fast and visual training of custom machine-learning models for use with CoreML, primarily for tabular or graph data. You could step into Python, train a model, visualize it to verify its suitability, and then jump right back into your regular development flow.
But we’re here to do stuff with Swift, remember?
In comes Swift for TensorFlow. Famously bemusing Python die-hards with a project page that proudly boasts “…machine learning tools are so important that they deserve a first-class language…,” it exposes TensorFlow and related Python libraries so that they can be accessed directly from Swift—something more fairly described as leveraging the best of both languages. Swift provides better usability and type safety, adding the full power of the Swift compiler without compromising the sheer flexibility of Python’s machine-learning ecosystem.
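For a rough sense of what that looks like, here’s a minimal Swift for TensorFlow sketch of a tiny two-layer model; the layer sizes are arbitrary, and we only glance at Swift for TensorFlow in Chapter 9.

import TensorFlow

// A tiny model: two dense layers, defined as a value type that conforms to Layer.
struct TinyModel: Layer {
    var hidden = Dense<Float>(inputSize: 4, outputSize: 10, activation: relu)
    var output = Dense<Float>(inputSize: 10, outputSize: 3)

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        return input.sequenced(through: hidden, output)
    }
}

let model = TinyModel()

// Run a single (random) example through the untrained model.
let someInput = Tensor<Float>(randomNormal: [1, 4])
let prediction = model(someInput)
print(prediction)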
A recording of Chris Lattner (the original author of the Swift language, among many other things) announcing Swift for TensorFlow at the TensorFlow Dev Summit in 2018 is available on YouTube and covers the reasons behind the project in greater detail.
If you are after even more reasoning, the Swift for TensorFlow GitHub page has a great, very in-depth article explaining why Swift was chosen.
Nowadays, the decision of what tools to use for AI with Swift is pretty simple: if the application is images or natural language, or if the purpose is for learning, use CreateML; if the application is raw data (large tables of values, etc.), the input data is significant, or further customization is required, use TensorFlow.
We cover more about the tools in Chapter 2, and we look very briefly at Swift for TensorFlow in Chapter 9.
Before we get into the nitty-gritty of implementing AI with Swift, we must diverge from the practical for a moment for those who do not come from an AI background.
AI is a field of research and methods attempting to grant technology the appearance of intelligence. What is or is not AI is heavily debated because there is a fuzzy line between a system giving an answer it was told to and giving an answer it was told how to figure out.
AI fundamentals will often include architectures such as expert systems: a type of application designed to supplement or replace a domain expert for highly specific information retrieval or decision-making. These can be constructed from specialized languages or frameworks, but at their core, they boil down to something that could be represented by many nested if statements. In fact, we would wager there are some out there that are. For example:
// ask() is assumed to be a helper defined elsewhere that prompts the user
// with a question and returns their answer as a String.
func queryExpertSystem(_ query: String) -> String {
    var response: String

    if query == "Does this patient have a cold or the flu?" {
        response = ask("Do they have a fever? (Y/N)")

        if response == "Y" {
            return "Most likely the flu."
        }

        response = ask("Did symptoms come on rapidly? (Y/N)")

        if response == "N" {
            return "Most likely a cold."
        }

        return "Results are inconclusive."
    }

    // ... (handling for other queries would go here)
    return "Query not recognized."
}
Expert systems have some useful applications, but are time-consuming to construct and require objective answers to clear partitioning questions throughout. Although such a system at scale might appear to possess great domain expertise, it is clearly not discovering any knowledge of its own. Codified human knowledge, even though possibly exceeding an individual’s capacity for recall and response time, does not in itself make intelligence. It’s just saying what it’s told to.
So what about a system that is told how to figure out or guess an answer, or how to discover knowledge on its own? That is what most people mean when referring to AI nowadays. Popular approaches such as neural networks are, at their core, just algorithms that can take in a large amount of data—comprising clear attributes and outcomes—and identify links at a level of scrutiny and complexity a human could not. Well, maybe if they had a very long time and a lot of paper.
The point is that everything involved in AI is not magic: it does not comprise individual steps that a person could not do; it is instead doing simple things much faster than a human is able to do.
“Hang on, why is it useful to identify links in past data, and how does that make a system intelligent?” you might ask. Well, if we know very well which conditions or attributes lead to which outcomes, then when those conditions or attributes arise again, we can, with some confidence, predict which outcome will occur. Basically, it gives us the ability to make a much more informed guess.
With this in mind, it becomes clearer what AI isn’t and what it cannot do:
AI is not magic.
AI does not produce output that should be trusted on sensitive issues.
AI cannot be used in applications where total accuracy is important.
AI cannot identify new knowledge where there isn’t an existing abundance.
This means that AI can look at a large amount of data and use statistical data analysis to show correlations. “Most people who bought book X also bought book Y.”
But it can’t turn that information into action without external input or design. “Show people who bought book X (and who have not previously bought book Y) a recommendation for book Y.”
And, it can’t extrapolate information to cover new variables without being given more information. “Who is most likely to purchase new book Z that has not been purchased by anyone yet?”
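To make that concrete, here’s a toy sketch of the kind of correlation counting we’re describing. The data is entirely made up, and a real recommender would be fed far more than four purchase histories:

// Hypothetical purchase histories; each set is one customer's purchases.
let purchases: [Set<String>] = [
    ["Book X", "Book Y"],
    ["Book X", "Book Y"],
    ["Book X"],
    ["Book Y"]
]

// Of the customers who bought Book X, what fraction also bought Book Y?
let boughtX = purchases.filter { $0.contains("Book X") }
let alsoBoughtY = boughtX.filter { $0.contains("Book Y") }
let percentage = 100.0 * Double(alsoBoughtY.count) / Double(boughtX.count)
print("\(Int(percentage))% of people who bought Book X also bought Book Y.")

// Note: this only *describes* past data. Deciding to recommend Book Y to
// Book X buyers, or guessing about a brand-new Book Z nobody has bought yet,
// still requires external input and design.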
You might have noticed that, in addition to the terms “machine learning” and “artificial intelligence,” sometimes people refer to deep learning. Deep learning is a subset of machine learning. It’s a bit of a buzzword, but it also—kind of—refers to the kind of machine learning and AI that relies on repetition, over many layers, to perform tasks.
Deep learning is about using more and more complex layers of neural nets to further extract the actual relevant information from a dataset. The goal is to convert the input into an abstract representation that can then be used later on for various purposes such as classification or recommendations. Essentially, deep learning is deep because it does a lot of repeated learning through layered neural nets.
Depending on your sources, you might be forgiven for assuming that AI and machine learning were only about neural networks. That’s not true, and has never been true, and will never be true. Neural networks are just the big buzzword and one of the central themes for the hype that exists around these topics.
Much later, after all the practical tasks in Part II, we look at neural networks in a more theoretical capacity in Chapter 10, but this book is here to look at the practical. The truth is, it doesn’t matter if you don’t care about neural networks these days; the tools are good enough that you can build features without knowing, or caring, how they work.
AI can be used for evil. No surprise there; arguably everything can. But we humans figured out how to make intelligent technology quite a while ago and it got rather popular before much effort at all was put into making intelligent technology that could explain itself. In this area, AI research is in its infancy.
Now, if we had any other type of system and it occasionally got something wrong we would debug it and figure out where and why it’s going wrong.
But we can’t do that here.
If we had any other type of system for which output was derived from input and we couldn’t change the system itself, we might examine and attempt manual analysis of the input that was causing errors.
But we can’t do that here.
A fundamental issue with a smart system that has no way to explain itself is often solved only by starting nearly from scratch.
This lack of agency to change a system in deployment leads to a perceived lack of responsibility—people creating systems they claim do not necessarily represent their views and whose mistakes they refuse to be held liable for.
A recent example: a photo-editing app released for mobiles boasted modes that would tweak your photos to a handful of presets. Feed in one selfie and it would return versions of the picture that were gender-swapped, older, younger, and so on. One of the output categories claimed to tweak the photo to make the individual look “hotter,” but users with dark skin quickly noticed that this feature unanimously lightened their skin color.
Justifiably, users were hurt by the implication that having lighter skin would make them more attractive. The developers reeled from claims of “racist design,” but admitted that the input datasets for their application had been built in-house. In labeling countless photographs of people for their AI to train on, they had imparted their own inherent biases: they personally found people of European descent attractive far more often.
But an algorithm does not know what racism is. It knows nothing of human bias or of beauty or of self-esteem. This system was given a large number of photographs, told which ones were attractive, and asked to replicate whatever common attribute it identified. One was clear.
And when they tested it in-house, with what happened to be a Caucasian populace, this never occurred. So the product shipped.
Countless examples with more far-reaching consequences have suffered similar issues: a car insurance system that decided all women were dangerous, a job search that determined being called Jared is all that matters, a parole system that decided all dark-skinned people would reoffend, and a health insurance system that devalues people who buy food from fresh markets—because it assesses a person’s diet and health status from credit card transactions at grocery stores, defaulting to a low score when it finds none.
So, it is important to understand that although AI is a super-cool and interesting new area of technology that we can use for greater advancement and good, it really can only replicate existing conditions in our society. It fails to be ethical.
That is not to say that the designer/developer does not have this power. Input data can be molded to represent the world we want rather than the world we have, but targeting either will result in inaccuracy for the other. Right now, it might be that the best we can hope for is awareness and acceptance of responsibility.
Do that, at the very least.
Even ethical use of AI is not always effective or appropriate, however. Many get too caught up in making a smart system full of future tech that is all the rage right now; they stop focusing on the problems they are trying to solve or the user experience they are trying to create.
Suppose that you could train a neural net that would tell someone how long they would likely live. It used all historical and current medical research available, hundreds of years of treating and observing billions of humans, and this made it reasonably accurate—barring death by accident or external force. It even accounted for the growing average lifespan—the whole nine yards.
A person opens this application/web page/whatever, enters all their information, submits it, and gets an answer. Just a number, which they could be confident would be true. No explanation. It has none to give that would mean anything to a person anyway.
This would likely cause havoc even with people who received favorable responses.
Because a person does not want a magic answer. Sure, they’d love the answer to a problem that a person could not solve, but a response alone is not a solution. Most humans would like to know about their projected lifespan so that they could extend it or know how to improve their quality of life later on. The real question here is what is likely to go wrong that they could do anything about, and that answer is not present in the system’s response.
An expert system could do a better job of it: it might know less, but it can explain itself, giving actionable answers with human-interpretable explanations. In most applications, this is what the design of a smart system should strive for beyond all else, at the expense of intelligence or even a bit of accuracy. The neural network solution has failed to be effective.
A final example: an industry organization struggles to retain members through first-time renewal. The governing board agonizes over how to keep people on board, but notices communications have low subscription, readership, or response rates. It identifies that the majority of new members cannot be contacted even halfway through their first membership term, and canvasses widely for solutions. The board wants to know how to get more information about its members so that it can decide what is wrong or contact them some other way.
Solutions proposed include data scraping, engaging data agencies, or hiring management consultants—the organization begins pulling out all the stops. It’s data. The organization just needs more customer data, and then the computer will tell it what to do. The organization will make a system that will identify members with a high risk of nonrenewal and what methods of retention are most effective. The system will be great, the board tells itself.
But the problem here is not that the organization has too little data about its members. The problem is that it cannot contact its members, cannot sufficiently demonstrate value to members so that they stay around, and doesn’t know its members or industry beyond what little the data reveals about them. Maybe the board doesn’t even care to know. It has strayed from the question that it was originally asking, and has failed to devise a solution that is appropriate for its purpose.
With these tenets in mind—ethical, effective, and appropriate—go forth now and learn to create AI-powered features for yourself. (That’s what the rest of the book is for, so that’s where we’re hoping that you’ll go forth.)
We wanted to write this book because we were sick of books claiming to teach useful, practical AI skills that began by exploring the implementation of a fascinating—but ultimately entirely useless without context—set of neural networks. Most of the books we read about AI were actually really good, but they began with a bottom-up approach, going from algorithms and neural networks, through to implementations and practical uses at the very end.
The typical approach that we are going to take throughout this book will be a top-down approach in which we break up each domain of AI we’re looking at into a task to solve.
This is because you rarely start with “I want to make a style classifier system” (although if you do, that’s also cool); you generally start with “I want to make an app that tells me if this painting is in the style of Romanticism or Pre-Raphaelite” (a problem we’ve all faced at some point in our lives).
Every section in the chapters of Part II starts with the problem we want to solve and ends with a practical system that solves it. The general process is the same each and every time, with only minor changes:
A description of the problem we are tackling
Deciding on a general approach to solve it
Collecting or creating a dataset to use to solve it
Working out our tooling
Creating the machine learning model to solve the problem
Training the model
Creating an app or a Playground that uses the model
Connecting the model into the app or Playground
Taking it out for a spin
The reason we are taking this approach is simple—this book is called “Practical” Artificial Intelligence with Swift, not “Interesting but Divorced from the Reality of Using Such a System in the Real World” Artificial Intelligence with Swift. By wrapping our approach into tasks, we are forcing the elements in this book to be those that best solve the problem at hand, while also requiring that they can be used in a practical manner.
Almost all of our chapters use CoreML as the framework to run the models we create, so anything we do will need to support that.
In some respects CoreML is an arbitrary line we’ve drawn in the sand, but on the other hand, it gives us clear constraints to stick within. We’re all for clear constraints.
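As a small taste of what “supporting CoreML” means in code, here’s a hedged sketch of classifying an image with a trained model through the Vision framework. SomeImageClassifier is hypothetical—Xcode generates a class like it when you add a trained .mlmodel file to a project—and Chapter 2 walks through the real workflow.

import CoreGraphics
import CoreML
import Vision

// SomeImageClassifier is a hypothetical, Xcode-generated model class.
func classify(_ image: CGImage) throws {
    // Wrap the CoreML model so Vision can drive it.
    let model = try VNCoreMLModel(for: SomeImageClassifier().model)

    let request = VNCoreMLRequest(model: model) { request, _ in
        // For a classifier, the first observation is the model's best guess.
        if let best = (request.results as? [VNClassificationObservation])?.first {
            print("Looks like \(best.identifier), confidence \(best.confidence)")
        }
    }

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
}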
A great deal of work on AI is done by academics and giant corporations who, despite their intents, have very different goals and restrictions on training, inference, and data resources from those who must then take their work and try to make it usable for themselves and their (generally) smaller needs and resources.
Models and approaches that might be fascinating to build and study but require massively powerful machines to train and run, or those that are interesting to look at but don’t solve any specific problem, simply aren’t practical. You can run everything in this book on a regular desktop or laptop computer and then compile and run it on an iPhone. In many ways, we feel this is the essence of practical.
Because we are breaking up this book into tasks, we can’t just stop halfway; we need to finish each task by creating something usable. Too many AI approaches either handle only the creation of the AI models, or only cover connecting a model to a user-facing system. We are doing both.