Setting Up Your System

We’re about to embark on an ML tutorial that spans the entire first part of this book. We’ll start from scratch and end with a working computer vision program. Later on, that same program will be our starting point toward higher peaks: neural networks in Part II of this book, and deep learning in Part III.

You might want to approach this tutorial hands-on, running the source code for this book, and solving the exercises at the end of most chapters. Alternatively, you might prefer to read through and get the big picture before you grab the keyboard. Both approaches are feasible, although programmers often go for the first.

If you prefer the hands-on approach, it won’t take long to set up your system and run this book’s code. Even though ML tends to require a lot of computing power, the code in this book runs fine on a regular laptop. You just need to install some software.

First and foremost, you need Python. It’s the most popular language in the ML community, and the language that all of this book’s examples are written in. Don’t worry if you’ve never coded in Python before: you’ll be surprised by how readable this language is. If Python code perplexes you, then read Appendix 1, Just Enough Python. That will be enough knowledge to get you through this book.

If you’re steeped in Python, you might notice that the code in this book deviates from common conventions. For example, I actively avoid language-specific idioms (such as list comprehensions) that complicate the code for newcomers to the language. With the same intent of making the code more accessible, I might use slightly imprecise language, such as the word “function” in place of “method.”

I apologize in advance for those transgressions against the Pythonic canon.

Let’s get down to business and check that you have Python installed. Run:

python3 --version

If you don’t have Python 3, then stop reading for a minute and go get it. The introduction to Appendix 1, Just Enough Python, gives you a few pointers to install the language.

One thing to note: on some systems, you can also execute Python 3 by typing python, without the 3 at the end. On other systems, however, the python command executes Python 2. To avoid confusing errors related to older Pythons, I’ll always use the more explicit python3 command in this book.
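If you’re not sure which Python a given command launches, you can ask the interpreter itself. Here is a minimal sketch of such a check. It isn’t part of the book’s source code, and the is_python3 name is made up for illustration:

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


if __name__ == "__main__":
    # Print the version of the interpreter that is running this script.
    print("Running Python", sys.version.split()[0])
    if not is_python3():
        sys.exit("This book's code needs Python 3; try the python3 command.")
```

Save it under any name you like (say, check_python.py) and run it with both python and python3 to see which interpreter each command starts.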

Now that you have the language, let’s talk libraries. You’ll need three of them to begin with. The big one is NumPy, a library for scientific computing. We’ll also use two libraries to plot charts. Matplotlib is the de facto standard for chart-plotting in Python. Seaborn sits on top of Matplotlib, and focuses on making the charts look pretty.

I hope that most information in this book will stay valid for years—but some details are bound to be obsolete by the time you read these pages. You might find that a library has been updated, leading to errors when you run certain examples. In some cases, I prevented this problem by recommending a specific library version—but even then, you might have trouble installing that version under your current version of Python.

If you experience errors when setting up your system or running the examples, check for a fresh version of the book’s source code at https://pragprog.com/titles/pplearn/source_code. I’ll try to keep the code up to date with the latest libraries, and I’ll update the setup instructions in readme.txt when needed.

Also, I’d be grateful if you reported any such problems at https://pragprog.com/titles/pplearn/errata.

There are two ways to install those libraries: you can use pip, Python’s official package manager, or you can use Conda, a more sophisticated environment manager that is popular in the ML community. If you’re curious, Installing Packages with Conda delves deeper into the differences between pip and Conda. If you’re in doubt, then just use pip.

To install the libraries with pip, run these commands:

	pip3 install numpy==1.15.2
	pip3 install matplotlib==3.1.2
	pip3 install seaborn==0.9.0

…and you’re done. If you’d rather use Conda, then look in the source code’s root folder for a readme.txt with the necessary instructions.
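If you want to confirm that the installation worked, you can ask Python which versions it finds. The following is a sketch rather than part of the book’s source code. It assumes Python 3.8 or later (for the standard importlib.metadata module), and the names EXPECTED, installed_version, and check_libraries are made up for illustration:

```python
from importlib import metadata  # in the standard library since Python 3.8

# The versions pinned by the pip commands above.
EXPECTED = {"numpy": "1.15.2", "matplotlib": "3.1.2", "seaborn": "0.9.0"}


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_libraries(expected=EXPECTED):
    """Map each package name to an (installed, expected) pair of versions."""
    report = {}
    for package in expected:
        report[package] = (installed_version(package), expected[package])
    return report


if __name__ == "__main__":
    for package, (found, wanted) in check_libraries().items():
        if found is None:
            status = "not installed"
        elif found == wanted:
            status = "matches the pinned version"
        else:
            status = "differs from the pinned version " + wanted
        print(package, found, status)
```

A version that differs from the pinned one isn’t necessarily a problem, but it’s the first thing to look at if an example fails.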

Finally, you need some kind of coding environment. Many ML tutorials use a system called Jupyter Notebooks to edit and run code in the browser. You don’t have to use Jupyter to run the examples in this book. Being a developer, you know how to write and run a program, so go ahead and use your favorite text editor or IDE. On the other hand, if you already know and like Jupyter, that’s fine: look into the notebooks directory for a Jupyter version of the book’s code.

Let’s double-check: you have Python, a few essential libraries, and your favorite editor. That’s all you need to get started.

And now, let’s build a program that learns.

As a developer, you’re accustomed to learning at a rapid pace. With ML, however, you’re entering a new field. I’m not going to lie: the next three or four chapters are going to be tough. As you read them, you might feel like an absolute beginner—an exciting, but sometimes frustrating, place to be.

I know that feeling, and I can tell you that it’s worth getting through. I remember my excitement when, for the first time, I ran a machine learning program that came up with accurate predictions. Hold on, and soon enough you’ll know that geeky joy.
