1.6. Getting the Computer to Understand You: The Machine Code

I said earlier that a computer program is a list of instructions we write in a programming language the computer understands. The sentence is useful but not entirely true because of this bit:

programming language the computer understands.

The reason is that the computer does not understand our source code. We will use a language like C#, Swift, Ruby, or Python, but the chip at the heart of any computer (laptop, desktop, phone, server, or game console) does not understand our source code at all. That chip is the central processing unit, or CPU: the piece of hardware that takes care of running, or executing, our programs.

Fig 1.6.1: A quad-core CPU

The only thing the CPU understands is what's called machine code or machine language, which looks like this:

Fig 1.6.2: Machine code

These are basic instructions, like adding one number to another, but now written in a form the computer understands.
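If you're curious what such instructions look like as raw numbers, here is a small sketch in Python. It assumes one particular family of chips (x86-64, used in many laptops and desktops) and prints three instructions as bits and as hexadecimal. The names in the middle column (mov, add, ret) are labels for human readers only, and the helper name as_bits is something I made up for this example. The chip itself only ever sees the bytes.

```python
# Three x86-64 instructions written out as raw bytes.
# The mnemonic and the description are only there for human readers.
program = [
    (bytes([0xB8, 0x05, 0x00, 0x00, 0x00]), "mov eax, 5", "put the number 5 in a register"),
    (bytes([0x83, 0xC0, 0x01]), "add eax, 1", "add 1 to that number"),
    (bytes([0xC3]), "ret", "return to whoever called this code"),
]


def as_bits(code: bytes) -> str:
    """Format each byte as eight binary digits, separated by spaces."""
    return " ".join(f"{b:08b}" for b in code)


for code, mnemonic, meaning in program:
    # Bits first, then the same bytes in hexadecimal, then the human label.
    print(f"{as_bits(code)}   {code.hex(' ')}   {mnemonic}: {meaning}")
```

Nothing here actually runs those instructions on your CPU. The script only displays the bytes, which is enough to see why nobody wants to write programs this way.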

Now if looking at this makes your eyes glaze over, that's totally okay. That is an appropriate response. It looks like gibberish. It's intended for machines, not humans, and you do not need to be able to read this.

The vast majority of professional programmers never ever deal with machine code. It is possible to write a program in machine code, and there are a handful of specialized positions where it's a skill, but it's not a general programming skill. Writing machine code is incredibly time-consuming. It's a long, error-prone, painstaking, and tedious process.

Not only that, machine code differs from one family of chips to another, and sometimes between versions within a family, so machine code written for one kind of hardware generally won't run on another.

The entire point of having programming languages is so we don't have to write machine code.

But if we only write source code and the chip only understands machine code, then that would suggest our source code must be converted into machine code before the program can run.

Fig 1.6.3: How source code is converted to machine code

Yes, that must happen. There are two main ways to do it. We will either compile or interpret the source code, and the difference between compiling and interpreting is not about what happens, but about when that conversion to machine code happens.

Here's what I mean. Imagine I've written some simple source code on my computer. I want you to run my new program over on your computer. I've written source code, but your computer needs machine code.

Option number 1: Using a compiler.

I will take care of this by using a compiler. A compiler is a program that goes through my source code file instruction by instruction and converts it into machine code. The compiler spits out a new file containing machine code, which is something your CPU can actually understand and execute.

Fig 1.6.4: Compilation process

The result of the compilation is called an executable file. After I've compiled it, I just give you that new file and you can run my program. With compiling, you never get my source code, because it stays on my machine. You may not even know what programming language I used.
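To make the compile-first idea concrete, here is a toy sketch in Python. Everything in it is invented for illustration: the three-word language (set, add, show), the numeric codes in OPCODES, and the toy_cpu function that stands in for the chip. A real compiler produces real machine code for a real CPU, but the shape of the process is the same: translate the whole program once, hand over only the result, and run that.

```python
# Invented numeric codes for a made-up language; these are NOT real CPU opcodes.
OPCODES = {"set": 1, "add": 2, "show": 3}


def compile_source(source: str) -> bytes:
    """Translate the whole program up front into toy 'machine code' bytes."""
    output = bytearray()
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        # Operands must fit in one byte (0-255) in this toy format.
        operand = int(parts[1]) if len(parts) > 1 else 0
        output += bytes([OPCODES[name], operand])
    return bytes(output)


def toy_cpu(code: bytes) -> list:
    """Stand-in for the chip: it only understands the numeric codes."""
    value, shown = 0, []
    for i in range(0, len(code), 2):
        opcode, operand = code[i], code[i + 1]
        if opcode == 1:      # set
            value = operand
        elif opcode == 2:    # add
            value += operand
        elif opcode == 3:    # show
            shown.append(value)
    return shown


source = "set 5\nadd 1\nshow\n"
executable = compile_source(source)  # done once, on the author's machine
print(executable.hex(" "))           # 01 05 02 01 03 00
print(toy_cpu(executable))           # [6]
```

Notice that toy_cpu never sees the words set, add, or show. Whoever receives the bytes can run them without the source, and cannot easily tell what language they were written in.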

Option number 2: Using an interpreter.

Now for this, I still have my source code, but instead of me compiling it, I just give you a copy of it. You get my source code and that means you need to do the conversion to machine code yourself, and you will use an interpreter to do it.

But that doesn't mean you have to go out and download an interpreter program. You'll often find interpreters bundled inside web browsers or even operating systems themselves, and they will be used automatically.

Fig 1.6.5: The interpretation process

Probably the most common example is the interpreter for JavaScript. One is bundled inside every major web browser. So, if I just send you some JavaScript, you can run it because your browser knows how to interpret it. One big difference is that with an interpreter, this conversion to machine code doesn't happen in advance. It doesn't happen until you decide to run the program on your computer.

At that time, the interpreter goes through the source code line by line, converts each instruction to machine code as needed, and then runs that machine code. It doesn't save the result as a separate file.
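Here is a toy sketch of that line-by-line behavior in Python. The three-word language (set, add, show) and the interpret function are made up for this example. To stay short, the sketch skips the machine-code step and simply carries out each line as it reads it, which is a simplification of what a real interpreter does. The timing is the point: nothing is translated until the program is run, and nothing is saved afterwards.

```python
def interpret(source: str) -> list:
    """Read the source one line at a time and act on each line immediately."""
    value, shown = 0, []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0]
        operand = int(parts[1]) if len(parts) > 1 else 0
        # Work out what this one line means and do it right now.
        if name == "set":
            value = operand
        elif name == "add":
            value += operand
        elif name == "show":
            shown.append(value)
        else:
            raise ValueError(f"unknown instruction: {name}")
    return shown


# The person running the program needs the source text itself.
source = "set 5\nadd 1\nshow\n"
print(interpret(source))  # [6]
```

Every run starts again from the source text, so the person running the program needs both the source and an interpreter that understands it.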