3.5 Applying Data Type

When writing source code, different programming languages require varying levels of detail when defining a data type. Some languages even require you to be very specific. When you declare a variable, you can say, for example, it is a number, it is going to be an integer (a whole number), it can never be negative, it has a maximum value of 65,535 and so on.

Other languages are more flexible. They do not require, and some don't even allow, you to be that specific or that exact about each type of data. But there are downsides to flexibility because if you can't enforce rules on your data upfront, you may have to write more code later on to make sure that the data you have is actually correct. But even with differences like these, most languages treat data in a very similar fashion. So, let's cover the concepts that most languages use most of the time.

Numeric Variables

First, numbers. When dealing with numeric values, it's common to distinguish between an integer (a whole number with nothing after the decimal point) and a value that should allow a fractional part.

Image

Fig 3.5.1: Examples of Integers and floating-point numbers (numeric variables)

Suppose you know the piece of data you're describing is a whole number: for example, an age, the number of pages in a book, the number of full-time employees, the speed limit on a stretch of road, a position on the New York Times bestseller list, the floor in an office building, and so on. These are all genuine integers. They don't have anything after the decimal point. No book ever reached position 2.7 on a bestseller list. There is no road with a posted speed limit of 0.00015.

But if you need a number that does support a fractional part, whether that's a temperature of 72.4 or a snail's speed of 0.029 miles per hour, then an integer won't work and we need a different type of data. The most common keyword in programming for creating an integer data type is either integer or just int for short. The most common keyword for the other kind of number is float, meaning a floating-point number.

Now, I could do an entire course on the technical differences of how integers and floating-point numbers are stored in memory and what that might mean when you do complex calculations. I'm not going to do that here. It would make your eyes (and mine) glaze over.

Yes, this subject of integers and floating-point values does get deeper than just the question “do we or do we not have something after the decimal point?” But for now, that’s a good enough question because you can get pretty far with just that level of understanding. Beyond whether you're dealing with integers or floating-point numbers, some languages let you be really specific about whether you need each number to support both positive and negative values.

Image

Fig 3.5.2: Some codes for controlling signs and sizes of integers

We use different keywords depending on whether we do, we don't, or we just don't care about having negative values. You may also find choices for different sizes of number. So, if you're dealing with either very large or very small values, you can define them that way. Study the guide shown in Fig 3.5.2. Some sample declarations are given below.

int score : The variable named “score” can be either a negative or a positive integer.

unsigned int age : The variable named “age” cannot be a negative integer.

long long int score : The variable named “score” can be a very large integer.

Enough about numbers. Here's a far simpler data type.

Boolean Values

Most languages have a Boolean data type, sometimes just called a Bool, named after the British mathematician George Boole. This is a value that can be either true or false. That's it. No other options.

Image

Fig 3.5.3: Examples of Boolean values in various languages

It's very common to want variables like this in our programs. For example, variables that can hold “Is the user logged in right now, true or false?” “Are we recording right now, true or false?” “Is this person on active-duty military service, true or false?” “Has our spaceship collided with an asteroid, true or false?”

In most languages you will find a Boolean data type and also find that true and false are keywords. Okay. Some context here. This kind of value, something that is just true or false, can sound so simplistic or so basic that it's kind of easy to think, oh, okay, I guess I might need that kind of value occasionally. Let me show you that you will.

You will rarely write more than a few lines of source code without asking that really basic fundamental question of whether some situation is true or false.

A Boolean value is not a concept that we use once in a while or just occasionally. It is something we use in programming all the time.

Text/Character Data type

We also expect a data type for textual content, characters, words, sentences and paragraphs. Many languages have a data type for a single character, just one letter at a time, but what's typically more desirable is a larger amount of text. This is what in programming is called a string.

A string is a collection of characters all strung together in a sequence.

Image

Fig 3.5.4: Examples of text/character data type

Strings are very easy to create, but it's worth considering that a string is more complex for the computer than a number or a Boolean. It’s simply because we don't really know how big it is.

Image

Fig 3.5.5: Examples of string values for different languages

If you just declare a variable as a string, until you put some data in that variable, the computer doesn't know if it's going to contain a word or paragraph or an entire document.

Until you provide a value, the computer doesn't know how much memory to set aside so it's worth taking a look at these values.

Image

Fig 3.5.6: Different types of literals

When we assign a value to a variable, we just write that value by itself and we can change the variable again and again. So, in Fig 3.5.6, myInteger, myFloat, myBoolean and myString are my variables because they can change from moment to moment. But 99, 542.5, true and “This is a message!” are the values that I'm assigning to the variables. These are not variables themselves. They are the actual literal values.

So, the first in Fig 3.5.6 is an integer literal because the value 99 is not a variable. We don't ask what it contains. It is literally just the value 99. Next, we have a floating-point literal. In some languages you would follow the digits after the decimal point with an f to say this is a float. After that is a Boolean literal, just true or false. In almost all languages, a string literal is written with double quotes to mark the beginning and the end of it.

First, that's how we can see the difference between the words we want to use as variable names (such as myString) in our code and the words we want to use in our actual string data (such as “This is a message!”). Secondly, it's how we can have spaces within the string (such as the space between This and is). Because they're inside the double quotes, we're not confusing the language.

One thing to point out: notice that with the Boolean literal value of true in Fig 3.5.6, there are no quotes around the word true, and there would equally be no quotes around the word false if it were used there.

I can certainly use the words true and false within a string, but in Fig 3.5.6 they're not strings. They're actual keywords in the language so when they're used as Boolean values, they don't need quotes. Each language provides certain basic built-in data types. They're already in that language. They're ready for us to use.

Built-in Primitive Data Types

It is true that not all languages have exactly the same options. Some languages like C++ offer a variety of types within each category: not just integers and floating-point numbers, but large and small versions, and versions with and without negative numbers.

Image

Fig 3.5.7: Some primitive data types for C++ and JavaScript

At the other end of the simplicity scale, JavaScript has a single general-purpose data type for numbers (its newer BigInt type, meant for very large integers, aside). They're just numbers and they're all stored as floating-point values. So even if you want a whole number (an integer), it's actually still stored as a floating-point number with just zero after the decimal point.

The simpler built-in data types in a language, whether integers, Booleans or floats, are sometimes called primitive data types because they are basic building blocks of the language. They are generic ways to work with simple, straightforward, single pieces of data.

But we can take it deeper. What about data that has multiple pieces to it? For example, a date with a year, a month and a day. What about something like an address, which could have numeric pieces and text pieces? In most languages we can do that. We can take these single pieces of data and combine them into what's called a composite or a compound data type, but that is a topic for later.