2.1. Values and Data Types

Computers, and the programs that direct them, operate on data. Data is information. When the example program from the previous chapter reads in text from a file, that text is data. Each word it extracts from that text is a piece of data. The number it calculates, counting the number of words starting with a given letter, is data. That letter itself is data.

Storing and manipulating data is at the heart of everything a computer can do. In order to write programs, then, we will have to learn how to tell the computer to store and manipulate data. We’ll start with individual pieces of data.

In programming, we use the word value, rather than data or datum, to talk about a specific piece of information – like a word or a number – that a program works with. A few values we have seen so far are 0, 1, and "Hello, World!".

There are many kinds of data. You might already have noticed that 1 is a very different kind of thing than "Hello, World!". Therefore, values are classified into different data types.

'Cat' is text information. The text data type is called string, because text is a string, or sequence, of characters. In Python, strings are always enclosed in quotation marks, like this: 'Cat', or like this: "Cat".

93 is numeric information. There are a few different numeric data types. This particular value is an example of the integer data type.

A value’s data type controls what the computer can do with that value. Because 'Cat' is a string and 93 is an integer, certain kinds of actions (what we will call ‘operations’) make sense with one but not with the other. For example, if we try to divide two strings we get an error.
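A minimal sketch of that error follows. The try/except wrapper is an addition that this section has not introduced; it is used here only so the snippet runs to completion and prints the error message instead of stopping.

```python
# Dividing one string by another is not an operation Python defines.
try:
    print("Cat" / "Dog")
except TypeError as error:
    # Python reports that the / operator does not support two str values.
    print("TypeError:", error)
```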

On the other hand, dividing two integers is an action the computer can perform:
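A short illustration, assuming Python 3, where the / operator always produces a number with a decimal point:

```python
# Dividing one integer by another works without complaint.
print(93 / 3)   # prints 31.0
print(7 / 2)    # prints 3.5
```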

If you are not sure what the type of a particular value is, the type() function can tell you:
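A brief sketch of type() applied to a few values (the values themselves are chosen only for illustration):

```python
# type() reports the data type of whatever value it is given.
print(type("Cat"))   # prints <class 'str'>
print(type(93))      # prints <class 'int'>
print(type(3.14))    # prints <class 'float'>
```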

Not surprisingly, strings belong to the type str and integers belong to the type int. Less obviously, numbers with a decimal point belong to a type called float, because these numbers are stored in the computer in a format called floating point.

What about values like "17" and "123.45"? They look like numbers, but they are in quotation marks like strings.

They’re strings! It’s important to understand and remember that "17" and 17 are very different things to Python.
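One way to check this is to ask type() directly. The last line uses the == comparison, which this section has not introduced; it is included only to show that Python does not treat the two values as equal.

```python
# The quotation marks decide the type, not the characters inside them.
print(type("17"))       # prints <class 'str'>
print(type(17))         # prints <class 'int'>
print(type("123.45"))   # prints <class 'str'>
print("17" == 17)       # prints False
```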

2.1.1. Strings

We have used two different kinds of quotation marks to create strings: single ' and double ". Python will take whatever follows a quotation mark as the contents of a string up until it finds a matching quotation mark. Strings enclosed in one kind of quotation mark can contain the other kind. For example, a single quotation mark ' can appear inside a string enclosed in double quotation marks ", and a double quotation mark " can appear inside a string enclosed in single quotation marks '.
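A small sketch of each case, using example sentences chosen only for illustration:

```python
# A single quotation mark inside a double-quoted string.
print("It's a cat.")               # prints It's a cat.

# Double quotation marks inside a single-quoted string.
print('She said "Hello" to me.')   # prints She said "Hello" to me.
```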

What do you think will happen if a string contains a quotation mark of the same kind that encloses it?
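A file containing such a line cannot run at all, so the sketch below takes an indirect route. It stores two problem lines as text and asks Python's built-in compile() function whether each one is valid code. The triple-quoted strings, the list, the loop, and try/except are all assumptions beyond what this section has covered; they are here only so the demonstration itself runs.

```python
# Each problem line is stored as text so that this file itself stays valid.
snippets = [
    """print('Euler's Identity')""",
    '''print("She said "Hi!" to me.")''',
]

results = []
for source in snippets:
    try:
        # compile() checks the syntax of the text without running it.
        compile(source, "<example>", "exec")
        results.append("ok")
    except SyntaxError:
        results.append("SyntaxError")

print(results)   # prints ['SyntaxError', 'SyntaxError']
```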

These produce syntax errors because the quotation mark that we want to be inside the string actually ends the string, and then the rest of the line is invalid Python syntax. See if you can get the code above to work by changing the type of quotation marks used.

There is another way to fix this issue. To include a quote character that is the same as the one used to start and end the string, the character can be escaped by putting a backslash \ in front of it, as in "The string \"four\" is four characters long.".
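As a brief sketch, here is the escaped string from the paragraph above, plus a single-quoted counterpart chosen for illustration:

```python
# The backslash tells Python the next quotation mark is part of the text.
print("The string \"four\" is four characters long.")
# prints The string "four" is four characters long.

print('Euler\'s Identity')   # prints Euler's Identity
```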

Escaping with a backslash is not limited to quotation marks; it is used in many situations where we want Python to read a character as part of the text.
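One such case is the backslash itself. In this sketch (the folder name is made up for illustration), each pair of backslashes in the code stands for a single backslash in the resulting text:

```python
# Two backslashes in the code produce one backslash in the string.
print("C:\\Users\\cat")   # prints C:\Users\cat
```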

And by the way: since strings are sequences of characters, and emoji are just sequences of characters…
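A quick sketch, assuming your editor and terminal can display emoji:

```python
# Emoji can sit inside a string like any other character.
print("I 👁️ ❤️ 🐍")
print(type("🐍"))   # prints <class 'str'>
```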

2.1.2. Numbers

When you type a large integer, you might be tempted to use commas between groups of three digits, as in 1,000,000. This is not a valid integer in Python, but it is valid syntax:
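Printing the comma-separated form shows what Python makes of it:

```python
# Python reads this as three separate values: 1, 000 and 000.
print(1,000,000)   # prints 1 0 0
```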

Well, that’s not what we expected at all! Python interprets 1,000,000 as three comma-separated integers, which it prints with spaces between.

Note

The print() function will print as many different values as you give it, as long as they are separated by commas. The values will be separated by spaces in the output.

For example:

>>> print("Hello, World!", 1, 2, 123.45)
Hello, World! 1 2 123.45

This is the first example we have seen of a semantic error: the code is syntactically valid and runs without producing an error message, but it doesn’t do what we thought or wanted it to do. In this case, Python’s rule about what commas mean doesn’t exactly match what we might assume about them based on using commas in everyday writing.

Caution

Programming languages are formal languages with strict, precise rules about what is valid code and what that code means. The computer will do exactly what you tell it to do… so be careful about what you tell it to do!

2.1.3. Type Conversion Functions

Often data is in one form and we need it in another. For example, if a data set is stored in a text format, every value will be stored as a string even if it is really numeric data. Python provides a few type conversion functions that will attempt to convert data from one type into another. Each of the three data types we’ve seen so far has a matching function that converts into that type:

  • int()

  • float()

  • str()

The int() function can convert a floating point number or a string into an int. When given a floating point number, it discards the fractional part rather than rounding, which is called truncation toward zero. For example, int(3.9) is 3 and int(-3.9) is -3.
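A short sketch of int() applied to floats and to a string, with values chosen only for illustration:

```python
# The fractional part is dropped, not rounded.
print(int(7.99))    # prints 7
print(int(-7.99))   # prints -7

# A string that looks like an integer is converted to that integer.
print(int("42"))    # prints 42
```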

Python won’t always succeed in converting from one data type to another.
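As a sketch of one conversion that fails, the snippet below gives int() a string containing a decimal point. The try/except wrapper is an addition beyond this section; it is used only so the snippet prints the error message instead of stopping.

```python
# "123.45" looks numeric, but it is not written the way an integer is.
try:
    print(int("123.45"))
except ValueError as error:
    print("ValueError:", error)
# prints ValueError: invalid literal for int() with base 10: '123.45'
```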

The error shows that a string given to int() has to be a syntactically valid integer. Anything else will cause an error.

The float() function converts an integer, float, or syntactically valid string into a float.
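A few illustrative calls, with values chosen only for this sketch:

```python
print(float(17))         # prints 17.0
print(float(2.5))        # prints 2.5
print(float("123.45"))   # prints 123.45
print(float("17"))       # prints 17.0
```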

And finally, str() can convert just about anything into a string. The applications of this are a bit less common, but it’s worth remembering it exists.
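A brief sketch of str() on an integer and a float. Note that print() shows the contents of a string without its quotation marks, so type() is used to confirm the result really is a string.

```python
print(str(17))         # prints 17
print(str(123.45))     # prints 123.45
print(type(str(17)))   # prints <class 'str'>
```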

Check your understanding

Q-1: For each value, write its type - int, float, or str - to the right.

1234:

12.34:

"1234":

'12.34':

"Hello, 1234!":

Q-2: Which of the following are valid strings in Python? (Mark all that are correct.)

  • 'Average'
      Nothing wrong with this one.
  • '"Cheese!", she exclaimed.'
      Strings can contain quotation marks that aren't the same as the marks delimiting (surrounding) the string.
  • 'Euler's Identity'
      Strings cannot contain quotation marks that are the same as the marks delimiting (surrounding) the string unless they are escaped (see above).
  • '👁️❤️🐍'
      Emoji (or more broadly, Unicode characters) are allowed.
  • "Hello, World!"
      A classic string.

Q-3: For each type conversion function call, write the value it will produce to the right.

int(1234):

int(8.8):

float("1234"):

float(42.42):