Python variables, types, collections¶

Variables¶

In Python, a variable can be thought of as a kind of "label" referring to a given value. In the instruction below, the interpreter first creates the string "Hello World !", then the variable "text", and makes "text" refer to that string. Subsequently, each time an instruction asks for "text", it receives the value "Hello World !".

In [1]:
text = "Hello World !"
print(text)
Hello World !

Assignments¶

When you assign one variable to another, the value is not duplicated: the two variables simply refer to the same shared value.

In [2]:
text1 = "Hello World !"
text2 = text1
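
One way to check that two names refer to the same object is the identity operator `is`. The sketch below reuses the names of the cell above; `same_object` is a name introduced here only for illustration.

```python
text1 = "Hello World !"
text2 = text1

# `is` tests identity: do both names refer to the very same object?
same_object = text1 is text2
print(same_object)   # True
```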

Garbage collection¶

Since any value can be referred to by many variables, you may wonder when the value should be destroyed and its memory released. Actually, the interpreter does this automatically for you: it keeps track of the values which are no longer referred to by anything, and destroys them.
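
The following sketch tries to make this visible. It assumes the standard CPython interpreter, which destroys an object as soon as its last reference disappears; other interpreters may wait longer. The class `Payload` and the variable names are hypothetical, chosen for this illustration, and `weakref.ref` is used because it can observe an object without keeping it alive.

```python
import weakref

class Payload:
    """Hypothetical placeholder class, used only to have an observable object."""

value = Payload()
alias = value                   # a second label on the same object
watcher = weakref.ref(value)    # observes the object without keeping it alive

del value                       # one label removed, "alias" still refers to the object
alive_after_first_del = watcher() is not None

del alias                       # last label removed: CPython destroys the object now
alive_after_second_del = watcher() is not None

print(alive_after_first_del, alive_after_second_del)
```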

Multiple assignments¶

Python lets you make several assignments in one instruction, with a collection of names on the left of the assignment operator =, and a collection of values on the right:

In [3]:
x, y, z = 10, "twenty", 30
print(x, y, z)
10 twenty 30
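
A common use of this feature is swapping two variables without a temporary one. The sketch below also shows what happens when the number of names and values differ; `unpack_error` is a name introduced here for illustration.

```python
x, y = 10, "twenty"
x, y = y, x            # the right side is evaluated first, then assigned
print(x, y)            # twenty 10

# The number of names must match the number of values.
try:
    a, b = 1, 2, 3
except ValueError as err:
    unpack_error = err
    print("ValueError:", err)
```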

Variable names¶

The name of a variable must start with a letter or an underscore (_), followed by letters, underscores or digits. It is case-sensitive.

It is forbidden to use a Python keyword as a variable name. It is allowed, yet highly dangerous, to reuse the name of an existing type or function.
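
As an illustration, the standard `keyword` module can tell whether a name is reserved, and the small function below (a hypothetical helper named `shadowing_demo`) shows why reusing the name of a built-in type is dangerous.

```python
import keyword

print(keyword.iskeyword("class"))   # True: cannot be used as a variable name
print(keyword.iskeyword("text"))    # False: a valid variable name

def shadowing_demo():
    # Legal, but inside this function "list" no longer means the built-in type.
    list = [1, 2, 3]
    try:
        list("abc")                 # we are now calling a list object...
    except TypeError as err:
        return str(err)
    return None

message = shadowing_demo()
print(message)                      # 'list' object is not callable
```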

By convention, the special variables automatically defined by the interpreter have a name starting and ending with a double underscore __.

By convention, the special variables _ and __ are often used as "throw away" variables. That is, if a function returns several values and one only cares about some of them, one can use _ for the values to be ignored: res1, _ = function_returning_two_values().
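
A concrete case, using the built-in function `divmod`, which returns a quotient and a remainder:

```python
# We only care about the quotient, so the remainder goes to "_".
quotient, _ = divmod(17, 5)
print(quotient)   # 3
```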

Types¶

The values have types, not the variables (which are just agnostic labels). That means a given variable, during its lifetime, can refer to different values of different types.

In [4]:
my_var = "Hello World !"
print(my_var)
my_var = 3.14
print(my_var)
Hello World !
3.14

Except in very specific situations, this is not recommended! Otherwise you will quickly be confused about which variable refers to what.
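
The built-in function `type` returns the type of the value a variable currently refers to, which makes the point explicit. The names `first_type` and `second_type` are introduced here for illustration.

```python
my_var = "Hello World !"
first_type = type(my_var)
my_var = 3.14
second_type = type(my_var)

# Same label, two different values, two different types.
print(first_type)    # <class 'str'>
print(second_type)   # <class 'float'>
```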

Built-in Types¶

  • Numbers : int, float, complex, bool.
  • Collections : str, tuple, list, dict, set, frozenset,...
  • Specials : NoneType is the type of None ; type is the type of all types.
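
These relations can be checked interactively with `type` and `isinstance`, as in this short sketch:

```python
print(type(None))             # <class 'NoneType'>
print(type(int))              # <class 'type'>
print(type(type))             # <class 'type'>: type is its own type

# bool is numerical: it is even a subclass of int.
bool_is_int = isinstance(True, int)
print(bool_is_int)            # True
```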

Numbers¶

Python has a single built-in type for integers, int, a single built-in type for floating point numbers, float, and a built-in complex type. The boolean type bool is also considered numerical.

int¶

The Python integers have unlimited precision !

In [5]:
val = 2**80
print(val)
1208925819614629174706176

float¶

The Python floating point numbers have a size of 64 bits.

In [9]:
f1 = 3.
f2 = -3.3
f3 = 3e10
f4 = 3.67

print(f1)
print(f2)
print(f3)
print(f4)
3.0
-3.3
30000000000.0
3.67
In [10]:
# absolute value
print(abs(-3.3))
3.3
In [11]:
# integer part
print(int(3.67))
print(int(-3.67))
3
-3
In [12]:
# rounding
print(round(3.67))
print(round(3.67, 1))
print(round(133, -1))
print(round(133, -2))
4
3.7
130
100
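
Two consequences of the 64-bit binary representation are worth a quick sketch: some decimal values cannot be stored exactly, and `round` rounds exact halves to the nearest even integer. The function `math.isclose` from the standard library is one way to compare floats with a tolerance.

```python
import math

total = 0.1 + 0.2
exactly_equal = (total == 0.3)            # False: 0.1 and 0.2 are not stored exactly
close_enough = math.isclose(total, 0.3)   # True: comparison with a tolerance
print(exactly_equal, close_enough)

# Exact halves are rounded to the nearest even integer.
print(round(2.5))   # 2
print(round(3.5))   # 4
```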

Operators¶

  • classic arithmetic : +, -, /, *
  • floor division (the result is rounded down) : //
  • power : **
  • modulo : %
In [14]:
print(3/2) # BEWARE : different behavior in Python 2
print(3/2.)
print(3//2)
print(3//2.)
1.5
1.5
1
1.0
In [15]:
print(10**2)
print(3%2)
100
1
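
With negative numbers, the floor division // rounds down, whereas int() truncates toward zero. A short sketch of the difference:

```python
print(-7 // 2)         # -4 : rounded down
print(int(-7 / 2))     # -3 : truncated toward zero
print(-7 % 2)          # 1  : the remainder has the sign of the divisor

# divmod returns the floor quotient and the remainder together.
quotient, remainder = divmod(-7, 2)
print(quotient * 2 + remainder)   # -7
```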

You have combined operators such as +=, *=, etc. But there is no ++ or --: use += 1 and -= 1 instead.

In [16]:
x=10  ; print(x)
x+=1  ; print(x)
x-=1  ; print(x)
x*=5  ; print(x)
x/=10 ; print(x)
x%=2  ; print(x)
10
11
10
50
5.0
1.0

Booleans¶

The type bool has two possible values: True and False. When involved in a numerical operation, True is considered to be 1 and False is considered to be 0.
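
This numerical behavior is handy for counting how many conditions hold. In the sketch below, `flags` and `count` are names chosen for illustration.

```python
print(True + True)     # 2
print(False * 10)      # 0

# Summing booleans counts the True values.
flags = [True, False, True, True]
count = sum(flags)
print(count)           # 3
```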

Collections¶

The Python built-in collections are heterogeneous: one can mix different types of objects in a single collection. This is very flexible... and sometimes quite slow.

The different kinds of collections are classified depending on their mutability (they can or cannot be modified after their creation), and depending on whether they are sequences (elements have a rank) or associations (elements have a key):

  • immutable sequences:
    • tuple : 'a', 'b', 'c'
    • str : 'hello' or "hello"
  • mutable sequences:
    • list : ['a', 'b', 'c']
  • mutable associations:
    • dict : { 'a' : 'val', 3 : 'x', 'key' : 124 }
    • set : {'a', 3, 'key'}
  • immutable associations:
    • frozenset : frozenset({'a', 3, 'key'})
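
A minimal sketch of what mutability means in practice: the same modification succeeds on a list and fails on a tuple. The variable names are chosen for illustration.

```python
my_list = ['a', 'b', 'c']
my_list[0] = 'z'                 # fine: a list is mutable
print(my_list)                   # ['z', 'b', 'c']

my_tuple = ('a', 'b', 'c')
try:
    my_tuple[0] = 'z'            # refused: a tuple is immutable
except TypeError as err:
    tuple_error = err
    print("TypeError:", err)

# A frozenset has no literal syntax and offers no modifying method.
my_frozenset = frozenset({'a', 3, 'key'})
print(hasattr(my_frozenset, "add"))   # False
print(hasattr(set(), "add"))          # True
```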

tuple¶

...is an immutable sequence of arbitrary values: once created, it cannot be modified. The tuple is initialized with a comma-separated list of elements. One can access an individual element with its rank between square brackets, the first rank being 0:

In [17]:
my_tuple = 11, 22, 33, 44, 55
my_tuple_len = len(my_tuple)
print(my_tuple, "has", my_tuple_len, "elements")
print("from", my_tuple[0], "to", my_tuple[my_tuple_len-1])
(11, 22, 33, 44, 55) has 5 elements
from 11 to 55

When writing the literal value of a new tuple, separate the elements with commas (,), and optionally enclose them in parentheses. IMPORTANT: what makes it a tuple is not the parentheses, but the commas (the only exception being the empty tuple, written ()). This makes it tricky to define a tuple of 1 element (sometimes you need to...):

  • a or (a) is not a tuple, but an isolated element,
  • a, or (a,) is a tuple of one element. We recommend always using parentheses, and writing (a,) when you want to build a single-element tuple.
In [18]:
a = 2
val = (a) ; print(type(val), ":", val)
val = a, ; print(type(val), ":", val)
val = (a,) ; print(type(val), ":", val)
<class 'int'> : 2
<class 'tuple'> : (2,)
<class 'tuple'> : (2,)

Operators +, *¶

...work with tuples, and with every kind of sequence:

In [19]:
my_tuple1 = 1, 2, 3
my_tuple2 = my_tuple1 + (4,)
my_tuple3 = my_tuple2 * 2
print(my_tuple3)
(1, 2, 3, 4, 1, 2, 3, 4)

Negative index¶

When a negative index is given, this means the interpreter must do reverse counting from the end of the sequence. For example [-1] is the last element from a sequence, [-2] the penultimate, etc.

In [20]:
my_tuple = 1, 2, 3, 4, 5
print(my_tuple[-1])
print(my_tuple[-2])
5
4

Slices¶

One can extract a slice from a sequence with a notation like [<begin>:<end>:<step>], where the index <end> is excluded, and where <step> is optional, with a default value of 1.

In [21]:
my_tuple = 1, 2, 3, 4, 5
my_slice = my_tuple[0:-1:2]
print(my_tuple)
print(my_slice)
(1, 2, 3, 4, 5)
(1, 3)
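
A few more slices on the same tuple, as a sketch, including a negative step which walks the sequence backwards:

```python
my_tuple = 1, 2, 3, 4, 5

print(my_tuple[1:4])     # (2, 3, 4)       : <end> is excluded, <step> defaults to 1
print(my_tuple[::2])     # (1, 3, 5)       : every second element
print(my_tuple[::-1])    # (5, 4, 3, 2, 1) : reversed
```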

The produced slice is a new object. Modifying it does not affect the original sequence.

In [22]:
my_slice += (9,)
print(my_tuple)
print(my_slice)
(1, 2, 3, 4, 5)
(1, 3, 9)

If you omit <begin>, the slice will start at the beginning of the sequence. If you omit <end>, it will stop at the end. Thus, the notation [:] will produce a complete copy of the original sequence.

In [23]:
my_tuple = 1, 2, 3, 4, 5
my_slice = my_tuple[1:]
print(my_slice)
my_slice = my_tuple[:-1]
print(my_slice)
my_slice = my_tuple[:]
print(my_slice)
(2, 3, 4, 5)
(1, 2, 3, 4)
(1, 2, 3, 4, 5)

What's more, when the sequence is mutable (a list, for example), slices can also be used on the left side of an assignment, in order to modify the original sequence.

In [24]:
my_list = [1, 2, 3, 4, 5]
my_list[0::2] = 10, 30, 50
print(my_list)
[10, 2, 30, 4, 50]
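
Two complementary points, as a sketch: with a step of 1, the assigned values may have a different length than the slice they replace, and the same kind of assignment is refused on an immutable sequence such as a tuple.

```python
my_list = [1, 2, 3, 4, 5]
my_list[1:3] = [20, 30, 35]       # two elements replaced by three
print(my_list)                    # [1, 20, 30, 35, 4, 5]

my_tuple = (1, 2, 3, 4, 5)
try:
    my_tuple[0::2] = 10, 30, 50   # a tuple cannot be modified in place
except TypeError as err:
    slice_error = err
    print("TypeError:", err)
```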

str (string of characters)¶

...is an immutable sequence of characters: once created, it cannot be modified. A literal string can be enclosed in single quotes or double quotes. Prefer the double ones, because they let you insert single quotes in the text. Otherwise, each single quote must be escaped with a backslash.

In [25]:
string1 = "tout est bon dans le python"
string2 = 'tout est bon dans le python'
string3 = "c'est comme ça"
string4 = 'c\'est comme ça'

Multi-line strings must be enclosed in triple single or double quotes:

In [26]:
sentence = """
Tout est bon dans le python.
C'est comme ça.
"""
print(sentence)
Tout est bon dans le python.
C'est comme ça.

All the operators, indexing and slicing features described about tuples are also valid for strings.

In [27]:
print(3*'atchick ' + "... " + 3*'aie ')
atchick atchick atchick ... aie aie aie 
In [28]:
my_str = "bonjour"
print(my_str[-2])
print(my_str[1:-1])
print(my_str[:-2])
print(my_str[-2:])
u
onjou
bonjo
ur

Let's also note the join method, often useful to concatenate a collection of strings with a given separator:

In [29]:
my_letters_tuple = 'a', 'b', 'c', 'd'
my_letters = '/'.join(my_letters_tuple)
print(my_letters)
a/b/c/d
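
Note that join only accepts strings: other values must be converted first. The reverse operation is the split method. A short sketch, with names chosen for illustration:

```python
numbers = [1, 2, 3]
joined = "-".join(str(n) for n in numbers)   # convert each number to str first
print(joined)                                # 1-2-3

letters = "a/b/c/d".split("/")               # the reverse operation
print(letters)                               # ['a', 'b', 'c', 'd']
```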

Again: unlike in some other programming languages, strings are immutable. Any function or method whose goal is to modify a string will return a modified copy, and will not modify the original string.

In [30]:
sentence1 = "tout est bon dans le python"
sentence2 = sentence1.replace("bon","mauvais") # IT DOES NOT MODIFY sentence1
print(sentence1)
print(sentence2)
tout est bon dans le python
tout est mauvais dans le python

list¶

...is a mutable sequence of arbitrary values: it can be modified at any time. A list is initialized with a comma-separated list of elements, enclosed in square brackets. An empty list is [].

In [31]:
my_list = [11, "33", 22, 55, 44]
print(my_list, "has", len(my_list), "elements")
print("from", my_list[0], "to", my_list[-1])
[11, '33', 22, 55, 44] has 5 elements
from 11 to 44

Since the list is mutable, one can modify element(s), remove element(s), add element(s), sort the elements, with methods or operators:

In [32]:
my_list = [11, "33", 22, 55, 44]
my_list[1] = 33
my_list.append(66)
my_list.extend([88, 99])
my_list += [77]
my_list.sort()
del my_list[0]
my_list.pop()
print(my_list)
[22, 33, 44, 55, 66, 77, 88]
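
A few details about these methods, as a sketch: sort() works in place and returns None, whereas the built-in function sorted() returns a new list; pop() returns the removed element; and sorting a list that mixes numbers and strings fails, which is why "33" was replaced with 33 above.

```python
values = [33, 11, 22]
ordered = sorted(values)      # new sorted list, "values" is left untouched
print(values, ordered)        # [33, 11, 22] [11, 22, 33]

last = values.pop()           # removes and returns the last element
print(last, values)           # 22 [33, 11]

mixed = [11, "33", 22]
try:
    mixed.sort()              # int and str cannot be compared with <
except TypeError as err:
    sort_error = err
    print("TypeError:", err)
```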

dict¶

...is a mutable association of keys to arbitrary values. The keys must be hashable, which in practice means immutable values such as numbers, strings or tuples of these. One can mix different types of keys and different types of values... yet it is not recommended. A dictionary is initialized with a comma-separated list of pairs <key>: <value>, enclosed in curly braces. An empty dictionary is {}. Elements are accessed by their key, between square brackets.

In [33]:
letter_occurences = {'a': 12, 'b': 7}
letter_occurences['c'] = 9
print(letter_occurences)
print(letter_occurences['a'])
print(letter_occurences['b'])
print(letter_occurences['c'])
{'a': 12, 'b': 7, 'c': 9}
12
7
9

A dictionary is not a sequence: its elements are reached by key, not by rank. Since Python 3.7, the insertion order of the keys is guaranteed to be preserved; in older versions it was not. Each key exists only once in the dictionary. One cannot concatenate two dictionaries with the operator +. If you ask for the value associated with a key which does not exist in the dictionary, you will get an error:

In [34]:
letter_occurences = {'a': 12, 'b': 7}
print(letter_occurences['c'])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[34], line 2
      1 letter_occurences = {'a': 12, 'b': 7}
----> 2 print(letter_occurences['c'])

KeyError: 'c'
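
When a missing key is an expected situation, the get method avoids the error by returning None, or a default value of your choice. A short sketch:

```python
letter_occurences = {'a': 12, 'b': 7}

print(letter_occurences.get('a'))       # 12
print(letter_occurences.get('c'))       # None : no error raised
print(letter_occurences.get('c', 0))    # 0    : explicit default value
print(letter_occurences)                # {'a': 12, 'b': 7} : get() adds nothing
```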

Associating a given key to some "false" value will not remove it from the dictionary. Use del if you want to really remove the key/value pair:

In [35]:
letter_occurences = {'a': 12, 'b': 7, 'c': 9}
letter_occurences['c'] = None
print(letter_occurences)
del letter_occurences['c']
print(letter_occurences)
{'a': 12, 'b': 7, 'c': None}
{'a': 12, 'b': 7}
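
As an alternative to del, the pop method removes a key and returns the associated value. Given a default as second argument, it does not fail when the key is missing. The names `removed` and `missing` are chosen for illustration.

```python
letter_occurences = {'a': 12, 'b': 7, 'c': 9}

removed = letter_occurences.pop('c')         # removes 'c' and returns its value
print(removed, letter_occurences)            # 9 {'a': 12, 'b': 7}

missing = letter_occurences.pop('z', None)   # 'z' is absent: the default is returned
print(missing)                               # None
```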

Many methods can help you with your dictionary:

In [36]:
letter_occurences = {'a': 12, 'b': 7, 'c': 9}
print(letter_occurences.keys())   # view of the keys
print(letter_occurences.values()) # view of the values
print(letter_occurences.items())  # view of the key/value tuples
print('c' in letter_occurences)
print('d' in letter_occurences)
dict_keys(['a', 'b', 'c'])
dict_values([12, 7, 9])
dict_items([('a', 12), ('b', 7), ('c', 9)])
True
False
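
Two other frequent operations, sketched below: update() merges another dictionary into the current one (existing keys are overwritten), and items() is typically used in a for loop to walk through the keys and values together.

```python
letter_occurences = {'a': 12, 'b': 7}
letter_occurences.update({'b': 8, 'c': 9})   # 'b' is overwritten, 'c' is added
print(letter_occurences)                     # {'a': 12, 'b': 8, 'c': 9}

total = 0
for letter, count in letter_occurences.items():
    total += count
print(total)                                 # 29
```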

The methods keys()/values()/items() do not return lists but iterable views: one cannot use the operator [] on them, unless one first explicitly transforms them into a list:

In [37]:
letter_occurences = {'a': 12, 'b': 7, 'c': 9}
# print(letter_occurences.keys()[0])  # TypeError: 'dict_keys' object is not subscriptable
print(list(letter_occurences.keys())[0])
a

Conversion of collections¶

One can easily convert some kind of collection into another kind. For example:

In [38]:
my_tuple = (11, 22, 33, 44, 55)
my_list = list(my_tuple) # the list constructor may receive anything iterable
my_list += [66]
print(my_list)
[11, 22, 33, 44, 55, 66]
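
The other constructors work the same way. A few more conversions, as a sketch with names chosen for illustration:

```python
letters = list("abc")                        # a string is iterable too
print(letters)                               # ['a', 'b', 'c']

unique = set([1, 2, 2, 3, 3, 3])             # a set drops the duplicates
print(sorted(unique))                        # [1, 2, 3]

pairs = [('a', 1), ('b', 2)]
my_dict = dict(pairs)                        # from a list of key/value tuples
print(my_dict)                               # {'a': 1, 'b': 2}

print(tuple(my_dict))                        # ('a', 'b') : iterating a dict gives its keys
```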

Collections assignation and copy¶

When a variable refers to a collection, and this variable is assigned to another variable, then the two variables refer to the same "shared" collection. If one modifies an element of the collection, what we call an "in-place" modification, then the two variables see the modification:

In [39]:
my_list1 = [1, 2, 3, 4, 5]
my_list2 = my_list1
my_list1[2] = 0
print(my_list1)
print(my_list2)
[1, 2, 0, 4, 5]
[1, 2, 0, 4, 5]

If you want to duplicate a given sequence into another one, you can either use a complete slice ([:]), or more explicitly the copy() method. The latter is clearer, and also works with dictionaries:

In [40]:
my_list1 = [1, 2, 3, 4, 5]
my_list2 = my_list1[:]
my_list1[2] = 0
print(my_list1)
print(my_list2)
[1, 2, 0, 4, 5]
[1, 2, 3, 4, 5]
In [41]:
my_list1 = [1, 2, 3, 4, 5]
my_list2 = my_list1.copy()
my_list1[2] = 0
print(my_list1)
print(my_list2)
[1, 2, 0, 4, 5]
[1, 2, 3, 4, 5]
In [42]:
my_dict1 = {'a': 1, 'b': 2}
my_dict2 = my_dict1.copy()
my_dict1['b'] = 0
print(my_dict1)
print(my_dict2)
{'a': 1, 'b': 0}
{'a': 1, 'b': 2}
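
The difference between sharing and copying can be checked with the operators == (same content) and is (same object). A short sketch, where `shared` and `copied` are names chosen for illustration:

```python
my_list1 = [1, 2, 3, 4, 5]
shared = my_list1           # a second label on the same list
copied = my_list1.copy()    # a distinct list with the same content

print(shared == my_list1, shared is my_list1)   # True True
print(copied == my_list1, copied is my_list1)   # True False
```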

Collections nesting and copy¶

One can easily nest any kind of collection within another collection, of the same type or not. Worth noting: if a given collection is immutable, that does not imply that its elements are immutable. Nested mutable elements, such as the inner list below, can still be modified in place:

In [43]:
my_list = [1, 2, [3, 4], 5]
print(my_list)
my_list[2][0] = 0
print(my_list)
[1, 2, [3, 4], 5]
[1, 2, [0, 4], 5]
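
The same experiment with a tuple as the outer collection, as a sketch: the tuple itself refuses any modification, yet the list it contains remains mutable.

```python
my_tuple = (1, 2, [3, 4], 5)

my_tuple[2][0] = 0            # allowed: we modify the inner list, not the tuple
print(my_tuple)               # (1, 2, [0, 4], 5)

try:
    my_tuple[2] = [9, 9]      # refused: this would modify the tuple itself
except TypeError as err:
    nested_error = err
    print("TypeError:", err)
```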

When one copies a given collection, only its top level is really duplicated, and the nested collections remain shared with the original. We call this a "shallow copy":

In [44]:
my_list1 = [1, 2, [3, 4], 5]
my_list2 = my_list1.copy()
my_list1[1] = 0
my_list1[2][0] = 0
print(my_list1)
print(my_list2)
[1, 0, [0, 4], 5]
[1, 2, [0, 4], 5]

To get a complete duplication of all nested levels, what we call a "deep copy", one can use the deepcopy() function from the copy module:

In [45]:
import copy
my_list1 = [1, 2, [3, 4], 5]
my_list2 = copy.deepcopy(my_list1)
my_list1[1] = 0
my_list1[2][0] = 0
print(my_list1)
print(my_list2)
[1, 0, [0, 4], 5]
[1, 2, [3, 4], 5]
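
The operator is makes the difference between the two kinds of copy visible: after a shallow copy the nested list is the very same object in both collections, after a deep copy it is not. In this sketch, `original`, `shallow` and `deep` are names chosen for illustration.

```python
import copy

original = [1, 2, [3, 4], 5]
shallow = original.copy()
deep = copy.deepcopy(original)

print(shallow[2] is original[2])   # True  : the nested list is shared
print(deep[2] is original[2])      # False : the nested list was duplicated
```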

Questions ?¶

See also

  • RealPython : variables in python
  • The standard type hierarchy
  • Built-in types