Chapter 19 List-like objects and slices

19.1 Lists

The list object introduced in lectures is one of the most basic data structures in Python. We have already covered the key concepts around lists, including some of the most important functions and methods that operate on them, so here we will focus only on those aspects not already covered elsewhere.

We have seen that a list is an ordered collection of objects, and we mentioned in lectures that unlike arrays we do not have higher dimensional analogues. Later in this module we will introduce some libraries that resolve this issue, but for now we should note that it is already possible to construct such examples using the basic list object.

So far we have seen lists involving strings and numbers, but we could put other objects inside, such as another list. Suppose that we form a list entirely made up of lists such as

x = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]

Then x[1] will be the list [4, 5, 6] and so we can consider the element x[1][2], which is 6. This looks a little like when we had an array A in VBA and looked at the element A[i,j]. So a list of lists is a bit like a two dimensional array in VBA, except that the "rows" do not have to be the same length (as illustrated in our example above).
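As a minimal sketch of the indexing just described, the following code picks out a row and an entry of the list x from the text. The names second_row, entry and lengths are illustrative only.

```python
# A list of lists whose "rows" have different lengths
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]

second_row = x[1]                    # the inner list [4, 5, 6]
entry = x[1][2]                      # third entry of the second row
lengths = [len(row) for row in x]    # the rows need not have equal length

print(second_row)   # [4, 5, 6]
print(entry)        # 6
print(lengths)      # [3, 3, 4]
```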

Most of the operations that we saw in lectures (append, insert, del, remove, pop, reverse, and copy) are relatively straightforward, and we do not have much more to add about them here. However, there is one important general point to be made that can otherwise cause a lot of confusion.

To illustrate this we will compare how the sort and sorted operations work.

The sort method modifies the original list in place. It does not return a value in its own right. So the code

x = ["cat", "dog", "rat"]
print(x.sort(), sorted(x))

would produce the output shown in Figure 19.1.

Figure 19.1: A comparison of sort and sorted: the output

As x.sort() does not return a value itself, the first part of the print command outputs None, which is the Python keyword for something having no value. On the other hand the sorted function does return a value (a new, sorted version of the list) and so the second half of the print command outputs the sorted list. If we want to use sort to do this we need to use something like

x = ["cat", "dog", "rat"]
x.sort()
print(x)

The append, insert, remove, and reverse methods (and the del statement) all behave like sort in that they alter the original list and do not return a value themselves. The copy method behaves like sorted as it returns a new list and leaves the original unchanged. The pop method is a combination of both behaviours: it modifies the original list and returns a value. When using a method it is important to remember exactly what it does.
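The three behaviours described above can be seen side by side in the following small sketch. The variable names result_append, y and last are illustrative.

```python
x = ["cat", "dog", "rat"]

result_append = x.append("ant")   # modifies x and returns None
y = x.copy()                      # returns a new list; x is unchanged
last = x.pop()                    # modifies x AND returns the removed item

print(result_append)   # None
print(x)               # ['cat', 'dog', 'rat']
print(y)               # ['cat', 'dog', 'rat', 'ant']
print(last)            # ant
```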

We now return to the sort method and the sorted function as these are in general a little more complicated than we saw in lectures. The syntax and features of the two are basically the same, so here we will focus on sort.

Given a list x the sort method has the following general syntax, where both parameters are optional and, if used, must be given by name:

x.sort(key=None, reverse=False)

Here reverse (as we saw in lectures) is an optional parameter which equals either True or False, and is assumed to be False by default. This controls whether we sort the list in standard or reverse order.

The key argument is more complicated. This can be set to be any function (including one defined elsewhere in the code) which is applied to each object in the list, and the list is then sorted according to the values of that function. If two elements have the same value under the given key function then their relative order is left unchanged.

For example consider the case where key = len, the length function we have seen earlier. If we apply this key to the list

a = ["a", "c", "ab", "cba", "b"]

then a.sort(key = len) converts a into

a = ['a', 'c', 'b', 'ab', 'cba']

Here the list has been sorted so that all of the strings of length 1 are at the beginning, but the relative order of these three strings has been left unchanged.
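The key can also be a function that we define ourselves. As a sketch, the hypothetical helper last_letter below (not something from lectures) sorts a list of words by their final character. Words with the same final character keep their original relative order.

```python
def last_letter(word):
    """Return the final character of a string (illustrative helper)."""
    return word[-1]

words = ["banana", "kiwi", "plum", "fig", "pea"]
words.sort(key=last_letter)

# "banana" and "pea" both end in "a", so they stay in their original order
print(words)   # ['banana', 'pea', 'fig', 'kiwi', 'plum']
```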

As an aside, we should note a very useful feature of Python. In the sort method (as in many others) we have optional parameters of the form parname=something. In the case of sort the parname can be either key or reverse. When we have such parameters then we can enter them in any order into our method, as the parname will identify which parameter is which. Thus, for example, the two expressions

x.sort(key=len, reverse=True) \(\quad\quad\) and \(\quad\quad\) x.sort(reverse=True, key=len)

are both valid, and so we do not have to remember which order to enter the parameters.
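A short check of this, using the list a from earlier, is sketched below. It also illustrates that in Python 3 the sort method only accepts these parameters by name, so leaving out the names is an error. The variables b and positional_ok are illustrative.

```python
a = ["a", "c", "ab", "cba", "b"]
b = a.copy()

a.sort(key=len, reverse=True)
b.sort(reverse=True, key=len)
print(a == b)   # True: the order of the named parameters does not matter
print(a)        # ['cba', 'ab', 'a', 'c', 'b']

# Passing the arguments by position is not accepted by list.sort
try:
    a.sort(len, True)
    positional_ok = True
except TypeError:
    positional_ok = False
print(positional_ok)   # False
```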

In lectures we assumed that all of our strings were in lower case. The reason for this is that Python (rather perversely, one might argue) sorts strings by comparing the numerical codes of their characters (the ASCII values, for ordinary English letters). What this means is that all upper case letters come before all lower case letters!

Thus the default sort method will take the list

x = ["cat", "dog", "Fish", "Alligator", "Camel"]

and convert it into

x = ["Alligator", "Camel", "Fish", "cat", "dog"]

If we want to sort our list in the standard alphabetical manner, we should use key=str.lower which treats all of the items as if they are lower case and then orders them. So the method call x.sort(key=str.lower) will convert x into

x = ["Alligator", "Camel", "cat", "dog", "Fish"]

which is how we expect such a list to be sorted.
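The same key works with sorted, which is convenient if we want to keep the original list as it was. The sketch below compares the two orderings. The names default_order and alphabetical are illustrative.

```python
x = ["cat", "dog", "Fish", "Alligator", "Camel"]

default_order = sorted(x)                 # upper case letters sort first
alphabetical = sorted(x, key=str.lower)   # case is ignored when comparing

print(default_order)   # ['Alligator', 'Camel', 'Fish', 'cat', 'dog']
print(alphabetical)    # ['Alligator', 'Camel', 'cat', 'dog', 'Fish']
print(x)               # sorted() leaves the original list untouched
```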

19.2 Slicing

There was quite an extensive discussion of slicing in lectures, so we do not have much to add here. Slicing can be applied to any data type that has an order. We saw it applied to lists, strings, and ranges, but we could also have applied it to tuples. As a set is not ordered we cannot apply a slice to a set.

In lectures we used slices to describe certain parts of a list or string or range. It is also possible to use slices to change parts of an ordered object (assuming it is possible to modify the object at all, so this would not apply to tuples, strings or ranges).

Suppose we have the list x = [1, 2, 3, 4, 5, 6]. We can modify part of such a list using a slice. If we have a list y then the command

x[:3] = y

would replace the first three entries in x by the entries of y. The replacement does not need to have the same number of entries as what is being replaced. So if y = ["red"] then the above command would change x to be

x = ["red", 4, 5, 6]
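The following sketch runs the example above and then shows one further use of the same idea: assigning to an empty slice inserts new entries without removing anything. The second assignment is an illustrative addition.

```python
x = [1, 2, 3, 4, 5, 6]

x[:3] = ["red"]               # three entries are replaced by one
print(x)                      # ['red', 4, 5, 6]

x[1:1] = ["green", "blue"]    # an empty slice: nothing is removed
print(x)                      # ['red', 'green', 'blue', 4, 5, 6]
```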

We can also use slices with the del command to remove a slice from a list.
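For instance, a minimal sketch of del applied to two different slices of the same list might look as follows.

```python
x = [1, 2, 3, 4, 5, 6]

del x[1:3]     # removes the entries at positions 1 and 2
print(x)       # [1, 4, 5, 6]

del x[::2]     # removes every second entry, starting from the first
print(x)       # [4, 6]
```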

19.3 Ranges, tuples and sets

As we saw in lectures, the range function generates a sequence of integers (depending on the start, stop and step values that are assigned). At first sight, the idea of having a range object that is different from a list may seem a bit redundant. Would it not be simpler just to have the function make a list of the numbers generated?

When Python creates a range it does not generate all of the numbers at once; instead it calculates each number as it is required. This means that it takes up very little memory and is much more efficient than if it were a list object. This may not seem very important right now, but in real life we may be dealing with problems involving very large sets of data, and so being efficient becomes increasingly important.
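One rough way to see the difference in memory is with sys.getsizeof from the standard library, as in the sketch below. The exact numbers depend on the Python version and machine, so none are quoted here. The names r and as_list are illustrative.

```python
import sys

r = range(1_000_000)
as_list = list(r)

# getsizeof reports the size in bytes of the container object itself
print(sys.getsizeof(r))         # small, and the same for any length of range
print(sys.getsizeof(as_list))   # grows with the number of entries

# A range still supports len, indexing and membership tests
print(len(r), r[10], 999_999 in r)   # 1000000 10 True
```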

If you want to go through a range in reverse order you can use the reversed function, for example reversed(range(2,6)) will generate the sequence of numbers 5, 4, 3, 2.
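A quick check of this example is sketched below. It also gives an alternative using a negative step, which is an addition to the text above. The names backwards and with_step are illustrative.

```python
backwards = list(reversed(range(2, 6)))
print(backwards)     # [5, 4, 3, 2]

# The same numbers can be produced directly with a negative step
with_step = list(range(5, 1, -1))
print(with_step)     # [5, 4, 3, 2]
```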

In lectures we saw how to use a list comprehension with a range to generate a list. As mentioned there, we can also use list comprehension where the range is replaced by another list. For example the code

x = ["red", "green", "blue"]
y = [len(i) for i in x]
print(y)

will create the list y = [3, 5, 4].

Tuples are relatively straightforward, being very similar to lists except that they cannot have elements added or removed. The only subtlety involving tuples arises if we want to have a tuple containing just one element. We cannot define such a tuple by typing just a single object in brackets, because Python reads (4) as simply the number 4; instead we have to include a comma after the object. So if we wanted a tuple containing just the number 4 we would have to type (4,).
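The difference the comma makes can be checked with the type function, as in this small sketch. The names not_a_tuple and one_tuple are illustrative.

```python
not_a_tuple = (4)     # just the integer 4 written in brackets
one_tuple = (4,)      # a tuple with a single element

print(type(not_a_tuple).__name__)   # int
print(type(one_tuple).__name__)     # tuple
print(len(one_tuple))               # 1
```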

Sets are rather different from the other collections that we have considered in this chapter, as they are unordered. Just as in a mathematical set we are also not allowed to have any repeats. So in Python the following sets are all equal:

{1, 2, 3} == {3, 1, 2} == {1, 2, 3, 2, 1}

Although sets can contain elements of different types, they cannot contain a list or another set as an element. So the notion is not as powerful as the notion of a set in mathematics.
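The sketch below illustrates this restriction: trying to put a list inside a set raises a TypeError. That a tuple is accepted as an element is an extra fact not stated above. The names mixed and list_allowed are illustrative.

```python
mixed = {1, "two", (3, 4)}     # numbers, strings and tuples are fine

try:
    bad = {1, [2, 3]}          # a list cannot be an element of a set
    list_allowed = True
except TypeError:
    list_allowed = False

print(len(mixed))      # 3
print(list_allowed)    # False
```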

One advantage of a set over a list is that it is stored in a very efficient way which means that Python can determine whether something is an element of a set much more quickly than for a corresponding list of the same size. As with the range function discussed above, this becomes more important as we deal with larger and larger sets of data.
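A rough way to observe this is with the timeit module from the standard library, as sketched below. The sizes chosen are arbitrary and the timings will vary from machine to machine, so no particular numbers are claimed. The names as_list, as_set and target are illustrative.

```python
import timeit

n = 20_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1    # the last element: the slowest case for the list

# Time 200 membership tests against each collection
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.5f} s, set: {set_time:.5f} s")
```

On a typical machine we would expect the set lookup to be much quicker, with the gap growing as n increases.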

We saw in lectures that we can add elements to a set using the add or update methods, and remove elements with the remove or discard methods. The other set operations that we saw involve pairs of sets, and correspond to the standard operations on sets that we use in mathematics. Each of these takes a pair of sets and creates a new set from them.
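As a brief sketch of the four methods for adding and removing elements: the difference between remove and discard shown at the end (remove raises a KeyError for a missing element, discard does not) is standard Python behaviour but is an addition to the text above. The name remove_failed is illustrative.

```python
s = {1, 2, 3}
s.add(4)               # add a single element
s.update([5, 6, 1])    # add several elements; the repeated 1 is ignored
s.remove(2)            # remove an element that is present
s.discard(99)          # discard does nothing if the element is absent

try:
    s.remove(99)       # remove raises KeyError for a missing element
    remove_failed = False
except KeyError:
    remove_failed = True

print(s == {1, 3, 4, 5, 6})   # True
print(remove_failed)          # True
```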

We have the union and intersection operations; given two sets a and b these are given by

a | b \(\quad\quad\) (or a.union(b))

and

a & b \(\quad\quad\) (or a.intersection(b))

respectively. The set difference operation

a - b \(\quad\quad\) (or a.difference(b))

gives the set containing all elements of a that are not also elements of b. The symmetric difference operation

a ^ b \(\quad\quad\) (or a.symmetric_difference(b))

gives the set containing all elements of the union of a and b that are not in the intersection of these two sets.
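The four operations can be tried out on a pair of small sets, as in the sketch below. The particular sets a and b are chosen purely for illustration. The last line checks the description of the symmetric difference given above.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(a | b == {1, 2, 3, 4, 5})   # union: True
print(a & b == {3, 4})            # intersection: True
print(a - b == {1, 2})            # difference: True
print(a ^ b == {1, 2, 5})         # symmetric difference: True

# The symmetric difference is the union with the intersection removed
print((a ^ b) == (a | b) - (a & b))   # True
```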

As with lists, we can apply the function len to a set, and also the functions min, max, and sum when the set consists entirely of numbers. We can also apply the sorted function to a set to create a list of the elements in order.
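For example, with a small set of numbers chosen for illustration (the names s and ordered are illustrative):

```python
s = {7, 3, 9, 3, 1}     # the repeated 3 is stored only once

print(len(s))    # 4
print(min(s))    # 1
print(max(s))    # 9
print(sum(s))    # 20

ordered = sorted(s)     # a list, not a set
print(ordered)   # [1, 3, 7, 9]
```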