Recitation 5: The Real Type System

Abstraction. You’ve been studying it all quarter long.

It’s when we abstract away all the nitty-gritty details that we don’t care about and just focus on the matter at hand. We assume that all these other important things are taken care of by people smarter than us, and that we are just responsible for this one tiny bit of code.

A perfect example is cs1lib. We have provided you with a list of functions, like draw_rect, which draws a rectangle. How does draw_rect work? Don’t worry about it. It’s abstracted away. You don’t care how it works. For now, you’re given this tool, and you use it freely.

But as you progress through Computer Science, you start peeling back layers of abstraction – all the way down to the Operating System level, if not further. We’ve been peeling back these layers of abstraction week after week, and it’s time to pull back the final layer of abstraction on types.

At least, the final layer that we will be discussing in CS1.

Yes, this means that after this lecture we will finally have a complete picture of what the typing system in Python is and how it works.

Recap

We started out by defining primitive types. These included things like strings, integers, and floats. The point of teaching these was to explain how call by value works: when you pass a parameter by value, the formal parameter receives a copy of the value, not the actual parameter itself.

We then introduced complex types for things like lists. Here, we have call by reference, where the actual reference is given as the parameter, so when you make changes to the object you are referencing, you are changing the actual parameter as well as the formal parameter.

This explains the mystery of the following code:

a = [1, 2, 3]

def stuff(some_list):
    some_list.append(4)

stuff(a)  # call the function so that it actually appends to a
assert 4 in a

Call By Object

Okay great, so we’re all on the same page. Unfortunately, everything I just told you was a lie. Sort of. :)

This is yet another moment of peeling back layers of abstraction. Up until now, you had no idea what objects were, so if I told you Python used call by object, it would have been nonsense.

Everything in Python is an object. Everything. Furthermore, there are two types of objects: mutable and immutable.

A mutable object is one that can be changed. For example, a list is a mutable object. A list can be appended to, elements can be deleted, etc. There are many things that can be done to the list, as we saw in the above code example.

An immutable object is one that cannot be changed.

Let’s take a string for example. In some languages, there is a type called character, and a string is nothing more than a list of characters. In fact, they are interchangeable. Python is not one of those languages.

It may seem like it.

bro1 = "Why weren't you in class today?"
bro2 = "What? We never have class on Fridays."
bro3 = "Dude... Today is Monday..."

print bro1[0]
print bro2[-1]
print len(bro3)

Strange, isn’t it? I can take the length of a string, I can index into it… I seem to be able to do everything that I could with a regular list! Nope. There is one thing you can’t do.

bro1[0] = 'H'

This gives you the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

A TypeError! Ah-ha! We’ve seen that before. Python is saying you’ve confused the type str for a mutable type, which it is not. Strings are unchangeable! But then… how does this happen?

a = "'Sup..."
a = "Well hello there, my good fellow."

You might think that, well hey! a is of type str, and it changed its value! Doesn’t that mean that strings can change value?

Nope! Remember, everything is an object. That means that somewhere in your computer’s memory, there is an object that has the value "'Sup...". The variable a, however, just points to the address in memory where that object lives. So when you try to do

a[0] = 'H'

you’re trying to change the value of the object. But when you do

a = "Something else"

you’re not changing "'Sup...". Instead, you’re saying “Hey variable a, forget about that other object "'Sup...". Instead, I want you to point to this new object I’ve created.”

Python has no problem with that, and it will happily let you point a to a new object!
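Here is a minimal sketch of that difference. The second name, original, is an assumption added for illustration (it is not part of the notes above); it keeps the first string around so we can look at it afterwards. The prints use parentheses so the snippet runs on both Python 2 and Python 3.

```python
a = "'Sup..."
original = a              # hypothetical second name for the same string object

# "Changing" the first letter really means building a new string
# and pointing a at it; the old object is left alone.
a = "H" + a[1:]

print(a)                  # HSup...
print(original)           # 'Sup...
print(a is original)      # False: a now points to a different object
```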

If this is still confusing, think about it like this. Python doesn’t really have variables. Variables generally hold information. Instead, Python has names. You can associate every object with one or more names. So when I do

a = [1, 2, 3, 4]
b = a

I haven’t created two lists, right? No of course not. Instead, I’ve given the object [1, 2, 3, 4] two names, so now Python knows when I refer to a or b, I’m actually referring to that list object. So when I first have

a = "Something"

and then change it to

a = "Something else"

it’s not the string object I’m changing; instead, I’m just taking the name a and assigning it to a different object instead! (Python doesn’t allow multiple different objects to have the same name. That would be super confusing…)
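The idea of names can be checked directly with the is operator, which asks whether two names refer to the very same object. This is only a sketch reusing the a and b names from the list example above:

```python
a = [1, 2, 3, 4]
b = a                     # a second name for the same list object

b.append(5)               # change the object through one name...
print(a)                  # [1, 2, 3, 4, 5]  ...and see it through the other
print(a is b)             # True

a = [1, 2, 3, 4, 5]       # rebind a to a brand-new list with equal contents
print(a == b)             # True: the two lists hold the same values
print(a is b)             # False: but they are two separate objects
```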

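So what’s call by object? Well, it behaves just like call by value, but for objects! It is what you see when the object is immutable: the function cannot change the object it was handed, so the caller’s object always comes out untouched.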
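For instance, here is a small sketch with a string. The function shout and the names greeting and result are made up for this illustration:

```python
def shout(message):
    # message is a local name; this line points it at a new string
    # and leaves the caller's string object alone.
    message = message + "!!!"
    return message

greeting = "hello"
result = shout(greeting)

print(result)    # hello!!!
print(greeting)  # hello
```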

If the object is mutable (so far we’ve only seen lists, but we’ll see others), Python uses call by object reference, where it passes not a copy of the object but the actual reference to the area in memory where the object lives.

This means that for the most part, the objects and classes you define are using call by object reference, and not call by object.
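A rough sketch of what that looks like with a list. The functions mutate and rebind are hypothetical names chosen for this example. Changing the object in place is visible to the caller, while pointing the local name at a new object is not:

```python
def mutate(items):
    items.append(4)      # changes the caller's list in place

def rebind(items):
    items = [0]          # only points the local name at a new list

numbers = [1, 2, 3]

mutate(numbers)
print(numbers)           # [1, 2, 3, 4]

rebind(numbers)
print(numbers)           # [1, 2, 3, 4]  (unchanged by rebind)
```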

Implications

You might think this is not an important difference, but it leads to some interesting results that call by value and call by reference alone cannot explain. Let’s take an example.

In Python, you can give functions default arguments, like so.

def action_russia(olympics=False):
    if olympics:
        return "invade Crimea"

Whenever you call action_russia without an argument, olympics is set to False; otherwise, the value you pass overrides the False.

Pretty neat! This way you can effectively have two action_russia functions – one that takes no parameters, and another that takes one parameter.
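As a quick sketch, here are both ways of calling it, using the definition from above. Note that the function has no return statement for the False case, so it returns None there.

```python
def action_russia(olympics=False):
    if olympics:
        return "invade Crimea"

without_arg = action_russia()       # olympics falls back to False
with_arg = action_russia(True)      # olympics is overridden with True

print(without_arg)                  # None
print(with_arg)                     # invade Crimea
```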

But what happens when you give a mutable type as the default value? Unexpected things.

def action_russia(action, list_of_actions=[]):
    print "Russia will do: " + action
    list_of_actions.append(action)
    return list_of_actions

Simple enough. Let’s call the function and see what happens.

some_list = action_russia("Mother Russia awaits the global Communist revolution.")
print some_list

Great. We provide an action and call action_russia without the second parameter, meaning it starts out as an empty list. That means that some_list, which is returned by action_russia, should be a list containing a single element – "Mother Russia awaits the global Communist revolution.". Is that what happens?

['Mother Russia awaits the global Communist revolution.']

Great! So everything works, right? Not quite. Let’s see what happens when I do this…

some_new_list = action_russia("Invade Ukraine.")
print some_new_list

So it should only have "Invade Ukraine." in the list, right?

['Mother Russia awaits the global Communist revolution.', 'Invade Ukraine.']

Erm.. What?

As it turns out, functions are objects too, and a function’s default values are stored along with it. The default [] that we put in the function header is created when the function is defined – once, and only once. Which means that every call that leaves out list_of_actions refers to that same list object, and every append piles onto it.
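You can see the shared default for yourself. The sketch below trims the function down to the essentials and peeks at the __defaults__ attribute, where Python keeps a function’s default values. The action_russia_fixed variant is a hypothetical addition, not something from these notes: it shows one common way around the surprise, using None as the default and building a fresh list on each call.

```python
def action_russia(action, list_of_actions=[]):
    list_of_actions.append(action)
    return list_of_actions

first = action_russia("one")
second = action_russia("two")
print(first is second)               # True: both names refer to the one default list
print(action_russia.__defaults__)    # (['one', 'two'],)

# Hypothetical fix: use None as the default and create a new list
# inside the function every time no list is passed in.
def action_russia_fixed(action, list_of_actions=None):
    if list_of_actions is None:
        list_of_actions = []
    list_of_actions.append(action)
    return list_of_actions

print(action_russia_fixed("one"))    # ['one']
print(action_russia_fixed("two"))    # ['two']
```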

This is one of the many reasons call by object is not the same as call by value or call by reference. Now that you know objects, we no longer need to refer to this by the incorrect names “call by value” or “call by reference”.

Let me be clear: call by value and call by reference do exist in other languages (C++, for one, has both), and they are helpful to know for future purposes and for conceptual clarity. But in Python, we have call by object and call by object reference.