Categories

## Leaky method references

After reading my last post regarding __del__, you should know that __del__ + reference cycle = leak.
Let’s say that you do need to use __del__, so you decide to avoid reference cycles. You write your code in such a way as to use the minimum necessary cycles, and for the ones that remain you use the weakref module.

You might still have cycles where you don’t expect them: in references to methods.
Consider the following piece of code. Can you spot the reference cycle?

```python
class A(object):
    def f(self):
        return


a = A()
a.g = a.f
```

This code has the following reference cycle: a -> a.g (a bound method created from a.f) -> a.
How come?
When you call a.f like so: “a.f()”, two things happen:
1. A.f is bound to a.
2. The bound method is called, with a passed as its first parameter (self).

You may think of “a.f” as syntactic sugar for partial function application: A.f gets a as its first argument, but doesn’t get called yet.

When you write “a.g = a.f”, a ends up holding a reference to a bound method, which in turn holds a reference to a.
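To see these references for yourself, here is a small sketch that inspects the bound method. It relies on the standard `__self__` and `__func__` attributes of bound methods; the `functools.partial` comparison is only an analogy, and the variable names are made up for illustration.

```python
import functools


class A(object):
    def f(self):
        return


a = A()
a.g = a.f

# The bound method stored on the instance remembers the instance it was bound to.
bound_to_a = a.g.__self__ is a

# It also remembers the plain function defined in the class body.
wraps_class_function = a.g.__func__ is A.__dict__['f']

# Roughly the same thing, spelled as an explicit partial application (an analogy only).
explicit = functools.partial(A.__dict__['f'], a)
explicit_holds_a = explicit.args[0] is a
```

Since a holds a.g in its attributes and a.g holds a through `__self__`, the two objects keep each other alive.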

An idiom that uses these cycles is implementing state machines. Consider the following example code:

```python
class MyMachine(object):
    def __init__(self):
        self.next_func = self.state_a

    def run(self, input):
        for x in input:
            self.next_func(x)

    def state_a(self, value):
        print 'a: ', value
        self.next_func = self.state_b

    def state_b(self, value):
        print 'b: ', value
        self.next_func = self.state_a
```

Of course, my code was a bit more complicated than that, but the basic idea remains. (My code usually created some kind of function table in __init__ that was used to look up the next function, and the lookups happened outside the “state functions”.) I’ve seen many state machine recipes include method references, and rightly so. It’s a clear and easy way to code a state machine. (For example, this state machine recipe).
Be careful though – once you add __del__ to these simple recipes you might end up with a memory leak.
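If you want to keep the state machine idiom but avoid the cycle, one possible approach (a sketch, not the only option) is to store the name of the next state and look the method up when it is needed. The `next_state` and `seen` attributes below are hypothetical names chosen for this example, and the final check relies on CPython's reference counting.

```python
import weakref


class MyMachine(object):
    def __init__(self):
        # Store the *name* of the next state, not a bound method,
        # so the instance never holds a reference back to itself.
        self.next_state = 'state_a'
        self.seen = []

    def run(self, inputs):
        for x in inputs:
            # The bound method is created here and dropped right after the call.
            getattr(self, self.next_state)(x)

    def state_a(self, value):
        self.seen.append(('a', value))
        self.next_state = 'state_b'

    def state_b(self, value):
        self.seen.append(('b', value))
        self.next_state = 'state_a'


machine = MyMachine()
machine.run([1, 2, 3])
seen = machine.seen

# With no cycle, dropping the last reference frees the object right away
# on CPython, without any help from the cycle collector.
probe = weakref.ref(machine)
del machine
freed_without_gc = probe() is None
```

The trade-off is a string lookup on every transition, and typos in state names only show up at run time.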

Short note: I was going to publish this post a few days ago, but kylev beat me to it. This just goes to show that other people encountered this kind of cycle.


## Python Gotchas No. 2: Garbage Collection Oddities

Python is a garbage-collected language. The garbage collector collects orphaned objects, that is, objects that nothing references any more.
If an object has a __del__ method, it will be called when that object is collected. Note however, that there are no guarantees as to when this will take place. This means that while you should release all owned resources in the __del__ method, you should not depend on it to release these resources when the object “goes out of scope” as in C++.
Note specifically that del x does not call x’s __del__ method; it just removes this specific reference to x.
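A short sketch of that last point. The `Resource` class and the `calls` list are hypothetical names used only for this illustration, and the timing of the second step relies on CPython's reference counting.

```python
calls = []


class Resource(object):
    def __del__(self):
        # Record the call instead of printing, so it is easy to inspect.
        calls.append('finalized')


x = Resource()
y = x                                 # a second reference to the same object

del x                                 # removes only the name x
calls_after_first_del = list(calls)   # still empty: y keeps the object alive

del y                                 # the last reference is gone
calls_after_second_del = list(calls)  # on CPython, __del__ has run by now
```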

There is another interesting caveat regarding Python’s gc. Consider the following code:

```python
class A(object):
    pass


a = A()
b = A()
a.x = b
b.x = a
del a
del b
```

Did the two objects lose all their references? Well, they didn’t. They’re still pointing at each other, creating a reference cycle. Happily, Python’s garbage collector can still handle those. However, before Python 3.4 it can’t handle reference cycles in which at least one object has a __del__ method.
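You can watch the collector deal with the plain cycle (no __del__ involved) using a weak reference as a probe. This is a sketch that assumes CPython; automatic collection is switched off for the duration so that the timing is predictable.

```python
import gc
import weakref


class A(object):
    pass


gc.disable()  # keep the collector from running on its own during the demo

a = A()
b = A()
a.x = b
b.x = a

probe = weakref.ref(a)  # lets us check on the object without keeping it alive
del a
del b

alive_after_del = probe() is not None      # the cycle keeps both objects alive
gc.collect()                               # ask the cycle collector to run
alive_after_collect = probe() is not None  # now the cycle has been reclaimed

gc.enable()
```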
To understand why, consider the above example, only this time assume A has a __del__ method that calls self.x.release().

Now, which __del__ method should be called first? Once it has been called, is the other object still valid? Python refuses the temptation to guess, and leaves the cycle be, creating a memory (or resource) leak.
The solution?
1. Avoid data structures with reference cycles. For instance, there are very few instances where you’d need a doubly linked list :)
2. If you do need reference cycles, consider using the weakref module, which allows you to create references that “don’t count” in the eyes of the gc.
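As a sketch of the second option, here is a two-way link in which only the forward reference is strong. The `Node` class and its attribute names are hypothetical, and the immediate cleanup shown at the end relies on CPython's reference counting.

```python
import weakref


class Node(object):
    def __init__(self, name):
        self.name = name
        self.next = None    # strong reference to the following node
        self._prev = None   # weak reference to the preceding node

    def link(self, other):
        self.next = other
        other._prev = weakref.ref(self)  # does not count as a reference for the gc

    @property
    def prev(self):
        # Calling a weak reference returns the object, or None if it is gone.
        return self._prev() if self._prev is not None else None


head = Node('head')
tail = Node('tail')
head.link(tail)
back_link_works = tail.prev is head

del head                     # nothing else holds a strong reference to the head node
prev_after_del = tail.prev   # the weak back-reference did not keep it alive
```

The price is that every use of the back-reference has to cope with getting None.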



## Python Gotchas 1: __del__ is not the opposite of __init__

After discussing my last post with a friend and talking about a few other issues, we came to the conclusion that it would be worthwhile to discuss more gotchas.

First though, what is a gotcha? Wikipedia gives a good definition:

In programming, a gotcha is a feature of a system, a program or a programming language that works in the way it is documented but is counter-intuitive and almost invites mistakes because it is both enticingly easy to invoke and completely unexpected and/or unreasonable in its outcome.

If you come from C++ or a similar background, you are probably well versed in object-oriented concepts, specifically constructors and destructors. The usual expectation is to have the destructor called only for fully constructed objects, i.e., objects whose constructor returned without raising an exception.

If the constructor raises an exception, it is expected to “clean up after itself”, and not expect the destructor to run.

Since in Python __init__ is the de-facto constructor, and __del__ is considered the destructor, most people expect this line of reasoning to work with __init__ and __del__.
This is mistaken. __del__ is not the opposite of __init__, but rather of __new__, which means that if __init__ raises an exception, __del__ will still be called.
I’ve run into this issue myself several times in the past. Consider the following sample code:

```python
class A(object):
    def __init__(self, x):
        if x == 0:
            raise Exception()
        self.x = x

    def __del__(self):
        print self.x
```

This code demonstrates a common case: a constructor that might fail, and a destructor that does something with the instance’s members. If you try to instantiate A with x = 0, you’ll get an exception. This is to be expected.
However, what is less expected is the message you get when the partially constructed A is garbage-collected (which may be any time later, and not necessarily right away):

```Exception exceptions.AttributeError: "'A' object has no attribute 'x'" in <bound method A.__del__ of <__main__.A object at 0x02449570>> ignored```

What happened is that __del__ was called even though __init__ raised an exception. When __del__ tried to access self.x it got an AttributeError, because the attribute had not been set yet.
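Here is a variation of the sample that records what __del__ sees instead of printing it, so the behavior is easy to check. The `calls` list, the `build` helper and the use of ValueError are assumptions made for this sketch, and the timing relies on CPython's reference counting.

```python
calls = []


class A(object):
    def __init__(self, x):
        if x == 0:
            raise ValueError('x must not be 0')
        self.x = x

    def __del__(self):
        # Record whether the attribute was ever set, instead of printing it.
        calls.append(hasattr(self, 'x'))


def build(x):
    # Wrapping the attempt in a function makes sure the failed object and
    # the exception that refers to it are both gone once we return.
    try:
        A(x)
    except ValueError:
        return 'init failed'
    return 'ok'


outcome = build(0)
# At this point calls holds a single False: __del__ ran on an object
# whose __init__ never got as far as setting self.x.
```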

The solution?
1. Don’t use __del__ unless you really have to. I’m going to write more about it soon.
2. If you do use __del__ make sure you are covered for any case in which __init__ didn’t finish running.
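One way to follow the second piece of advice (a sketch under my own assumptions, not the only way) is to give every attribute that __del__ touches a safe default before anything in __init__ can fail, and to have __del__ check for that default. The `released` list and the `build` helper are hypothetical names, and ValueError stands in for the original generic Exception.

```python
released = []


class A(object):
    def __init__(self, x):
        # Set a safe default first, so __del__ always finds the attribute.
        self.x = None
        if x == 0:
            raise ValueError('x must not be 0')
        self.x = x

    def __del__(self):
        # Cope with an instance whose __init__ did not finish.
        if getattr(self, 'x', None) is None:
            return
        released.append(self.x)


def build(x):
    try:
        A(x)
    except ValueError:
        return 'failed'
    return 'ok'


results = [build(0), build(5)]
# On CPython, only the fully constructed instance reaches the cleanup code,
# so released ends up holding just the value 5.
```

The getattr call with a default is a second line of defense in case a subclass skips the base __init__ entirely.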