Small Programming Challenge no. 5 – Generating a Permutation

I thought of this one quite a long time ago, and I believe that the idea behind it is pretty nice mathematically. I got the idea for it from Knuth’s “The Art of Computer Programming”.

The challenge is simple:
write a function that receives as arguments two numbers, n, and num such that 0 <= num < n!. This function needs to return an array (list) representing a permutation of the numbers 0..n-1. For each possible num, the function needs to return a different permutation, such that over all values of num, all possible permutations are generated. The order of permutations is up to you.

The function you write should do this in at most O(n) time & space (Various O(nlogn) are also acceptable).
Write your solutions in the comments, in [ LANG ] [/ LANG ] blocks (without the spaces) where LANG is preferably Python :). I will post my solution in a few days. As usual, the most efficient & elegant solution wins.

Go!

Posted in Challenges, computer science, Programming | Tagged , | 15 Comments

And now for something completely different

My friend Yuval whom you might already know from the comments here, apparently composed music for the Python Zen. It made me laugh today, and as it’s been a long day, I thought it’s worth sharing here. Especially as it is Python related.

Posted in Humour, Python | Tagged , , , | Leave a comment

Leaky method references

After reading my last post regarding __del__, you should know that __del__ + reference cycle = leak.
Let’s say that you do need to use __del__, so you decide to avoid reference cycles. You write your code in such a way as to use the minimum necessary cycles, and for the ones that remain you use the weakref module.

You might still have cycles where you don’t expect it – in references to methods.
Consider the following piece of code. Can you spot the reference cycle?

class A(object):
    def f(self):
        return
 
a = A()
a.g = a.f

This code has the following reference cycle: a -> a.g -> a.f -> a.
How come?
When you call a.f like so: “a.f()” two things happen:
1. A.f is bounded to a
2. The bounded A.f is called with the first parameter getting the bounded value.

You may consider that “a.f” is syntactic sugar for the partial function application, A.f gets a as a first argument but doesn’t get called yet.

When you use “a.g = a.f” what actually happens is a holding a reference to a bounded method, which holds a reference to a.

An idiom that uses these cycles is implementing state machines. Consider the following example code:

class MyMachine(object):
    def __init__(self):
        self.next_func = self.state_a
    def run(self, input):
        for x in input:
            self.next_func(x)
    def state_a(self, value):
        print 'a: ', value
        self.next_func = self.state_b
    def state_b(self, value):
        print 'b: ', value
        self.next_func = self.state_a

Of course, my code was a bit more complicated than that, but the basic idea remains. (My code usually created some kind of function table in __init__ used to lookup the next function, and lookups happened outside “state functions”). I’ve seen many state machine recipes include method references – and rightly so. It’s a clear and easy way to code a state machine. (For example, this state machine recipe).
Be careful though – once you add __del__ to these simple recipes you might end up with a memory leak.

Short note: I was going to publish this post a few days ago, but kylev beat me to it. This just goes to show that other people encountered this kind of cycle.

Posted in Python | Tagged , , , , | Leave a comment

Python Gotchas No. 2: Garbage Collection Oddities

Python is a garbage collected language. The garbage collector will collect orphaned objects. These are objects that have no references.
If an object has a __del__ method, it will be called when that object is collected. Note however, that there are no guarantees as to when this will take place. This means that while you should release all owned resources in the __del__ method, you should not depend on it to release these resources when the object “goes out of scope” as in C++.
Specifically note that del x does not call x’s __del__ method, just removes this specific reference to x.
To be explicit about resource management, call the appropriate function yourself in the flow of your program, or better yet, use try-finally or the with statement. (For more information about those, see my presentation about advanced python subjects).

There is another interesting caveat regarding Python’s gc. Consider the following code:

class A(object):
    pass
 
a = A()
b = A()
a.x = b
b.x = a
del a
del b

Did a and b lose their references? Well, they didn’t. They’re still pointing at each other, creating a reference cycle. Happily, Python’s garbage collector can still handle those. However, it won’t be able to handle reference cycles if at least one object in the cycle has a __del__ method.
To understand why, consider the above example, only this time assume A has a __del__ method that calls self.x.release().

Now, which __del__ method should be called first? if it is called, is the other one still valid? Python refuses the temptation to guess, and leaves the cycle be, creating a memory (or resource) leak.
The solution?
1. Avoid data structures with reference cycles. For instance, there are very few instances where you’d need a doubly linked list :)
2. If you do need reference cycles, consider using the weakref module, which allows you to create references that “don’t count” in the eyes of the gc.

Further reading material:
1. The __del__ method.
2. The gc (garbage collector) module.

Posted in Programming, Python | Tagged , , , | 1 Comment

PyWeb-IL presentation: Advanced subjects in Python

Yesterday I gave a presentation at PyWeb-IL, which took place at Google’s offices in Tel-Aviv. The presentation went really well, and interested many people.

Here are the slides for the presentation, “Advanced Python Subjects”.

Posted in Python, Teaching Programming | Tagged | Leave a comment

PythonTurtle delivers!

A few days ago, a coworker asked me what tool he should use to teach another non-programming coworker some programming.PythonTurtle

I thought a little, and suggested PythonTurtle, and then also demonstrated the builtin turtle.

I thought nothing much of it, but a few days later, when I entered his office, the non-programming coworker was sitting there, trying to draw a star of david! She said he left her to do this, and that it was really cool, and then she asked me “how can I repeat commands instead of typing them again?”.

I showed her “for i in range(10)”, and then a function definition. She was really excited about those. I then helped her a little with the star of david, showed her some fractals, and gave her some new homework.
My conclusions from her reactions:

  1. PythonTurtle is especially suited for non-programmers. Because of the non-intimidating UI, she did not realize she was programming until we specifically told her she was. She thought this was some introductory stuff before she starts “real programming”. I believe this is very important, as while the classic turtle has the same capabilities, it would have been very different working from a “scary” interactive prompt. (The friendly turtle image is a definite plus in that regard! :)
  2. When teaching with PythonTurtle and using “for i in range(10)” and “def triangle():”, loops and functions were very intuitive. She was actually asking for such tools, and when I told her about them, she understood them almost immediately!
  3. Any programming teaching aid that makes the student actually excited about loops and functions should be respected :)
  4. Since the goal was to draw specific shapes, it was easy getting the “success high” associated with programming, and it kept her really excited. It was amazing to see her reaction when she finished writing the triangle function, not sure if she wrote it right, and then calling it, and seeing it draw a triangle correctly.
  5. Even with all the praise I write here, it should still be taken with a grain of salt. It’s possible that a lot of her enthusiasm was because of her personality, and less because of PythonTurtle itself.

Final verdict: next time I need to teach someone to program, I’m going to reach for PythonTurtle.

Posted in Programming, Python, Teaching Programming | Tagged , , , | 1 Comment

Python Gotchas 1: __del__ is not the opposite of __init__

After discussing my last post with a friend and talking about a few other issues, we came to the conclusion that it would be worthwhile to discuss more gotchas.

First though, what is a gotcha? Wikipedia gives a good definition:

In programming, a gotcha is a feature of a system, a program or a programming language that works in the way it is documented but is counter-intuitive and almost invites mistakes because it is both enticingly easy to invoke and completely unexpected and/or unreasonable in its outcome.

So let’s start with “__del__ is not the opposite of __init__”.
If you come from c++ or a similar background, you are probably well versed in object oriented concepts, specifically, constructors and destructors. The usual expectation is to have the destructor called only for fully constructed objects – i.e., objects whose constructor returned without raising an exception.

If the constructor raises an exception, it is expected to “clean up after itself”, and not expect the destructor to run.

Since in Python __init__ is the de-facto constructor, and __del__ is considered the destructor, most people expect this line of reasoning to work with __init__ and __del__.
This is mistaken. __del__ is not the opposite of __init__, but rather of __new__. Which means that if __init__ raises an exception, then __del__ will still be called.
I’ve run into this issue myself several times in the past. Consider the following sample code:

class A(object):
	def __init__(self,x):
		if x == 0:
			raise Exception()
		self.x = x
	def __del__(self):
		print self.x

This code demonstrates a common case: a constructor that might fail, and a destructor that does something with the instance’s members. If you try to instantiate A with x = 0, you’ll get an exception. This is to be expected.
However, what is less expected is when the partially constructed A is garbage-collected (which may be anytime later, and not necessarily right away):

Exception exceptions.AttributeError: "'A' object has no attribute 'x'" in <bound
 method A.__del__ of <__main__.A object at 0x02449570>> ignored

What happened is that __del__ was called even though __init__ raised an exception. When __del__ tried to access self.x it got an attribute error, because it hasn’t been defined yet.

The solution?
1. Don’t use __del__ unless you really have to. I’m going to write a more about it soon.
2. If you do use __del__ make sure you are covered for any case in which __init__ didn’t finish running.

Posted in gotchas, Programming, Python | Tagged , , | 6 Comments

GeneratorExit: another reason to upgrade to Python 2.6

Ever heard of GeneratorExit? Unless you write generators, and do some unusual stuff with them, you probably haven’t encountered it.

GeneratorExit is a special exception that gets raised from within a generator when it is close()-ed.
For example, consider the following code:

def func():
	f = file('temp.txt','w')
	for i in range(10):
		f.write('%d\n' % i)
		yield i
	f.close()

When iterating over func(), when will f be closed? Obviously, when the generator is exhausted. What if f.write raised an exception? It’s better to make sure then (in py2.5, do from __future__ import with-statement):

def func():
    with file('temp.txt','w') as f:
        for i in range(10):
            f.write('%d\n' % i)
            yield i

(If you don’t know what with does: basically it wraps the code in a try-finally clause, and makes sure that f is closed in the finally).

This solution is clean enough, but there’s something interesting going on. What if we use func in the following manner?

for i in func():
    print i
    if i == 5:
        break

Well, the generator func() returned is no longer used, so eventually it will be cleaned by the gc. However, the code inside func is still executing – it is “blocking” on the yield statement!

To solve this issue cleanly, the yield statement will raise an exception when the generator returned is close()-ed.

What does this have to do with upgrading to Python 2.6?
It turns out that GeneratorExit is supposed to inherit from BaseException, just like KeyboardInterrupt and StopIteration, so that except Exception won’t catch it. However, in Python 2.5 it inherits from Exception, which means that the following sample code will behave strangely:

def func():
    with file('temp.txt','w') as f:
        for i in range(10):
            try:
                f.write('%d\n' % i)
                yield i
            except Exception:
                pass
 
In [13]: for i in func():
   ....:     print i
   ....:     if i == 5:
   ....:         break
   ....:
   ....:
0
1
2
3
4
5
Exception exceptions.RuntimeError: 'generator ignored GeneratorExit' in <generator object at 0x021BEAD0> ignored

The workaround is simple: before except Exception, catch GeneratorExit and handle it correctly. Or, upgrade to Python 2.6 instead :)

Posted in Programming, Python | Tagged , , , , | Leave a comment

Fast Peak Autocorrelation

So, I was at geekcon. It was a blast.
The lunar lander
There were many interesting projects, and I didn’t get to play with them all. I did get to work a bit on the Lunar Lander from last year, and this year it was finished successfully. My part was the PC game which interfaced with the microcontroller controlling the lander. As you probably guessed, it was written in Python.

This year, as promised, I worked on the Automatic Improviser. I worked on it with Ira Cherkes.

While the final version worked, it didn’t work well enough, and there is still much work to be done. Still, we had excellent progress.
By the way, I know this subject has been tackled before, and I still wanted to try it myself, without reading too much literature about it.

One of the components of the system is a beat recognizer. My idea to discover the beat is simple: find the envelope (similar to removing AM modulation), and then find the low “frequency” of the envelope.
Instead of doing a Fast Fourier Transform for beat recognition, we were advised that autocorellation will do the trick better and faster. However, when trying to autocorellate using scipy.signal.correlate we discovered that autocorellation was too slow for real time beat recognition, and certainly wasteful.

To solve this issue, we decided to first do peak detection on the envelope, and then autocorellate the peaks. Since there shouldn’t be too many peaks, this has the potential of being really quick. However, there was no standard function to do autocorellation of peaks, so we implemented it ourselves. We were pretty rushed, so we worked fast. Here’s the code:

def autocorrelate_peaks(peaks):
    peaks_dict = dict(peaks)
    indexes = set(peaks_dict.keys())
    deltas = set()
    for i in indexes:
        for j in indexes:
            if j>i:
                continue
            deltas.add(i-j)
 
    result = {}
    for d in deltas:
        moved = set(i+d for i in indexes)
        to_mult = moved & indexes
        assert to_mult <= indexes
        s = sum(peaks_dict[i-d]*peaks_dict[i] for i in to_mult)
        result[d] = s
    return result

This function takes as input a list of tuples, each of the form (peak_index, peak_value), and returns a mapping between non-zero corellation offsets and their values.
Here’s is a sample run:

In [7]: autocorrelate_peaks([(2, 2), (5,1), (7,3)])
Out[7]: {0: 14, 2: 3, 3: 2, 5: 6}

After implementing this function, our recognition loop was back to real-time, and we didn’t have to bother with optimizing again.

Posted in Algorithms, Math, Programming, Projects, Python, Utility Functions | Tagged , , , , , | Leave a comment

Preparing PyImprov for GeekCon on Friday

A long long time ago, I wrote Pytuner. It was one of the first projects I published on this website.
For a long time it just sat there, doing nothing, while the library it’s based on – PyMedia, wasn’t being maintained anymore, and PyTuner could only work on Python 2.4.

Enter GeekCon – GeekCon is a get together of people looking to work on their wildest fantasy projects – things that they don’t get do because of their regular work. Last yet I worked on a real life 3d lunar lander, and this year I thought I’d take the opportunity to work on PyImprov – my wild fantasy project.
The idea is simple – I’ll play some simple chords, and the script will improvise some blues solo. To that purpose, I wrote (also a long time ago) a chord recognizer. Now I’m missing a beat recognizer, and a simple improviser, which I plan to complete during the GeekCon weekend.

In preparation for the event, I’ve opened up a subversion repository for PyImprov on Assembla, and patched the scripts to work with PyAudio instead of PyMedia.

So, onwards to GeekCon, see you on the other side, with a guitar and a Python script in hand!

Posted in Programming, Projects, Python, Sound | Tagged , , , | 3 Comments