Python bug in __getslice__

While working with Python, a co-worker found a puzzling behavior… He couldn’t get correct slices on an object he created. After puzzling over the problem, we found out the problem – __getslice__ seems to change its arguments: in a 32 bit argument, when the MSB is set, the argument is changed to 0x7FFFFFFF. After looking it up, it turned out that the bug was also present in the latest Python 2.5. It also turned out that __getslice__ is deprecated. Here is the link to the bug info. This was the first time I got to submit a Python bug. Kind of a mixed feeling.

[Update: This bug was closed. You can still read about it in the original bug info.]


Fantasy books for programmers

I read these two books some years ago, before I was in high school. The concept really appealed to me. For fear of spoilers, I will not say much, suffice it to say reading the books gave me a different perspective on certain things, and some new ideas. The books themselves are not really complicated, don’t have the best characters around, but are a pretty solid read.  When I found them online some time ago, it took me just a few hours to go through one of them.

I recommend.

And I almost forgot… The links. Have fun.

Algorithms Programming Philosophy Python

Proofreading and what's wrong with the heapq module

Consider the following problem:

you have a large book you would like to proofread, with many chapters (100+) and a few men (4) at your disposal. How would you distribute the chapters among the men, considering that each proofreader must get whole chapters?

Well, the obvious solution is just divide the number of chapters by the number of men, and give each proofreader an appropriate number of (randomly picked?) chapters. But what if the chapters or of varying length? Well, you could just distribute them randomly, but that just doesn’t feel right now does it?

I was asked by a friend to help him write a script to solve this, and quite quickly I came up with the solution:

Sort the chapters according to length (number of words), in descending order. Keep the proofreaders in a sorted order. While there are still chapters left – get the next longest chapter, and give it to the proofreader with the least total length. Rearrange the proofreaders.

This algorithm is a greedy algorithm and is also regarded as the straightforward way to solve this problem.

Well, this seems simple enough – In the case where there are more then a few proofreaders – well, the obvious solution is to use a minimum heap. Using a minimum heap in python should be easy – just import the heapq module, use heapify and various other functions, just as you would use sort, and you are home free.

Not quite…

Turns out, that using the heapq module isn’t so easy as it seems, or at least not as quick and simple as I would like it to be. While you can sort list using your own compare function, and also providing a key-function, the heapq module sorts items using Python’s standard operators. This means, that to use heapq properly to sort an object according to some property, you have to write an object wrapper that has __lt__ and similar functions defined. All this coding could be saved, if heapq had a key argument that could be used, or any other method to control the comparisons.

And what about the proofreading you ask? Well, it worked. It divided the chapters quite evenly, although we did end up using sort() repeatedly, as the number of proofreaders was small and we did not want to overly complicate matters.

This again emphasizes a something I’ve come to discover – being able to write something quickly is usually more important then later being able to run it quickly. If you write something in an environment that allows for quick coding – later if speed is required you’ll be able to optimize it quickly, or even change the design or the basic algorithm.

Algorithms Math Programming Python

Interesting links – 3

How to Write a Spelling Corrector

This was a very interesting article, with some interesting python code, and a good mathematical explanation. I enjoyed reading it. I got to it through, which appeared on Daily-Python.

Robert Lang’s Origami Page

Fascinating page. A few months ago I was getting back to doing some Origami (it is a good relaxation method, also helps your wrists after typing too much :). I was trying to find ways to create origami creatures with more then two limbs, a head, and a tail, and I was fascinated by pictures of origami foldings of intricate creatures with many limbs, such as spiders and dinosaurs. I was also trying to formulate for myself some of the mathematical laws for origami. So I started to look into the subject and found out about crease patterns, and among many other sites, I got to Robert Lang’s site. The subjects he writes about are very diverse – and include using ‘origami knowledge’ to fold airbags, a program to create complicated crease patterns, an origami simulator (something I wanted to write myself :), and so on. Really, a fascinating read.

Algorithms Math

Interesting Links – 2


I first saw this term on The Daily WTF, in “Inner Platform Effect”. I quote: “They describe a frequently repeated problem in designing a commonly-occurring solution”. Recently I came upon them while going over wikipedia, and found that Anit-Patterns could be actually used as a teaching tool. Some patterns that were an interesting read for me are “Database as an IPC” and “Lava Flow”, mostly because I came close to them. It is interesting to note that programming is about naming things. When you successfully name a concept – it probably means you understand it. Going over this list might give you a frightning feeling of deja-vu. So read and take heed.

Catching Lions

Well, this is an old one… Not many people know, but the first mention of the subject was in the article by H. Petard, “A Contribution to the Mathematical Theory of Big Game Hunting”, that appeared in 1938. Since then many more articles were written on the subject. It is always amusing to note that the regular algorithm for searching a sorted array is often-times called “lion in the desert”, only because of this article. I actually wanted to look it up just because I came across this link, and near it were other old jokes, one of them being this old subject. This is usually regarded as the worst kind of humour :)

Programming Philosophy

Zen Programming – 1

To achieve correctness of code, you must first realize that your code is incorrect.

Too many times I’ve reviewed code that was written with only the obvious cases in mind. A small step towards writing better code is to think to yourself during (and even before!) coding: “what if?”.


Interesting Links – 1

Parallel Python

I’ve often desired a good transparent solution for “automated” distributed programming. This module seems to be quite useful a solution. With automatic detection of the number of available processors, works over networks and the usual Python ease of use, seems like a really good candidate.

Better Programming By Programming Better

That’s an interesting post, with even more links to other good reading material. I’ve often found that a good programmer is one that programs at home, out of interest, fun or whatnot. It is the kind of guy that will get a good feeling from finally getting some program to work and seeing the results, or from writing a very elegant piece of code, or maybe writing some really optimized assembly chunk. For me, it is usually just getting something to work. When I see something that I wrote produce a tangible result – well, this is one of the best. As a result of that, when I try to estimate how serious a programmer someone is, I usually ask, “What is your home project?”, or “What did you enjoy writing?”.

Bit Twiddling

A lot of games with bits. Wholesome fun for you and the family!



Default argument values in python are usually used in the following fashion:

def func(foo, bar=4):
    return foo*bar

However, if the default value for the function is a mutable object (such as a list), changing it will cause subsequent calls to the function behave differently. This behavior produced the following idiom:

def foo(arg = None):
    if arg is None:
        arg = []
    #modify arg...
    return arg

This idiom works quite well, unless None is a legitimate value for arg. consider the follwing:

class Vehicle(object):
    #other code goes here...
    def is_car(self, car_type = None):
        if car_type is None:
            return self._type == CAR
        return self._type == CAR and self.value == car_type

The method is_car has two different behaviors: If called without car_type, it will check whether self has _type CAR. Otherwise, it will also check that self has a value of the specific car_type. This will work quite well, but if None is a legitimate car_type, checking against it will produce surprising results – it will only check if self has _type CAR. To avoid this kind of bug, I propose using DefaultArg, a new singleton that will signify that a given argument is “the default argument” for the function. Here is the previous example, now using DefaultArg:

class Vehicle(object):
    #other code goes here...
    def is_car(self, car_type = DefaultArg):
        if car_type is DefaultArg:
            return self._type == CAR
        return self._type == CAR and self.value == car_type

This makes sense, because one should never set a variable to DefaultArg (unless through a function call). Also, tools like PyLint can easily check for this kind of mistakes. Note that DefaultArg addresses a specific problem – when the default argument changes the behavior of the function and that None is also a legitimate value for that argument. I would also like to note that DefaultArg was the solution for a real bug I encountred – and now I it use regularly.

Here is a simple implementation, using Daniel Brodie’s Singleton recipe:

class Singleton(type):
    def __init__(self, *args):
        type.__init__(self, *args)
        self._instances = {}

    def __call__(self, *args):
        if not args in self._instances:
            self._instances[args] = type.__call__(self, *args)
        return self._instances[args]

class _DefaultArg(object):
    def __init__(self, *args): pass

    def __eq__(self, other):
        return isinstance(other, _DefaultArg)

    def __ne__(self, other):
        return not self == other

DefaultArg = _DefaultArg()

For the interested reader – Tomer Filiba proposed a different solution (to a slightly different problem I belive), using the “copydefaults” decorator.


New Blog

So I decided to start this new blog. Let’s see where it goes.