Categories

## Python bug in __getslice__

While working with Python, a co-worker found a puzzling behavior… He couldn’t get correct slices on an object he created. After puzzling over the problem, we found out the problem – __getslice__ seems to change its arguments: in a 32 bit argument, when the MSB is set, the argument is changed to 0x7FFFFFFF. After looking it up, it turned out that the bug was also present in the latest Python 2.5. It also turned out that __getslice__ is deprecated. Here is the link to the bug info. This was the first time I got to submit a Python bug. Kind of a mixed feeling.

[Update: This bug was closed. You can still read about it in the original bug info.]

Categories

## Proofreading and what's wrong with the heapq module

Consider the following problem:

you have a large book you would like to proofread, with many chapters (100+) and a few men (4) at your disposal. How would you distribute the chapters among the men, considering that each proofreader must get whole chapters?

Well, the obvious solution is just divide the number of chapters by the number of men, and give each proofreader an appropriate number of (randomly picked?) chapters. But what if the chapters or of varying length? Well, you could just distribute them randomly, but that just doesn’t feel right now does it?

I was asked by a friend to help him write a script to solve this, and quite quickly I came up with the solution:

Sort the chapters according to length (number of words), in descending order. Keep the proofreaders in a sorted order. While there are still chapters left – get the next longest chapter, and give it to the proofreader with the least total length. Rearrange the proofreaders.

This algorithm is a greedy algorithm and is also regarded as the straightforward way to solve this problem.

Well, this seems simple enough – In the case where there are more then a few proofreaders – well, the obvious solution is to use a minimum heap. Using a minimum heap in python should be easy – just import the heapq module, use heapify and various other functions, just as you would use sort, and you are home free.

Not quite…

Turns out, that using the heapq module isn’t so easy as it seems, or at least not as quick and simple as I would like it to be. While you can sort list using your own compare function, and also providing a key-function, the heapq module sorts items using Python’s standard operators. This means, that to use heapq properly to sort an object according to some property, you have to write an object wrapper that has __lt__ and similar functions defined. All this coding could be saved, if heapq had a key argument that could be used, or any other method to control the comparisons.

And what about the proofreading you ask? Well, it worked. It divided the chapters quite evenly, although we did end up using sort() repeatedly, as the number of proofreaders was small and we did not want to overly complicate matters.

This again emphasizes a something I’ve come to discover – being able to write something quickly is usually more important then later being able to run it quickly. If you write something in an environment that allows for quick coding – later if speed is required you’ll be able to optimize it quickly, or even change the design or the basic algorithm.

Categories

How to Write a Spelling Corrector

This was a very interesting article, with some interesting python code, and a good mathematical explanation. I enjoyed reading it. I got to it through Peterbe.com, which appeared on Daily-Python.

Robert Lang’s Origami Page

Fascinating page. A few months ago I was getting back to doing some Origami (it is a good relaxation method, also helps your wrists after typing too much :). I was trying to find ways to create origami creatures with more then two limbs, a head, and a tail, and I was fascinated by pictures of origami foldings of intricate creatures with many limbs, such as spiders and dinosaurs. I was also trying to formulate for myself some of the mathematical laws for origami. So I started to look into the subject and found out about crease patterns, and among many other sites, I got to Robert Lang’s site. The subjects he writes about are very diverse – and include using ‘origami knowledge’ to fold airbags, a program to create complicated crease patterns, an origami simulator (something I wanted to write myself :), and so on. Really, a fascinating read.

Categories

Parallel Python

I’ve often desired a good transparent solution for “automated” distributed programming. This module seems to be quite useful a solution. With automatic detection of the number of available processors, works over networks and the usual Python ease of use, seems like a really good candidate.

Better Programming By Programming Better

That’s an interesting post, with even more links to other good reading material. I’ve often found that a good programmer is one that programs at home, out of interest, fun or whatnot. It is the kind of guy that will get a good feeling from finally getting some program to work and seeing the results, or from writing a very elegant piece of code, or maybe writing some really optimized assembly chunk. For me, it is usually just getting something to work. When I see something that I wrote produce a tangible result – well, this is one of the best. As a result of that, when I try to estimate how serious a programmer someone is, I usually ask, “What is your home project?”, or “What did you enjoy writing?”.

Bit Twiddling

A lot of games with bits. Wholesome fun for you and the family!

Categories

## DefaultArg

Default argument values in python are usually used in the following fashion:

```def func(foo, bar=4):
return foo*bar```

However, if the default value for the function is a mutable object (such as a list), changing it will cause subsequent calls to the function behave differently. This behavior produced the following idiom:

```def foo(arg = None):
if arg is None:
arg = []
#modify arg...
return arg```

This idiom works quite well, unless None is a legitimate value for `arg`. consider the follwing:

```class Vehicle(object):
#other code goes here...
def is_car(self, car_type = None):
if car_type is None:
return self._type == CAR
return self._type == CAR and self.value == car_type```

The method is_car has two different behaviors: If called without car_type, it will check whether self has _type CAR. Otherwise, it will also check that self has a value of the specific car_type. This will work quite well, but if None is a legitimate car_type, checking against it will produce surprising results – it will only check if self has _type CAR. To avoid this kind of bug, I propose using DefaultArg, a new singleton that will signify that a given argument is “the default argument” for the function. Here is the previous example, now using DefaultArg:

```class Vehicle(object):
#other code goes here...
def is_car(self, car_type = DefaultArg):
if car_type is DefaultArg:
return self._type == CAR
return self._type == CAR and self.value == car_type```

This makes sense, because one should never set a variable to DefaultArg (unless through a function call). Also, tools like PyLint can easily check for this kind of mistakes. Note that DefaultArg addresses a specific problem – when the default argument changes the behavior of the function and that None is also a legitimate value for that argument. I would also like to note that DefaultArg was the solution for a real bug I encountred – and now I it use regularly.

Here is a simple implementation, using Daniel Brodie’s Singleton recipe:

```class Singleton(type):
def __init__(self, *args):
type.__init__(self, *args)
self._instances = {}

def __call__(self, *args):
if not args in self._instances:
self._instances[args] = type.__call__(self, *args)
return self._instances[args]

class _DefaultArg(object):
__metaclass__=Singleton
def __init__(self, *args): pass

def __eq__(self, other):
return isinstance(other, _DefaultArg)

def __ne__(self, other):
return not self == other

DefaultArg = _DefaultArg()```

For the interested reader – Tomer Filiba proposed a different solution (to a slightly different problem I belive), using the “copydefaults” decorator.