Algorithm.co.il

Small Python Utility Functions

Posted on 2007-09-05 by lorg 3 Comments

While working with Gil Dabach on Distorm3, I found out that I’ve been missing a lot of utility functions. I’m going to write about some of them now.

def classify_to_dict(seq, key_func):
    result = {}
    for item in seq:
        key = key_func(item)
        if key in result:
            result[key].append(item)
        else:
            result[key] = [item]
    return result

This is a really simple function, but a very powerful one: it groups the items of a sequence into a dict of lists, keyed by key_func(item). Here is a short demonstration:

>>> base_tools.classify_to_dict(range(5), lambda k: k%2)
{0: [0, 2, 4], 1: [1, 3]}

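Note the similarity to itertools.groupby, and the difference: groupby only groups consecutive items that share a key, so its input usually has to be sorted by that key first, while classify_to_dict collects every item with the same key no matter where it appears in the sequence.
If any of you can suggest a better name than classify_to_dict (maybe just classify?), I’ll be happy to hear it.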
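
To make the comparison with groupby concrete, here is a small sketch in Python 3. The word list and the first_letter helper are made up for illustration, and classify_to_dict is repeated from above so the snippet runs on its own:

```python
from itertools import groupby

def classify_to_dict(seq, key_func):
    # Same function as above, repeated so this snippet is self-contained.
    result = {}
    for item in seq:
        key = key_func(item)
        if key in result:
            result[key].append(item)
        else:
            result[key] = [item]
    return result

def first_letter(word):
    # Hypothetical key function, used only for this example.
    return word[0]

words = ["apple", "bob", "avocado", "banana", "cherry"]

# groupby starts a new group whenever the key changes, so on unsorted
# input the same key shows up more than once.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, first_letter)]

# Sorting by the key first gives one group per key.
sorted_groups = [
    (k, list(g))
    for k, g in groupby(sorted(words, key=first_letter), first_letter)
]

# classify_to_dict needs no sorting and returns a dict directly.
classified = classify_to_dict(words, first_letter)
print(classified)
# {'a': ['apple', 'avocado'], 'b': ['bob', 'banana'], 'c': ['cherry']}
```

On this input, the unsorted groupby call yields five groups (one per word), while the sorted call and classify_to_dict both yield three.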

Some other very useful functions include arg_min and arg_max, which respectively return the index of the minimum and maximum element in a sequence. I also like to use union and intersection, which behave just like sum, but instead of using the + operator, use the | and & operators respectively. This is most useful for sets, and on some rare occasions, for numbers.
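
The post doesn't show these four functions, so the following is only one possible sketch of them in Python 3, not the original code. The choices to return the first index on ties and to raise on an empty sequence (rather than pick a default the way sum does) are assumptions made here:

```python
from functools import reduce
import operator

def arg_max(seq):
    """Return the index of the largest element (the first one on ties)."""
    items = list(seq)
    return max(range(len(items)), key=items.__getitem__)

def arg_min(seq):
    """Return the index of the smallest element (the first one on ties)."""
    items = list(seq)
    return min(range(len(items)), key=items.__getitem__)

def union(seq):
    """Fold the items of seq together with |, the way sum does with +."""
    return reduce(operator.or_, seq)

def intersection(seq):
    """Fold the items of seq together with &."""
    return reduce(operator.and_, seq)

print(arg_max([3, 9, 2]))                   # 1
print(arg_min([3, 9, 2]))                   # 2
print(union([{1, 2}, {2, 3}]))              # {1, 2, 3}
print(intersection([{1, 2}, {2, 3}]))       # {2}
print(union([1, 2, 4]))                     # 7 (bitwise or on numbers)
```

With an empty sequence, arg_max and arg_min raise ValueError and union and intersection raise TypeError in this sketch.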

Another function I like (but rarely use) is unzip. I know, I know: paddy mentioned it is pretty obvious, and it could just as well be called transpose. Still, I find unzip(seq) a better choice than the much less obvious zip(*seq). Readability counts.
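
For reference, a minimal sketch of unzip in Python 3, written as a thin wrapper around zip(*seq). The list() call and the sample pairs are assumptions for illustration, since in Python 3 zip returns an iterator:

```python
def unzip(seq):
    """Transpose a sequence of tuples; the inverse of zip."""
    return list(zip(*seq))

pairs = [(1, "a"), (2, "b"), (3, "c")]
numbers, letters = unzip(pairs)
print(numbers)   # (1, 2, 3)
print(letters)   # ('a', 'b', 'c')
```

Zipping the results back together, as in list(zip(numbers, letters)), gives the original pairs again.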

What are your favorite utility functions?

A minor update: I’ve incorporated the use of a syntax highlighter now, and it should be enabled for comments as well by the next challenge (which will be soon enough).

Filed under: Programming, Python, Utility Functions

Comments

  1. Erez says

    2007-09-06 at 3:56 pm

    I would choose index_of_max over arg_max. I see no connection to arguments. Same with min.

    I think you could pick a better example to show the strength of classify_to_dict.
    Like:
    >>> classify_to_dict(["a", "bcd", "def", "qwerty"], len)
    {1: ['a'], 3: ['bcd', 'def'], 6: ['qwerty']}

    Or:
    >>> classify_to_dict([1, 'hello', (2, 3)], type)
    {: [1], : ['hello'], : [(2, 3)]}

    Also, I think implementation would be better if you caught KeyError instead of checking __contains__.

  2. Erez says

    2007-09-06 at 3:57 pm

    Heh, ruined by the use of HTML symbols. Just run the last example and see :-)

  3. lorg says

    2007-09-06 at 5:34 pm

    You are right that for most people index_of_max is better and more meaningful than arg_max. However, arg_max is the name of the mathematical function that does the same thing. Readability counts – you win :)

© 2023 Algorithm.co.il - Algorithms, for the heck of it