Categories
Challenges Programming Python Statistics

Small programming challenge No. 6 – nblocks

I came up with this challenge when I had to write a function to divide a sequence into percentiles. I needed this to calculate some statistics over trips for plnnr.com. “This sounds trivial,” I thought, and reached for my simple blocks function:

def blocks(seq, block_len):
    """blocks(range(5), 2) -> [[0, 1], [2, 3], [4]]"""
    seq_len = len(seq)
    # Floor division (//) keeps the block count an integer in Python 2 and 3.
    if seq_len % block_len == 0:
        num_blocks = seq_len // block_len
    else:
        # Leftover items go into one extra, partial block.
        num_blocks = 1 + (seq_len // block_len)

    result = [[] for _ in range(num_blocks)]
    for idx, obj in enumerate(seq):
        result[idx // block_len].append(obj)
    return result

So I set block_len to len(seq)/10 and called blocks(seq, block_len). Unfortunately, according to the docs of blocks (which I wrote…), when there is extra data, a “partial” block is added – which is exactly what we don’t want when calculating percentiles.
Instead, the behavior we want is nblocks(seq, number_of_blocks), for example: nblocks(range(10), 3) -> [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]].
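To make the mismatch concrete, here is a small sketch. It uses a compact slicing-based stand-in for blocks (the name blocks_by_len is hypothetical, and it assumes the input supports slicing); the 25-item sequence is chosen only for illustration:

```python
def blocks_by_len(seq, block_len):
    """Hypothetical stand-in for blocks(): fixed-length slices plus a partial last one."""
    return [list(seq[i:i + block_len]) for i in range(0, len(seq), block_len)]


seq = list(range(25))
block_len = len(seq) // 10          # 2
result = blocks_by_len(seq, block_len)

print(len(result))                  # 13 blocks, not the 10 we asked for
print(result[-1])                   # [24], the partial block
```

Fixing the block length only fixes the number of blocks when the length of the sequence happens to divide evenly.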

This function is surprisingly hard to write, or rather, harder than you’d expect. I’ll be especially glad if someone writes it elegantly. My solution works well enough, but isn’t the prettiest.

So, you have the definition – let’s see if you can do better than me. Once enough solutions are presented, I will present my own.

IMPORTANT EDIT: the required signature is nblocks(seq, num_blocks). So for nblocks(range(10), 3), the return value should be 3 blocks, with the last one having an extra item. As a general rule, the extra items should be spread as evenly as possible.
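For readers who want a starting point, here is one possible sketch. It is not the solution promised above, just an illustration under two assumptions: “as evenly as possible” is taken to mean that block sizes differ by at most one, and the input is copied into a list so that it can be sliced. Block boundaries are computed with integer arithmetic, so where the larger blocks land is a by-product of rounding rather than a deliberate choice.

```python
def nblocks(seq, num_blocks):
    """Split seq into num_blocks contiguous blocks whose sizes differ by at most one.

    Block i covers items[i * n // num_blocks : (i + 1) * n // num_blocks].
    """
    items = list(seq)               # assumption: copying the input is acceptable
    n = len(items)
    # num_blocks + 1 boundaries, from 0 up to n, rounded down to integers.
    bounds = [i * n // num_blocks for i in range(num_blocks + 1)]
    return [items[lo:hi] for lo, hi in zip(bounds, bounds[1:])]


print(nblocks(range(10), 3))        # [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
```

With this rounding, nblocks(range(11), 3) yields blocks of sizes 3, 4 and 4. If there are fewer items than blocks, some blocks come out empty.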

Categories
Projects Statistics

The Art and Science of Pulling Numbers Out of Your Sleeve

About a year or so ago, I was reading R. V. Jones’ excellent book ‘Most Secret War’. One of the stories I remembered and told my colleagues about was how Jones estimated the rocket production capabilities of the Germans. He did so after looking at an aerial photograph of a rocket fuel shed. One of my colleagues then told me of a course she took at the Hebrew University of Jerusalem that teaches how to make such estimates. I let it go at the time, and just remembered that there is such a course.

A few days ago, I met up with this colleague, and I was reminded of this course. You see, right now I’m collecting data for a project I’m doing with a friend, and I needed estimation know-how. So I asked her the name of the course, and she gave it to me. Two Google searches later and I had the English name of the course, “Order of Magnitude in Physics Problems”, and a textbook to look at. I just finished reading the introduction, and I know that I’m going to read the rest of it too. Not just for my current project – but because it’s such a good skill to have.

The book opens with a problem:

We dedicate the first example to physicists who need employment outside of physics. […] How much money is there in a fully loaded Brinks armored car?

The book then goes on to show how to answer such a question intelligently. I’m hooked.

Categories
computer science Math Programming Python Statistics Utility Functions

Solution for the Random Selection Challenge

A few days ago, I wrote up two small Python Challenges. Several people have presented solutions for the first challenge, and I also posted my solution in the comments there.

However, the second challenge remained unsolved, and I will present a solution for it in this post.