
10 Awesome Theorems & Results

When I look back at various mathematical courses I took, most have at least one theorem that I really liked. Usually I like it because the proof has a surprising trick, sometimes it’s because of the unexpected conclusion, or maybe the unintuitive feel to it. In other cases it’s just the elegance of the proof, or the result itself.
Without further ado, here’s a selection of my favorite theorems, in no particular order:

1. Linear Algebra: the Cayley-Hamilton theorem. When I first grokked the fact that you can substitute matrices for the variables in polynomials, I was awestruck. Then I learned that you can define e^A using a Taylor series, but the fun doesn’t stop there. Using the eigenvalues you can greatly simplify the calculation, and it all “works out the same” (i.e., if A = P^-1 D P and D is diagonal, then p(A) = P^-1 p(D) P; this also works for Jordan forms). Also, since you can show that the complex numbers are isomorphic to the 2×2 matrices of the form [[a, b], [-b, a]], and that the calculations are exactly the same – well, everything “fell into place” for me. At the time it seemed to be one of the joys of mathematics.
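As a quick sanity check of the theorem (a modern numpy sketch of my own, not anything from that course): substituting a matrix into its own characteristic polynomial should give the zero matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
# for a 2x2 matrix, the characteristic polynomial is p(t) = t^2 - trace(A)*t + det(A)
p_of_A = A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2)
# Cayley-Hamilton: A satisfies its own characteristic polynomial
assert np.allclose(p_of_A, 0)
```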

2. Calculus: the Bolzano-Weierstrass Theorem. One of the first non-trivial results you learn in calculus. I originally learned the version that says: “Every bounded infinite set has a limit point”, and its proof was a bit more elegant in my eyes than the proof of the Wikipedia version. I liked it so much that one time when I was in boot camp in the service, I worked it out again just to keep my mind working. Good times.

3. Probability: the elegant law of total variance, Var(X) = E[Var(X|Y)] + Var(E[X|Y]). Just the sight of it makes one sigh with contentment, and the result itself is very nice.
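The identity is easy to verify numerically on a tiny joint distribution (a toy example of my own, not from the course):

```python
# a small joint distribution: (x, y) pairs, each equally likely
samples = [(0, 0), (2, 0), (4, 0), (1, 1), (3, 1), (5, 1)]
xs = [x for x, _ in samples]

def mean(vs):
    return sum(vs) / float(len(vs))

def var(vs):  # population variance
    m = mean(vs)
    return sum((v - m) ** 2 for v in vs) / float(len(vs))

# group the x values by the value of y
groups = {}
for x, y in samples:
    groups.setdefault(y, []).append(x)
p = {y: len(g) / float(len(samples)) for y, g in groups.items()}

e_of_var = sum(p[y] * var(g) for y, g in groups.items())          # E[Var(X|Y)]
m = mean(xs)
var_of_e = sum(p[y] * (mean(g) - m) ** 2 for y, g in groups.items())  # Var(E[X|Y])

assert abs(var(xs) - (e_of_var + var_of_e)) < 1e-12
```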

4. Calculus, again: Stokes’ theorem and its friends. Very useful and non-intuitive; in layman’s terms, it says that you can reason about what happens inside a region just by knowing what happens on its boundary.

5. Numerical Analysis: Richardson Extrapolation: one of the most elegant forms of bootstrapping, you start with a simple approximation method as a building block, and at the end you get a very strong high-quality approximation.

6. Computability: the Parameter theorem (also known as the s-m-n theorem). Especially elegant, it essentially gives the mathematical definition of the “bind” operation on function parameters. In simple terms, it uses the source code of a function f(x, y) to find the source code of a function g(y) such that g(y) = f(a, y) for a given a. The nice thing about it is that it works only on source code, without calling the functions themselves.
This theorem had the added bonus that once I grokked it, the test in computability was very, very easy :)
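In modern Python terms, functools.partial is the everyday analogue of that “bind” – with the important difference that it closes over a live function object rather than manipulating source code, so it’s only an analogy:

```python
from functools import partial

def f(a, y):
    return a * 10 + y

g = partial(f, 4)  # fix a = 4, producing a one-argument function
assert g(2) == f(4, 2) == 42
```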

7. Functional Analysis: here it’s a relatively minor result that I ended up remembering distinctly: given z1, …, zn that are linearly independent in a normed space E, show that there exists a d > 0 such that any w1, …, wn satisfying ||wi − zi|| < d for each i are also linearly independent. The footnote says that such a finite, linearly independent set is called stable. When visualizing it, I think of it this way: given such a set, kick it. As long as you don’t kick it too hard – it will stay linearly independent. Now that’s stable.

8. Mathematical Logic: The Compactness theorem: “a set of first-order sentences has a model if and only if every finite subset of it has a model”. One direction is almost trivial, but the other is deep. When studying for the test in this course, I remember being stuck for days on an exercise that required the use of this theorem. Once I fully understood the method of its use, it became a favorite.
(By the way, the exercise was the following: let G be a countable set of first-order statements and p a first-order statement. Show that if p is true in every countable model of G, then G ⊨ p.)

9. Cryptography: I’d learned a bit of cryptography on my own before taking the cryptography course. When I did take it, though, two methods were especially memorable. The first was the “Meet in the Middle” attack. Not to be confused with “Man in the Middle”, this method allows one to attack symmetric ciphers constructed by repeatedly applying a simpler cipher. This known-plaintext attack got its name from its method of operation: the attacker calculates all possible decryptions of the ciphertext and stores them in a lookup table. Then, he calculates all encryptions of the plaintext and looks them up in that table. Once a match is found – the combination of the encryption and decryption keys used is the full key of the composed cipher.
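Here is a toy sketch of the attack on a made-up one-byte cipher (toy_encrypt and the 8-bit key space are my own illustration, not a real cipher):

```python
def toy_encrypt(key, byte):
    # a made-up, invertible one-byte cipher, standing in for the "simpler cipher"
    return ((byte ^ key) + key) % 256

def toy_decrypt(key, byte):
    return ((byte - key) % 256) ^ key

def meet_in_the_middle(plaintext, ciphertext):
    # store every single-key decryption of the ciphertext in a lookup table
    table = {}
    for k2 in range(256):
        table.setdefault(toy_decrypt(k2, ciphertext), []).append(k2)
    # then compute every single-key encryption of the plaintext and look it up
    candidates = []
    for k1 in range(256):
        middle = toy_encrypt(k1, plaintext)
        for k2 in table.get(middle, []):
            candidates.append((k1, k2))
    return candidates

ciphertext = toy_encrypt(99, toy_encrypt(17, 42))
pairs = meet_in_the_middle(42, ciphertext)
assert (17, 99) in pairs  # the true key pair is among the candidates
```

The point of the technique is the cost: roughly 2 × 2^8 cipher operations instead of the 2^16 a brute-force search over both keys would need.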

10. The second cryptography result that I liked was secret sharing. Trivial secret sharing is so simple, and yet effective, that when I first learned it I thought: “how come I didn’t think of this before?”.
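Trivial (XOR-based) secret sharing really is just a few lines – here’s a sketch, with helper names of my own choosing:

```python
import os
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(secret, n_shares):
    # n-1 shares are pure randomness; the last makes the XOR of all shares equal the secret
    shares = [os.urandom(len(secret)) for _ in range(n_shares - 1)]
    shares.append(reduce(xor_bytes, shares, secret))
    return shares

def combine_shares(shares):
    return reduce(xor_bytes, shares)

secret = b"attack at dawn"
assert combine_shares(split_secret(secret, 5)) == secret
```

Any proper subset of the shares is statistically independent of the secret, which is why the scheme works despite its simplicity.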

There are obviously many more elegant theorems, some of which I’ve learned in my studies. I sure hope to learn a few more. Still, these are special. As a high-school math teacher once told us about the Pythagorean theorem: “I want you to remember the proof even if I wake you in the middle of the night”. The theorems in this short list come close to that ideal.

Now I wonder – what are your favorite theorems?


Visualizing Data Using the Hilbert Curve

Some time ago, a coworker asked me to help him visualize some data. He had a very long series (many millions) of data points, and he thought that plotting a pixel for each one would visualize it well, so he asked for my help.
I installed Python & PIL on his machine, and not too long after, he had the image plotted. The script looked something like:

```
data_points = get_data_points()
n = int((len(data_points)**0.5) + 0.5)

image = Image.new('1', (n, n))
for idx, pt in enumerate(data_points):
    image.putpixel((idx / n, idx % n), pt)
image.save('bla.png', 'png')
```

Easy enough to do. Well, easy enough if you have enough memory to handle very large data sets. Luckily, we had just enough memory for this script and data series, and we were happy. The image was generated, and everything worked fine.

Still, we wanted to improve on that. One problem with this visualization is that two horizontally adjacent pixels don’t have anything to do with each other. Remembering xkcd’s “Map of the Internet”, I decided to use the Hilbert curve. I started with Wikipedia’s version of the code for the Python turtle and changed it to generate a string of instructions of where to put pixels. Along the way I improved the time complexity by changing it to have only two recursive calls instead of four. (It can probably be taken down to one, by the way – I leave that as a challenge to the reader :)

Unfortunately, at this point we didn’t have enough memory to hold all of those instructions, so I changed it into a generator. Now it was too slow. I cached the lower levels of the recursion, and now it worked in reasonable time (about 3-5 minutes), with reasonable memory requirements (no OutOfMemory exceptions). Of course, I’m skipping a bit of exceptions and debugging along the way. Still, it was relatively straightforward.

Writing the generator wasn’t enough – there were still pixels to draw! It took a few more minutes to write a simple “turtle” that walks the generated Hilbert curve.

```
hilbert = Hilbert(int(math.log(len(data_points), 4) + 0.5))
for pt in data_points:
    x, y = hilbert.next()
    image.putpixel((x, y), pt)
```

A few minutes later, the image was generated.
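The Hilbert class itself isn’t shown here, but for the curious, the standard iterative index-to-coordinate mapping (the d2xy routine from Wikipedia, adapted to Python – not the recursive generator described above) looks roughly like this:

```python
def d2xy(n, d):
    """Map index d along the Hilbert curve to (x, y) in an n-by-n grid (n a power of 2)."""
    x = y = 0
    s = 1
    while s < n:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:  # rotate the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y

# consecutive indices land on adjacent pixels - exactly the locality we wanted
points = [d2xy(8, d) for d in range(64)]
assert len(set(points)) == 64
assert all(abs(x1 - x2) + abs(y1 - y2) == 1
           for (x1, y1), (x2, y2) in zip(points, points[1:]))
```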


Fast Peak Autocorrelation

So, I was at geekcon. It was a blast.

There were many interesting projects, and I didn’t get to play with them all. I did get to work a bit on the Lunar Lander from last year, and this year it was finished successfully. My part was the PC game which interfaced with the microcontroller controlling the lander. As you probably guessed, it was written in Python.

This year, as promised, I worked on the Automatic Improviser. I worked on it with Ira Cherkes.

While the final version worked, it didn’t work well enough, and there is still much work to be done. Still, we had excellent progress.
By the way, I know this subject has been tackled before, and I still wanted to try it myself, without reading too much literature about it.

One of the components of the system is a beat recognizer. My idea to discover the beat is simple: find the envelope (similar to removing AM modulation), and then find the low “frequency” of the envelope.
Instead of doing a Fast Fourier Transform for beat recognition, we were advised that autocorrelation would do the trick better and faster. However, when trying to autocorrelate using scipy.signal.correlate we discovered that autocorrelation was too slow for real-time beat recognition, and certainly wasteful.

To solve this issue, we decided to first do peak detection on the envelope, and then autocorrelate the peaks. Since there shouldn’t be too many peaks, this has the potential of being really quick. However, there was no standard function to do autocorrelation of peaks, so we implemented it ourselves. We were pretty rushed, so we worked fast. Here’s the code:

```
def autocorrelate_peaks(peaks):
    peaks_dict = dict(peaks)
    indexes = set(peaks_dict.keys())
    deltas = set()
    for i in indexes:
        for j in indexes:
            if j > i:
                continue
            deltas.add(i-j)

    result = {}
    for d in deltas:
        moved = set(i+d for i in indexes)
        to_mult = moved & indexes
        assert to_mult <= indexes
        s = sum(peaks_dict[i-d]*peaks_dict[i] for i in to_mult)
        result[d] = s
    return result
```

This function takes as input a list of tuples, each of the form (peak_index, peak_value), and returns a mapping between non-zero correlation offsets and their values.
Here’s a sample run:

```
In [7]: autocorrelate_peaks([(2, 2), (5, 1), (7, 3)])
Out[7]: {0: 14, 2: 3, 3: 2, 5: 6}
```

After implementing this function, our recognition loop was back to real-time, and we didn’t have to bother with optimizing again.


Checking the ulam spiral

In the following post, the Ulam spiral is described. It’s a very simple object – write down consecutive natural numbers starting from 41 in a square spiral. Curiously, the numbers on the diagonal are primes.

Reading this post, I immediately wanted to check how long this property holds. The original blog post suggests:

Remarkably, all the numbers on the red diagonal are prime — even when the spiral is continued into a 20 × 20 square.

Yet it doesn’t mention whether it stops there or not. Without peeking at Wikipedia, I wrote a simple program to check:

```
import sys
import psyco
psyco.full()

UP = 0
LEFT = 1
DOWN = 2
RIGHT = 3

DIRECTIONS = {UP: (0, -1), LEFT: (-1, 0), DOWN: (0, 1), RIGHT: (1, 0)}

def generate_ulam_seq(n):
    start = 41
    x = 0
    y = 0
    min_x = 0
    max_x = 0
    min_y = 0
    max_y = 0
    value = start
    direction = UP
    for i in xrange(n):
        yield x, y, value
        value += 1
        add_x, add_y = DIRECTIONS[direction]
        x, y = x + add_x, y + add_y
        if x < min_x:
            direction = (direction+1) % 4
            min_x = x
        if x > max_x:
            direction = (direction+1) % 4
            max_x = x
        if y < min_y:
            direction = (direction+1) % 4
            min_y = y
        if y > max_y:
            direction = (direction+1) % 4
            max_y = y

def is_prime(n):
    return all(n%x != 0 for x in xrange(2, int((n**0.5) + 1)))

def main():
    for x, y, v in generate_ulam_seq(int(sys.argv[1])):
        if x == y and not is_prime(v):
            print x, y, v

if __name__ == '__main__':
    main()
```

Running it, we get:

```
> ulam.py 10000
20 20 1681
-21 -21 1763
22 22 2021
-25 -25 2491
28 28 3233
-33 -33 4331
38 38 5893
-41 -41 6683
41 41 6847
42 42 7181
-44 -44 7697
-45 -45 8051
-46 -46 8413
48 48 9353
```

So the property doesn’t hold for very long. Reading up on Wikipedia, it seems the Ulam spiral still has interesting prime-related properties at larger sizes. Maybe I’ll plot that one as well in a future post.

PS: Regarding primality testing, this is a cute, quick-and-dirty one-liner. In the range of numbers we’re discussing, it is ‘good enough’.


Fractals in 10 minutes No. 6: Turtle Snowflake

I didn’t write this one, but I found its simplicity and availability so compelling, I couldn’t just not write about it.
In a reddit post from a while ago, some commenter named jfedor left the following comment:

A little known fact is that you can do the following on any standard Python installation:

```
from turtle import *

def f(length, depth):
    if depth == 0:
        forward(length)
    else:
        f(length/3, depth-1)
        right(60)
        f(length/3, depth-1)
        left(120)
        f(length/3, depth-1)
        right(60)
        f(length/3, depth-1)

f(500, 4)
```

If you copy-paste it, it’s a fractal in less than a minute. If you type it yourself, it’s still less than ten. And it’s something you can show a kid. I really liked this one.


Computing Large Determinants in Python

Story:
For my seminar work, I had to calculate the determinant of a large integer matrix.
In this case, large meant n>=500. You might say that this isn’t very large and I would agree. However, it is large enough to overflow a float and reach +inf. Reaching +inf is bad: once you do, you lose all the accuracy of the computation. Also, since determinant calculations use addition as well as subtraction, if +inf is reached in the middle of the calculation the value might never drop back, even if the determinant is a “small” number.

(Note: Even if the calculation hadn’t reached +inf, but just some very large floating point number, the same phenomenon could have occurred.)
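To see the failure mode concretely (a two-line illustration of my own, nothing specific to determinants):

```python
big = 1e308
overflowed = big * 10  # exceeds the double-precision range
assert overflowed == float('inf')
# once at +inf, subtracting even a huge value never brings it back:
assert overflowed - big == float('inf')
```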

As I said, my matrix was composed of integers. Since a determinant is a sum of multiples of matrix elements, you would expect such a determinant to be an integer as well. It would be nice to take advantage of that fact.

How should one compute such a determinant? Well, luckily Python has support for arbitrarily large integers. Let’s compute the determinant with them!
Unfortunately, Python’s numpy computes determinants using LU decomposition, which uses division – hence, floating point values.
Well, never mind, I know how to calculate a determinant, right?
Ten minutes later the naive determinant calculation by minors was implemented, with one “minor” setback – it takes O(n!) time.
Googling for integer determinants yielded some articles about division-free algorithms, which weren’t really easy to implement. There was one suggestion about using rational numbers and Gaussian elimination, with pivots chosen to tame the growth of the numerator and denominator.

So, I whipped up a rational number class, using Python’s arbitrarily large integers as numerator and denominator. Then, I wrote some code to do Gaussian elimination, although my pivoting wasn’t the fanciest. This seemed to work, and I got my desired large determinants.
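With Python 2.6+ you can get the same effect from the standard fractions module. Here is a minimal sketch of an exact Gaussian-elimination determinant (my own reconstruction, not the seminar code, and with the plainest possible pivoting):

```python
from fractions import Fraction

def exact_det(matrix):
    # exact determinant of an integer matrix via Gaussian elimination over the rationals
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    sign = 1
    for col in range(n):
        # any non-zero pivot keeps the arithmetic exact
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign  # a row swap flips the determinant's sign
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]  # product of the pivots
    return result

assert exact_det([[1, 2], [3, 4]]) == -2
assert exact_det([[2, 0, 0], [0, 3, 0], [0, 0, 5]]) == 30
```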

Epilogue:
Not too long ago, I ran across mpmath, a Python library for arbitrary-precision floating point arithmetic. It is possible I could have used it instead of my own rational numbers. Next time I run into a relevant problem I’ll keep this library in mind.
Also, an even shorter while ago, I became aware that since 2.6, Python boasts a fractions module. This will surely come in handy in the future.

Code:
Available here. If you intend to use it in Python 2.6 and above, I suggest replacing my Rational class with Python’s Fraction.
(Note that I adapted this module to my needs. Therefore it has a timer that prints time estimations for computations, and it uses Psyco.)


Fractals in 10 minutes No. 5: Sierpinski Chaos Game

A few years back, I was in some course where they also taught me some Matlab. One of the exercises was to draw the Sierpinski triangle using a method of progressing a point randomly. I was quite surprised at the time, because I thought the Sierpinski triangle was more of an analytical thing: you drew it using inverted triangles.

I wanted to check up on it, and it turns out the method has a name: Chaos Game.
To generate a Sierpinski triangle using this method, one starts with some point inside the triangle. Then, at each step, the next point is half the distance from the current point to one of the corners, selected at random. I think this method of generating the Sierpinski triangle is even easier than the analytical one.
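For reference, the whole chaos game fits in a few lines (a sketch with the plotting left out; the corner coordinates are my arbitrary choice):

```python
import random

CORNERS = ((0.0, 0.0), (1.0, 0.0), (0.5, 1.0))

def chaos_game(n_points, start=(0.25, 0.25)):
    x, y = start  # any point inside the triangle
    points = []
    for _ in range(n_points):
        cx, cy = random.choice(CORNERS)      # pick a corner at random
        x, y = (x + cx) / 2.0, (y + cy) / 2.0  # move halfway towards it
        points.append((x, y))
    return points

points = chaos_game(10000)
# every generated point stays inside the triangle's bounding box
assert all(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 for x, y in points)
```

Feeding `points` to pylab’s scatter plot reproduces the image.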

I used pylab (matplotlib) to create this image.
As I usually do, I also wanted to draw it using ascii-art. However, I must confess, I am not satisfied with the result:

```
              %6
             6*%8
            %8  66
           %6%*6%%6
          6%'    '66
         6866'  '6%86
        %6* 86 '%8 *66
       %**%%'***%6***%%
      8'6            6*8
     6%%*8          %*'*6
    6%6'666        86# 86%
   %%*''**%6      %**''*'%6
  *%8'   '6%6    668'   '#%%
 %% '6' '#*'6%  *% %8' '6''6*
%68%#6#*#68*86*%6#%8%8'6%6*6%*
```

The code isn’t really good, as I didn’t put much thought into it and just hacked it up. Still, I’m putting it up, and I might improve it someday.


Fractals in 10 minutes No. 4: Mandelbrot and Julia in Ascii Art

I felt like it’s about time I tackled the Mandelbrot and Julia sets, in Ascii-Art. Heck, the Mandelbrot fractal is on the logo of this Blog! However, being the most well-known fractal, this issue was tackled already, with satisfactory results.

Still, I took the ten minutes required to draw them in Python, using numpy:

```
def get_mandelbrot(x, num_iter=10):
    c = x
    for i in range(num_iter):
        x = x**2 + c
    return abs(x) > 2

def get_julia(x, c, num_iter=10):
    for i in range(num_iter):
        x = x**2 + c
    return abs(x) > 2
```

“Hey!” you might say, “there’s no loop over the pixels in here!”. Indeed, with numpy, loops over array elements can sometimes be avoided. When the expression x**2 + c is applied to an array x, it is applied element-wise, allowing for implicit loops. The actual magic happens in the following function:

```
def get_board(bottom_left, top_right, num_x, num_y):
    x0, y0 = bottom_left
    x1, y1 = top_right
    x_values = numpy.arange(x0, x1, (x1-x0)/float(num_x))
    y_values = numpy.arange(y0, y1, (y1-y0)/float(num_y))
    return numpy.array([[x+1j*y for x in x_values] for y in y_values])
```

The result of get_board is used as the “x input” later on. It should be noted, though, that while that’s a cute trick this time, it might grow unwieldy for more complicated computations – for example, making each element reflect the iteration number on which it “escaped”.
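That said, even the escape-iteration variant can stay loop-free over the pixels – here’s one way I’d sketch it (my own addition, looping only over iterations):

```python
import numpy as np

def mandelbrot_escape(c, num_iter=20):
    z = np.zeros_like(c)
    escape = np.zeros(c.shape, dtype=int)  # 0 means "never escaped"
    for i in range(1, num_iter + 1):
        z = np.where(escape == 0, z**2 + c, z)  # freeze points that already escaped
        just_escaped = (escape == 0) & (np.abs(z) > 2)
        escape[just_escaped] = i
    return escape

esc = mandelbrot_escape(np.array([0 + 0j, 3 + 0j]))
assert esc[0] == 0  # the origin never escapes
assert esc[1] == 1  # 3 escapes on the very first iteration
```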
So, here are the results:

```############################   ##############
###########################   ###############
##########################     ##############
#####################                ########
####################                #########
############ #######                 ########
############                         ########
#########                           #########
#########                           #########
############                         ########
############ #######                 ########
####################                #########
#####################                ########
##########################     ##############
###########################   ###############

############################################################
############################################################
############################################################
############################ ###############################
######################### ##   #############################
############## #######           ###########################
######## ##      ###              ##########  ##############
#########  ###        ###         ######       #############
##############       ######         ###        ###  ########
###############  ##########              ###      ## #######
############################           ####### #############
##############################   ## ########################
################################ ###########################
############################################################
############################################################
```

Here’s the code. (With the usual bsd-license slapped on top of it :)


Fractal Memory Usage and a Big Number

In a previous post I said I’d talk about 4**(4**(4**4)).
First, about the number. I first saw it mentioned in a math lesson back in high-school, when the teacher wanted to demonstrate estimation abilities.
He wrote it on the board, like so:

4^(4^(4^4))

and then asked students to estimate how big that number is. Turns out this number is much bigger than most students thought. (For comparison, the number of atoms in the observable universe is about 10^80.)

Given that this number is so big, it should come as no surprise that computing it will consume all available memory.
What is interesting is the way the memory is consumed:

(It died with a MemoryError just before the next “jump”.)

When I first saw this I was fascinated. For those interested, it is generated by the long pow function in Python, implemented in longobject.c. According to the comment there, the algorithm is a left-to-right 5-ary exponentiation, taken from the “Handbook of Applied Cryptography”.
In short, the simpler algorithm, “left-to-right binary exponentiation” (used for smaller exponents), repeatedly squares the accumulator, and, according to the binary digits of the exponent, multiplies it by the base. 5-ary exponentiation basically does the same thing, but in each iteration it ‘eats’ five binary digits of the exponent instead of just one.
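The binary version is short enough to sketch from that description (a plain-Python illustration, not CPython’s actual C code):

```python
def binary_pow(base, exponent):
    result = 1
    for bit in bin(exponent)[2:]:  # scan the exponent's bits left to right
        result = result * result   # always square the accumulator
        if bit == '1':
            result = result * base  # multiply by the base when the bit is set
    return result

assert binary_pow(3, 13) == 3 ** 13
assert binary_pow(2, 0) == 1
```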

It might be very interesting to study algorithms and code according to their memory usage. Once at work I used a similar memory usage graph to optimize the memory usage of some algorithm that was eating too much memory. I don’t remember the details as it happened a few years ago, but I do remember that I recognized the “type of the memory eating”, which helped me to narrow the problem down to a specific area of code, where indeed the problem was located. (I consider this a case of programming-fu :)

Notes:
1) The reason I’m talking about Python is because it has arbitrarily sized longs built-in.
2) This graph was generated using Windows’ task manager. On my Ubuntu machine it looks quite different, almost a straight line. Any guesses why?


PyKoan – The Logic Game

As you can probably tell, I’m back from my undeclared hiatus. I’ve got lots of stuff to talk about, and I’ll be starting with PyKoan, one small project I’ve been working on lately in my spare time.

A few weeks ago I stumbled upon a Wikipedia article about a logic game. This game fascinated me, especially because of the “Gödel, Escher, Bach” connection. Quoting from Wikipedia:

Zendo is a game of inductive logic designed by Kory Heath in which one player (the “Master”) creates a rule for structures (“koans”) to follow, and the other players (the “Students”) try to discover it by building and studying various koans which follow or break the rule. The first student to correctly state the rule wins.

As it happens I’m also taking a mathematical logic course this semester. Having read about the game, I wanted to write a similar computer game. Hence – PyKoan.

PyKoan is a game where the goal is to discover some logical rule, for example, “For each x holds x%2 == 0”. This rule is applied to a koan – a list of integers. An example koan that “has Buddha nature” (follows the rule) is [0,2,8]. One which doesn’t is [1].

To implement this idea, I wrote an implementation of an expression tree very similar to the one used in vial, but with a different implemented language. I also did some experimentation with the design. Since I’ve been talking a lot about expression trees without giving a full explanation, in a future post I’ll write about the implementation used in PyKoan.

So far I didn’t code a lot of the game, just the expression tree framework, and a simple rule builder. When using Python’s interactive prompt, one can get a taste of how the game might feel:

```
In [19]: r = rulegen.create_rule(rulegen.easy_grammer, rulegen.easy_grammer_start)

In [20]: r.eval([])
Out[20]: False

In [21]: r.eval([0])
Out[21]: False

In [22]: r.eval([1])
Out[22]: False

In [23]: r.eval([2])
Out[23]: True

In [24]: r.eval([2,2])
Out[24]: True

In [25]: r.eval([2,2,3])
Out[25]: True

In [26]: r.eval([3])
Out[26]: False

In [27]: r.eval([4])
Out[27]: False

In [28]: print r
x exists such that x == 2
```

Here is how I would generate such an expression manually:

```
In [3]: from exptree import exps

In [4]: exps.Exists('x', exps.Eq('x', 2))
Out[4]: exps.Exists(exps.Var(x), exps.Eq(exps.Var(x), exps.Imm(2)))

In [5]: print _
x exists such that x == 2
```

The game has many interesting possibilities for research, for example, computer players. Other possibilities include “just” guessing koans (not the rule itself), creating interesting and playable rules, and so on. There’s a lot to do.

This time, instead of just publishing the (unfinished) code, I decided to do something different. I’ve opened a space on Assembla, with public read access. I’m opening this project for participation: if you want to join, leave a comment or send me an email.

(Since Assembla seems to be going through some connectivity issues right now, here’s a link to a copy).