Categories
Cryptography Math

10 Awesome Theorems & Results

When I look back at various mathematical courses I took, most have at least one theorem that I really liked. Usually I like it because the proof has a surprising trick, sometimes it’s because of the unexpected conclusion, or maybe the unintuitive feel to it. In other cases it’s just the elegance of the proof, or the result itself.
Without further ado, here’s a selection of my favorite theorems, in no particular order:

1. Linear Algebra: the Cayley Hamilton theorem. When I first grokked the fact that you can substitute matrices for the variables in polynomials, I was awestruck. Then I learned that you can define eA by using a Taylor series, but the fun doesn’t stop there. Using the Eigenvalues you can greatly simplify the calculation, and it all “works out the same” (i.e., if A=P-1DP and D is diagonal, then p(A) = P-1p(D)P. This works also for Jordan forms). Also, since you can show that complex numbers are isomorphic to the 2×2 matrices of the form [[a, b], [-b, a]], and that the calculations were exactly the same – well, everything “fell into place for me”. At the time it seemed to be one of the joys of Mathematics.

2. Calculus: the Bolzano-Weierstrass Theorem. One of the first non trivial results you learn in calculus, I originally learned the version that says: “Every bounded infinite set has a limit point”, and its proof was a bit more elegant in my eyes than the proof of the Wikipedia version. I liked it so much that one time when I was in boot camp in the service, I worked it out again just to keep my mind working. Good times.

3. Probability: The elegant result of V(x) = E(V(x|y)) + V(E(x|y)). Just the sight of it makes one sigh with contentedness, and the result itself is very nice.

4. Calculus, again: Stokes’ theorem and its friends. Very useful and non intuitive, in layman’s terms it says that you can reason about what happens in an area just by knowing about its perimeter.

5. Numerical Analysis: Richardson Extrapolation: one of the most elegant forms of bootstrapping, you start with a simple approximation method as a building block, and at the end you get a very strong high-quality approximation.

6. Computability: The Parameter theorem. Especially elegant, it basically gives the mathematical definition of the “bind” function for function parameters. In simple terms it uses the source code of a function f(x, y), to find the source code of a function g(y) such that g(y) = f(a, y) for some a. The nice thing about it is that it works only on source code, without calling the function themselves.
This theorem had the added bonus that once I grokked it, the test in computability was very, very easy :)

7. Functional analysis: here it’s a relatively minor result that I ended up remembering distinctly: Given z1.. zn which are linearly independent in E, show that there exists a d such that for each w1…wn that follow ||wi – zi|| < d for each i, are also linearly independent. The footnote says that such a finite, linearly independent group is called stable. When visualizing I think of it this way: given a such a group, kick it. As long as you don’t kick it too strongly – it will stay linearly independent. Now that’s stable.

8. Mathematical Logic: The Compactness theorem: “a set of first-order sentences has a model if and only if every finite subset of it has a model”. One direction is almost trivial, but the other is deep. When studying for the test in this course, I remember being stuck for days on an exercise that required the use of this theorem. Once I fully understood the method of its use, it became a favorite.
(By the way, the exercise was the following: Let G a countable group of first order statements, and p a first order statement. Show that if p is true in every countable model of G, than G |= p.)

9. Cryptography: I’ve learned a bit of cryptography on my own before taking the cryptography course. When I did though, two methods were especially memorable: The first was the “Meet in the Middle” attack. Not to be confused with “Man in the Middle”, this method allows one to attack symmetric ciphers constructed by repeatedly applying a simpler cipher. This known plaintext attack got its name from its method of operation: the attacker calculates all possible decryptions the ciphertext and stores them in a lookup table. Then, he calculates all encryptions of the plaintext and looks them up in that lookup table. Once a result is found – the combination of the encryption and the decryption keys used is the final key of the composed cipher.

10. The second cryptography result that I liked was secret sharing. Trivial secret sharing is so simple, and yet effective, that when I first learned it I thought: “how come I didn’t think of this before?”.

There are obviously many more elegant theorems, some of which I’ve learned in my studies. I sure hope to learn a few more. Still, these are special. As a highschool math teacher once told us about the Pythagorean theorem: “I want you to remember the proof even if I wake you in the middle of the night”. The theorems in this short list come close to that ideal.

Now I wonder – what are your favorite theorems?

Categories
Challenges computer science Programming Python

The mathematics behind the solution for Challenge No. 5

If you take a look at the various solutions people proposed for the last challenge of generating a specific permutation, you’ll see that they are very similar. Most of them are based on some form of div-mod usage. The reason this is so, is because all of these solutions are using the Factorial Base.

What does that mean?
Note that we usually encounter div-mods when we want to find the representation of a number in a certain base. That should already pique your interest. Now consider that a base’s digits need not have the same weight. For example, consider how we count the number of seconds since the start of the week:

seconds of the last minute, A (at most 60-1)
minutes of the last hour, B (at most 60-1)
hours of the last day, C (at most (24-1)
days of the last week, D (at most 7-1)

So given A, B, C, D, we would say that the number of seconds is:
A + 60*B + 24*C + 7*D. This certainly looks like a base transformation. To go back, we would use divmod.

The factorial base is just the same, with the numbers n, n-1, … 1. Note that in the factorial base, you can only represent a finite number of numbers – n!. This should not be surprising – this is what we set out to do in the first place!
The thing that I found really amazing about this is that all the people to whom I posed this challenge came up with almost the same “way” of solving it.

Other interesting curiosities regarding bases can be found in Knuth’s book, “The Art of Computer Programming”, volume 2, Section 4.1.

Categories
Algorithms Math Programming Projects Python Utility Functions

Fast Peak Autocorrelation

So, I was at geekcon. It was a blast.
The lunar lander
There were many interesting projects, and I didn’t get to play with them all. I did get to work a bit on the Lunar Lander from last year, and this year it was finished successfully. My part was the PC game which interfaced with the microcontroller controlling the lander. As you probably guessed, it was written in Python.

This year, as promised, I worked on the Automatic Improviser. I worked on it with Ira Cherkes.

While the final version worked, it didn’t work well enough, and there is still much work to be done. Still, we had excellent progress.
By the way, I know this subject has been tackled before, and I still wanted to try it myself, without reading too much literature about it.

One of the components of the system is a beat recognizer. My idea to discover the beat is simple: find the envelope (similar to removing AM modulation), and then find the low “frequency” of the envelope.
Instead of doing a Fast Fourier Transform for beat recognition, we were advised that autocorellation will do the trick better and faster. However, when trying to autocorellate using scipy.signal.correlate we discovered that autocorellation was too slow for real time beat recognition, and certainly wasteful.

To solve this issue, we decided to first do peak detection on the envelope, and then autocorellate the peaks. Since there shouldn’t be too many peaks, this has the potential of being really quick. However, there was no standard function to do autocorellation of peaks, so we implemented it ourselves. We were pretty rushed, so we worked fast. Here’s the code:

def autocorrelate_peaks(peaks):
    peaks_dict = dict(peaks)
    indexes = set(peaks_dict.keys())
    deltas = set()
    for i in indexes:
        for j in indexes:
            if j>i:
                continue
            deltas.add(i-j)
 
    result = {}
    for d in deltas:
        moved = set(i+d for i in indexes)
        to_mult = moved & indexes
        assert to_mult <= indexes
        s = sum(peaks_dict[i-d]*peaks_dict[i] for i in to_mult)
        result[d] = s
    return result

This function takes as input a list of tuples, each of the form (peak_index, peak_value), and returns a mapping between non-zero corellation offsets and their values.
Here’s is a sample run:

In [7]: autocorrelate_peaks([(2, 2), (5,1), (7,3)])
Out[7]: {0: 14, 2: 3, 3: 2, 5: 6}

After implementing this function, our recognition loop was back to real-time, and we didn’t have to bother with optimizing again.

Categories
Math Programming Python

Checking the ulam spiral

In the following post, the Ulam spiral is described. It’s a very simple object – write down consecutive natural numbers starting from 41 in a square spiral. Curiously, the numbers on the diagonal are primes:

Ulam spiral

Reading this post, I immediately wanted to check how long this property holds. The original blog post suggests:

Remarkably, all the numbers on the red diagonal are prime — even when the spiral is continued into a 20 × 20 square.

Yet it doesn’t mention if it stops there or not. Without peeking (in Wikipedia), I wrote a simple program to check:

import sys
import psyco
psyco.full()
 
UP = 0
LEFT = 1
DOWN = 2
RIGHT = 3
 
DIRECTIONS = {UP: (0, -1),
              LEFT: (-1, 0),
              DOWN: (0, 1),
              RIGHT: (1, 0)}
 
def generate_ulam_seq(n):
    start = 41
    x = 0
    y = 0
    min_x = 0
    max_x = 0
    min_y = 0
    max_y = 0
    value = start
    direction = UP
    for i in xrange(n):
        yield x, y, value
        value += 1
        add_x, add_y = DIRECTIONS[direction]
        x, y = x + add_x, y + add_y
        if x < min_x:
            direction = (direction+1) % 4
            min_x = x
        if x > max_x:
            direction = (direction+1) % 4
            max_x = x
        if y < min_y:
            direction = (direction+1) % 4
            min_y = y
        if y > max_y:
            direction = (direction+1) % 4
            max_y = y
 
def is_prime(n):
    return all(n%x != 0 for x in xrange(2, int((n**0.5) + 1)))
 
def main():
    for x, y, v in generate_ulam_seq(int(sys.argv[1])):
        if x == y and not is_prime(v):
            print x, y, v
 
if __name__ == '__main__':
    main()

Running it, we get:

> ulam.py 10000
20 20 1681
-21 -21 1763
22 22 2021
-25 -25 2491
28 28 3233
-33 -33 4331
38 38 5893
-41 -41 6683
41 41 6847
42 42 7181
-44 -44 7697
-45 -45 8051
-46 -46 8413
48 48 9353

So the property doesn’t hold for a very long time. Reading up on Wikipedia, it seems the Ulam spiral still has interesting properties related to primes in higher sizes. Maybe I’ll plot that one as well in a future post.

PS: Regarding primality testing, this is a cute, quick&dirty one liner. In the range of numbers we’re discussing, it is ‘good enough’.

Categories
Math Programming Python

Fun with Matrices

I’ll let the code speak for itself:

In [81]: m = Matrix(array([[1.0,1.0],[0.0,1.0]]))
 
In [82]: def my_sqrt(x, num_iters):
   ....:     r = 0.5*x
   ....:     for i in xrange(num_iters):
   ....:             r = 0.5*(r+x/r)
   ....:     return r
   ....:
 
In [83]: m*m
Out[83]:
array([[ 1.,  2.],
       [ 0.,  1.]])
 
In [84]: my_sqrt(m*m, 10)
Out[84]:
array([[ 1.,  1.],
       [ 0.,  1.]])
 
In [85]: m
Out[85]:
array([[ 1.,  1.],
       [ 0.,  1.]])

It’s always fun to see the math work out. At first when I learned that e^A for a matrix A may also be defined using the Taylor series in the usual way, I was really amazed. It still amazes me that this stuff works out so well. This is another kind of beauty.

Categories
Math Programming Python

Computation over Zp in Python

Lately I’ve been working a lot on my Algebric Structures homework. One of the reasons I don’t blog as much as I should. While working on my homework, I had to factor some polynomials over Z5 – the field containing the numbers {0,1,2,3,4}. The trick was that given a 4th degree polynomial without any roots, you could only factor it to two 2nd degree polynomials.

That turns out not to be that hard, but a bit of work – especially if you dislike just computing values. So I wrote a simple script to solve this little problem for me. While writing that script, I saw that it could be really fun to write a general Zp class. So naturally, I fired Google up, and tried looking for existing implementations. I didn’t find any (except some python implementations of many number thoery algorithms). Being quite a small amount of work, I hacked up something. After finishing with it, I wrote a polynomial class that can accept any given “number class”. There is actually a public implementation of real valued polynomials in SciPy, but not for integer valued or Zp-valued polynomials. After some more work my polynomial class was also finished. I actually used that class for some computations later on.

I added polynomial division, just for the heck of it, even though I didn’t use it. I hope this little script will be useful to someone.

Categories
Algorithms Math

Reachability on arbitrary maps

The other day I had an idea. What if you took a map of some country, and from each coordinate computed the time it took to any other coordinate. The basic assumptions are that you can drive on road by car from parking lot to parking lot, and from parking lot to another point you can get only by foot. You can also add trains, and airplanes, each having less routes, but are actually faster. So for example, traveling from city A’s center to city B’s center is quite fast, but traveling from city A’s center to some point near the road from A to B will take some time.

This is a bit similar to computations done by game AI’s to determine how to get from point A to point B on a game board.

Now let us assume assume we want to draw the resulting graph on some kind of a 3D map, where the distance of each point from other points is determined by the time it takes to reach from that point to the others. Actually, it might not always be possible to plot this graph on a 3D map as a friend pointed to me, as 3 neighboring points already lock our point in space. But let us say that you can plot this 3D map. A’s center and B’s center will be near each other, a bit like a paper folding – which is incidentally the most common depiction of ‘science fiction wormholes’ explanations to laymen. It was pretty nice to try and visualize this kind of map. It is also interesting to think what is the least number of dimensions required to plot a given map.

Although I couldn’t come up with ideas about how this kind of visualization might be useful, it was still fun to to think of.