<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Algorithm Blogs &#187; challenge</title>
	<atom:link href="http://www.algorithm.co.il/blogs/index.php/tag/challenge/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.algorithm.co.il/blogs</link>
	<description>Algorithms, for the heck of it</description>
	<lastBuildDate>Thu, 22 Apr 2010 21:04:32 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Small Programming Challenge no. 5 &#8211; Generating a Permutation</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/small-programming-challenge-no-5-generating-a-permutation/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/small-programming-challenge-no-5-generating-a-permutation/#comments</comments>
		<pubDate>Wed, 11 Nov 2009 19:43:45 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[challenge]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=493</guid>
		<description><![CDATA[I thought of this one quite a long time ago, and I believe that the idea behind it is pretty nice mathematically. I got the idea for it from Knuth&#8217;s &#8220;The Art of Computer Programming&#8221;.
The challenge is simple:
write a function that receives as arguments two numbers, n and num, such that 0 &lt;= num &lt; n! [...]]]></description>
			<content:encoded><![CDATA[<p>I thought of this one quite a long time ago, and I believe that the idea behind it is pretty nice mathematically. I got the idea for it from Knuth&#8217;s &#8220;The Art of Computer Programming&#8221;.</p>
<p>The challenge is simple:<br />
write a function that receives as arguments two numbers, n and num, such that 0 &lt;= num &lt; n!. This function needs to return an array (list) representing a permutation of the numbers 0..n-1. For each possible num, the function needs to return a different permutation, such that over all values of num, all possible permutations are generated. The order of permutations is up to you.</p>
<p>The function you write should do this in at most O(n) time &#038; space (various O(n log n) solutions are also acceptable).<br />
Write your solutions in the comments, in [ LANG ] [/ LANG ] blocks (without the spaces) where LANG is preferably Python :). I will post my solution in a few days. As usual, the most efficient &#038; elegant solution wins.</p>
<p>Go!</p>
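<p>To make the expected interface concrete, here is a minimal sketch of one possible approach, not necessarily the intended solution. It assumes that arithmetic on num takes constant time, and the name perm_from_num is made up for this illustration. Each step peels one mixed-radix "digit" off num and uses it as a swap index:</p>

```python
def perm_from_num(n, num):
    """Map num in range(n!) to a distinct permutation of 0..n-1."""
    perm = list(range(n))
    for i in range(n, 0, -1):
        # r is the next mixed-radix digit of num; it lies in range(i)
        num, r = divmod(num, i)
        # move the chosen element into position i-1, which is final from now on
        perm[i - 1], perm[r] = perm[r], perm[i - 1]
    return perm


# all 3! = 6 values of num give different permutations of [0, 1, 2]
all_perms = [perm_from_num(3, num) for num in range(6)]
```

<p>Each distinct num produces a distinct sequence of swap indexes, which is why no permutation is repeated.</p>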
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/small-programming-challenge-no-5-generating-a-permutation/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>My solution to the counting sets challenge</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/python/my-solution-to-the-counting-sets-challenge/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/python/my-solution-to-the-counting-sets-challenge/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 15:00:51 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sets]]></category>
		<category><![CDATA[Solution]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=325</guid>
		<description><![CDATA[A few days ago, I wrote up a challenge - to count the number of sets a given set is contained in.
In the comments, I touched briefly on the original problem from which the challenge was created, and I'll describe it in more depth here.
In the problem, I am given an initial group of sets, [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago, I wrote up a <a href="http://www.algorithm.co.il/blogs/index.php/computer-science/small-python-challenge-no-4-counting-sets/">challenge</a> - to count the number of sets a given set is contained in.</p>
<p>In the comments, I touched briefly on the original problem from which the challenge was created, and I'll describe it in more depth here.<br />
In the problem, I am given an initial group of sets, and then an endless 'stream of sets'. For each of the sets in the stream, I have to measure its uniqueness relative to the initial group of sets. A set that is contained in only one set from the initial group is very unique; one that is contained in ten, not so much.</p>
<p>So how to solve this problem? My original solution is somewhat akin to the classic "lion-in-the-desert" problem, but more like the "blood test" story. I didn't find a link to the story, so I'll give it as I remember it. </p>
<p>In an army somewhere, it was discovered that at least one of the soldiers was sick and so had to be put in isolation until he recovered. It was only possible to check for the disease via a blood test, but tests were expensive, and they didn't want to test all of the soldiers. What did they do?</p>
<p>They took enough blood from each soldier. Now, from each sample they took a little bit, and divided the samples into two groups. They mixed together the samples of each group, and tested the mixed sample. If the sample was positive - they repeated the process for the blood samples of all the soldiers in the matching group.</p>
<p>Now my solution is clear: let's build a tree of set unions. At the bottom level will be the unions of couples of sets. At the next level, unions of couples of couples of sets. And so on, until we end up with just two sets, or even just one, if we are not sure the set is contained in any of the initial sets.</p>
<p>Testing is just like in the story. We'll start at the two biggest unions, and work our way down. There is an optimization though: if a set appears more than, say, 10 times, it's not very unique, and its score is zeroed. In that case, we don't have to go down all the way, but can stop as soon as we pass the 10 "positive result" mark.</p>
<p>Here's the code:</p>
<div class="syntax_hilite">
<div id="python-2">
<div class="python"><span style="color: #00007f;font-weight:bold;">class</span> SetGroup<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, set_list<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; cur_level = <span style="color: #008000;">list</span><span style="color: black;">&#40;</span>set_list<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008000;">self</span>.<span style="color: #000000;">levels</span> = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">while</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>cur_level<span style="color: black;">&#41;</span>&gt; <span style="color: #ff4500;">1</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008000;">self</span>.<span style="color: #000000;">levels</span>.<span style="color: #000000;">append</span><span style="color: black;">&#40;</span>cur_level<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cur_level = <span style="color: black;">&#91;</span>union<span style="color: black;">&#40;</span>couple<span style="color: black;">&#41;</span> <span style="color: #00007f;font-weight:bold;">for</span> couple <span style="color: #00007f;font-weight:bold;">in</span> blocks<span style="color: black;">&#40;</span>cur_level, <span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #008000;">self</span>.<span style="color: #000000;">levels</span>.<span style="color: #000000;">reverse</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></p>
<p>&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">def</span> count<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, some_set, max_appear = <span style="color: #008000;">None</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; indexes = <span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">for</span> level <span style="color: #00007f;font-weight:bold;">in</span> <span style="color: #008000;">self</span>.<span style="color: #000000;">levels</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; indexes = <span style="color: #dc143c;">itertools</span>.<span style="color: #000000;">chain</span><span style="color: black;">&#40;</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>*x <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> indexes<span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>*x+<span style="color: #ff4500;">1</span> <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> indexes<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; indexes = <span style="color: black;">&#40;</span>x <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> indexes <span style="color: #00007f;font-weight:bold;">if</span> x &lt;len<span style="color: black;">&#40;</span>level<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; indexes = <span style="color: black;">&#91;</span>x <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> indexes <span style="color: #00007f;font-weight:bold;">if</span> some_set &lt;= level<span style="color: black;">&#91;</span>x<span style="color: black;">&#93;</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">if</span> max_appear <span style="color: #00007f;font-weight:bold;">is</span> <span style="color: #00007f;font-weight:bold;">not</span> <span style="color: #008000;">None</span> <span style="color: #00007f;font-weight:bold;">and</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>indexes<span style="color: black;">&#41;</span>&gt;= max_appear:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">return</span> max_appear<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">return</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>indexes<span style="color: black;">&#41;</span></div>
</div>
</div>
<p></p>
<p>Here's a link to the <a href="http://www.algorithm.co.il/sitecode/counting.py">full code</a>.</p>
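<p>The class above relies on two helpers, blocks and union, which are defined only in the full code. As a rough sketch of what they might look like (these are my assumptions about their behavior, not a copy of the linked file), along with one step of building the tree:</p>

```python
def blocks(seq, size):
    """Split seq into consecutive chunks of at most `size` items (assumed behavior)."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def union(sets):
    """Return the union of an iterable of sets (assumed behavior)."""
    result = set()
    for s in sets:
        result |= s
    return result


# one step of the tree construction: four initial sets become two unions
initial = [{1, 2}, {2, 3}, {4}, {1, 4, 5}]
next_level = [union(couple) for couple in blocks(initial, 2)]
```

<p>With an odd number of sets, the last chunk holds a single set, and its "union" is just a copy of that set.</p>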
<p>I didn't implement this solution right away. At first, I used the naive approach, of checking against each set. Then, when it proved to be too slow, I tried implementing the solution outlined by <a href="http://www.algorithm.co.il/blogs/index.php/computer-science/small-python-challenge-no-4-counting-sets/#comment-33424">Shenberg</a> and <a href="http://www.algorithm.co.il/blogs/index.php/computer-science/small-python-challenge-no-4-counting-sets/#comment-33366">Eric</a> in the comments to the challenge. Unfortunately, their solution proved to be very slow as well. I believe it's because some elements appear in almost all of the sets, and so computing the intersection for these elements takes a long time.<br />
Although originally I thought that my solution would suffer from some serious drawbacks (can you see what they are?), the max_appear limit removed most of the issues.</p>
<p>Implementing this solution was a major part of bringing down the running time of the complete algorithm for the full problem I was solving, from about 2 days to about 15-20 minutes. That was one fun optimizing session :)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/python/my-solution-to-the-counting-sets-challenge/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Small Python Challenge No. 4 &#8211; Counting Sets</title>
		<link>http://www.algorithm.co.il/blogs/index.php/computer-science/small-python-challenge-no-4-counting-sets/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/computer-science/small-python-challenge-no-4-counting-sets/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 12:00:58 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Challenges]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Sets]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315</guid>
		<description><![CDATA[This is a problem that I encountered a short while ago. It seems like it could be easily solved very efficiently, but it's not as easy as it looks.
Let's say that we are given N (finite) sets of integers - S. For now we won't assume anything about them. We are also given another set, [...]]]></description>
			<content:encoded><![CDATA[<p>This is a problem that I encountered a short while ago. It seems like it could be easily solved very efficiently, but it's not as easy as it looks.<br />
Let's say that we are given N (finite) sets of integers - S. For now we won't assume anything about them. We are also given another set, a. The challenge is to write an efficient algorithm that will count how many sets from S contain a (or how many sets from S a is a subset of).</p>
<p>Let's call a single test a comparison. The naive algorithm is of course checking each of the sets, which means exactly N comparisons. The challenge - can you do better? When will your solution outperform the naive solution?</p>
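<p>For reference, the naive baseline could be sketched like this. The name count_naive and the sample data are made up for illustration:</p>

```python
def count_naive(sets, a):
    """Count how many of the given sets contain a, using one comparison per set."""
    return sum(1 for s in sets if a.issubset(s))


S = [{1, 2, 3}, {2, 3}, {3, 4}, {1, 2, 3, 4}]
count = count_naive(S, {2, 3})  # three of the four sets contain {2, 3}
```

<p>Any cleverer solution should return the same counts as this one, which makes it handy for testing.</p>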
<p>I will give my solution in a few days. Submit your solutions in the comments, preferably in Python. You can write readable code using [ python ] [ /python ] blocks, just without the spaces.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/computer-science/small-python-challenge-no-4-counting-sets/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Some Assembly Required No. 1</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/some-assembly-required-no-1/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/some-assembly-required-no-1/#comments</comments>
		<pubDate>Sat, 12 Apr 2008 20:17:05 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Assembly]]></category>
		<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[vial]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/index.php/programming/some-assembly-required-no-1/</guid>
		<description><![CDATA[I've been working on some of the instruction tests in vial, and I wanted to test the implementation of LOOP variants. My objective was to make sure the vial version is identical to the real CPU version (as discussed here). To achieve this, I had to cover all of the essential behaviors of LOOP.
Well, using [...]]]></description>
			<content:encoded><![CDATA[<p>I've been working on some of the instruction tests in vial, and I wanted to test the implementation of LOOP variants. My objective was to make sure the vial version is identical to the real CPU version (<a href="http://www.algorithm.co.il/blogs/index.php/programming/issues-in-writing-a-vm-part-1/">as discussed here</a>). To achieve this, I had to cover all of the essential behaviors of LOOP.</p>
<p>Well, using the framework <a href="http://www.ragestorm.net/blogs/?p=58">Gil and I wrote</a>, I hacked up some code that should cover the relevant cases:</p>
<div class="syntax_hilite">
<div id="python-4">
<div class="python">code_template = <span style="color: #483d8b;">""</span><span style="color: #483d8b;">"<br />
mov edx, ecx ; control the start zf<br />
mov ecx, eax ; number of iterations<br />
mov eax, 0 ; will hold the result, also an iteration counter<br />
loop_start:</p>
<p>&nbsp; &nbsp; cmp eax, ebx&nbsp; &nbsp; ; check if we need to change zf<br />
&nbsp; &nbsp; setz dh<br />
&nbsp; &nbsp; xor dh, dl&nbsp; &nbsp; &nbsp; ; if required, invert zf<br />
&nbsp; &nbsp; inc eax&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;; count the iteration<br />
&nbsp; &nbsp; cmp dh, 0&nbsp; &nbsp; &nbsp; &nbsp;; set zf</p>
<p>&nbsp; &nbsp; loop%s loop_start<br />
"</span><span style="color: #483d8b;">""</span><br />
<span style="color: #00007f;font-weight:bold;">for</span> loop_kind <span style="color: #00007f;font-weight:bold;">in</span> <span style="color: black;">&#91;</span><span style="color: #483d8b;">''</span>,<span style="color: #483d8b;">'z'</span>,<span style="color: #483d8b;">'nz'</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; code_text = code_template % loop_kind<br />
&nbsp; &nbsp; c = FuncObject<span style="color: black;">&#40;</span>code_text<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">for</span> start_zf_value <span style="color: #00007f;font-weight:bold;">in</span> <span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span>,<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">for</span> num_iters <span style="color: #00007f;font-weight:bold;">in</span> <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>,<span style="color: #ff4500;">4</span>,<span style="color: #ff4500;">10</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">for</span> when_zf_changes <span style="color: #00007f;font-weight:bold;">in</span> <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>,<span style="color: #ff4500;">2</span>,<span style="color: #ff4500;">15</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; c<span style="color: black;">&#40;</span>num_iters, when_zf_changes, start_zf_value<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; c.<span style="color: #000000;">check</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></div>
</div>
</div>
<p></p>
<p>Note that c(...) executes the code both on vial's VM and on the real CPU. c.check() compares their return values (EAX) and flags after the execution. I also wanted to avoid other kinds of jumps in this test.</p>
<p>To check that the code ran the same number of times, I returned EAX as the number of iterations.<br />
All the games with edx are there to make sure that I'm testing different zf conditions.</p>
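<p>To get a feeling for what EAX should come out as, here is a small pure-Python model of the snippet. It is only a sketch: it assumes the three arguments arrive in eax, ebx and ecx in that order (inferred from the comments in the template), and simulate_loop is a made-up name, not part of vial or the test framework:</p>

```python
def simulate_loop(loop_kind, num_iters, when_zf_changes, start_zf_value):
    """Model the snippet above; returns the final EAX (iterations executed)."""
    dl = start_zf_value & 0xFF        # mov edx, ecx (only dl is read later)
    ecx = num_iters & 0xFFFFFFFF      # mov ecx, eax
    ebx = when_zf_changes
    eax = 0                           # mov eax, 0
    while True:
        dh = int(eax == ebx) ^ dl     # cmp eax, ebx / setz dh / xor dh, dl
        eax += 1                      # inc eax
        zf = (dh == 0)                # cmp dh, 0
        ecx = (ecx - 1) & 0xFFFFFFFF  # every LOOP variant decrements ecx first
        if ecx == 0:
            break
        if loop_kind == 'z' and not zf:   # loopz goes on only while zf is set
            break
        if loop_kind == 'nz' and zf:      # loopnz goes on only while zf is clear
            break
    return eax
```

<p>For example, simulate_loop('z', 10, 2, 0) gives 3: zf stays set for the first two iterations, gets cleared in the third, and loopz falls through.</p>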
<p><strong>The challenge for today:</strong><br />
Can you write a shorter assembly snippet that tests the same thing?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/some-assembly-required-no-1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Solution for the Random Selection Challenge</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/python/solution-for-the-random-selection-challenge/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/python/solution-for-the-random-selection-challenge/#comments</comments>
		<pubDate>Sat, 23 Feb 2008 23:50:23 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Math]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Statistics]]></category>
		<category><![CDATA[Utility Functions]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[probability distribution]]></category>
		<category><![CDATA[random selection]]></category>
		<category><![CDATA[Solution]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/index.php/programming/python/solution-for-the-random-selection-challenge/</guid>
		<description><![CDATA[A few days ago, I wrote up two small Python Challenges. Several people have presented solutions for the first challenge, and I also posted my solution in the comments there.
However, the second challenge remained unsolved, and I will present a solution for it in this post. 

First, a quick reminder:
Given a continuous, solvable bell-shaped function [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago, I <a href="http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-3-random-selection/">wrote up</a> two small Python Challenges. Several people have presented solutions for the first challenge, and I also posted my solution in the comments there.</p>
<p>However, the second challenge remained unsolved, and I will present a solution for it in this post. </p>
<p><span id="more-85"></span></p>
<p>First, a quick reminder:</p>
<blockquote><p>Given a continuous, solvable bell-shaped function p(x) in [a,b], and a uniform random selection function (such as random.random()), write a function that will yield a distribution proportional to p(x)</p></blockquote>
<p>The requirement that p(x) be solvable is critical for the solution.<br />
As an example of such a function, consider y=e^(-x**2), in [-1,1]. It is bell shaped, and the solutions are given by [ -sqrt(-log(y)), sqrt(-log(y)) ].</p>
<p>I'll start with the code for the solution, as it might be clearer for the less mathematically inclined:</p>
<div class="syntax_hilite">
<div id="python-7">
<div class="python"><span style="color: #00007f;font-weight:bold;">def</span> random_dist<span style="color: black;">&#40;</span>dist_func, dist_solve_func<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; y = <span style="color: #dc143c;">random</span>.<span style="color: #dc143c;">random</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; a,b = dist_solve_func<span style="color: black;">&#40;</span>y<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; x = a+<span style="color: #dc143c;">random</span>.<span style="color: #dc143c;">random</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>*<span style="color: black;">&#40;</span>b-a<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">return</span> x</div>
</div>
</div>
<p></p>
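<p>To try the function on the example above, something like the following sketch could be used. The helpers p and solve_p are hypothetical stand-ins (the real ones live in my random_select module), the clamping of the solutions to [-1,1] is my assumption, and y is drawn from (0,1] so that log(y) is always defined:</p>

```python
import math
import random

A, B = -1.0, 1.0  # the interval from the example


def p(x):
    return math.exp(-x ** 2)


def solve_p(y):
    """Return the interval of x in [A, B] where p(x) >= y, for y in (0, 1]."""
    half_width = math.sqrt(-math.log(y))
    # clamp, because for small y the solutions fall outside [A, B]
    return max(A, -half_width), min(B, half_width)


def random_dist(dist_func, dist_solve_func):
    # same two steps as above, except that y avoids 0 so log(y) never fails
    y = 1.0 - random.random()
    a, b = dist_solve_func(y)
    return a + random.random() * (b - a)


samples = [random_dist(p, solve_p) for _ in range(10000)]
```

<p>A histogram of samples can then be compared against p(x), which is what the empirical check further down does.</p>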
<p><strong>Why does this work? (warning: math ahead)</strong> </p>
<p>We need to generate a probability distribution proportional to p(x). </p>
<p>(Note: the distribution will be exact (and not proportional) only if the integral of p(x) in [a,b] is 1. This is equivalent to requiring that the probabilities sum to 1 in the discrete case. We'll call such a function p(x) a standard distribution function.)</p>
<p>To generate the distribution, we note that if we select at random a y in [0,1], then for each x, the probability that p(x) >= y is p(x). For example, if p(x) = 0.5, then given a random number 0 &lt;= y &lt;= 1, the condition p(x) >= y will hold exactly half the time.<br />
(If p(x) is a non-standard function, then the probability is instead only proportional to p(x))</p>
<p>Now, solving the problem is easy - we select a y at random. We find all the x such that p(x) >= y. We select one of these x at random, and the resulting x will be distributed proportionally to p(x). Selecting such an x at random becomes easy because of our requirements that p(x) be bell shaped and solvable. Given x0, x1 such that y=p(x0) and y=p(x1), we know that all x such that p(x)>y lie in between these two solutions.<br />
Hence, the second random() call chooses an x in the range [x0, x1].</p>
<p>In the following illustration of a bell function, two areas are marked. One is the result of selecting y=0.3, and the other of selecting y=0.2. The second area is indeed larger than the first, and contains the first area, as expected of a bell function. Thus, each x will be selected proportionally to p(x).</p>
<p><img src="http://www.algorithm.co.il/sitecode/random_dist3.png" border="2" alt="Illustration of a bell function" /></p>
<p><strong>Empirical Proof of the solution:</strong><br />
Again, code speaks louder (but more tersely) than words.</p>
<div class="syntax_hilite">
<div id="python-8">
<div class="python"><span style="color: #808080; font-style: italic;">#Choose a lot of random numbers </span><br />
<span style="color: #808080; font-style: italic;">#distributed proportionally to p</span><br />
result = <span style="color: black;">&#91;</span>random_select.<span style="color: #000000;">random_dist</span><span style="color: black;">&#40;</span>random_select.<span style="color: #000000;">p</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; random_select.<span style="color: #000000;">solve_p</span><span style="color: black;">&#41;</span> <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">for</span> i <span style="color: #00007f;font-weight:bold;">in</span> <span style="color: #008000;">xrange</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">10000</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span><br />
<span style="color: #808080; font-style: italic;">#divide the numbers to buckets:</span><br />
<span style="color: #808080; font-style: italic;">#truncate to 2 digits after the point</span><br />
d = <span style="color: black;">&#91;</span>numpy.<span style="color: #000000;">floor</span><span style="color: black;">&#40;</span>x*<span style="color: #ff4500;">100</span><span style="color: black;">&#41;</span> <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> result<span style="color: black;">&#93;</span><br />
h = <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span></p>
<p><span style="color: #808080; font-style: italic;">#count how many numbers ended up in each bucket</span><br />
<span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> d:<br />
&nbsp; &nbsp; h<span style="color: black;">&#91;</span>x<span style="color: black;">&#93;</span> = h.<span style="color: #000000;">get</span><span style="color: black;">&#40;</span>x,<span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>+<span style="color: #ff4500;">1</span>.<span style="color: #ff4500;">0</span>/<span style="color: #008000;">len</span><span style="color: black;">&#40;</span>result<span style="color: black;">&#41;</span></p>
<p><span style="color: #808080; font-style: italic;">#plot the 'height' of each bucket</span><br />
xvals = result<br />
yvals = <span style="color: black;">&#91;</span>h<span style="color: black;">&#91;</span>numpy.<span style="color: #000000;">floor</span><span style="color: black;">&#40;</span>x*<span style="color: #ff4500;">100</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span> <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> result<span style="color: black;">&#93;</span></p>
<p>
<span style="color: #808080; font-style: italic;">#plot our original distribution, </span><br />
<span style="color: #808080; font-style: italic;">#aligned with the graph (but not exactly).</span><br />
avg = <span style="color: #008000;">sum</span><span style="color: black;">&#40;</span>yvals<span style="color: black;">&#41;</span>/<span style="color: #008000;">len</span><span style="color: black;">&#40;</span>yvals<span style="color: black;">&#41;</span><br />
m = <span style="color: #008000;">max</span><span style="color: black;">&#40;</span>yvals<span style="color: black;">&#41;</span><br />
xvals2 = numpy.<span style="color: #000000;">arange</span><span style="color: black;">&#40;</span>-<span style="color: #ff4500;">3</span>,<span style="color: #ff4500;">3</span>,<span style="color: #ff4500;">0</span>.<span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><br />
yvals2 = <span style="color: black;">&#91;</span>random_select.<span style="color: #000000;">p</span><span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span>*m <span style="color: #00007f;font-weight:bold;">for</span> x <span style="color: #00007f;font-weight:bold;">in</span> xvals2<span style="color: black;">&#93;</span></p>
<p>pylab.<span style="color: #000000;">plot</span><span style="color: black;">&#40;</span>xvals, yvals, <span style="color: #483d8b;">"r+"</span><span style="color: black;">&#41;</span><br />
pylab.<span style="color: #000000;">plot</span><span style="color: black;">&#40;</span>xvals2, yvals2, <span style="color: #483d8b;">"b-"</span><span style="color: black;">&#41;</span></div>
</div>
</div>
<p></p>
<p>This code generates the following graph, which proves this solution works:<br />
<img src="http://www.algorithm.co.il/sitecode/random_dist2.png" border="2" alt="Proof of the random selection solution" /></p>
<p><strong>A few thoughts</strong><br />
This problem may be seen as a standard computer science 'computation with an oracle' problem. Here, our oracle is random.random(), and we want to compute a function. In these kinds of cases, it is customary to ask:<br />
1. What may be generally computed with this oracle?<br />
2. Given a specific problem, what is the minimum number of oracle calls required to solve it?</p>
<p>The answer to the first question is the easier one: we can generalize this solution to any continuous solvable distribution p(x) in [a,b], since the same reasoning applies. We just need to be able to find all x such that p(x)>y given a specific y. If our solutions are [x0,x1,x2,x3], and p(a) &lt; y, then we need to select an x from [x0,x1] U [x2,x3]. This is easily achieved in any number of ways. For example, select one of the ranges with the discrete selection (the first challenge), and then select an x from the selected range.</p>
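<p>Selecting an x from a union of ranges could be sketched as follows. This assumes a non-empty list of disjoint ranges, and it weights each range by its length so that the result is uniform over the whole union; the name choose_from_intervals is made up for this illustration:</p>

```python
import random


def choose_from_intervals(intervals):
    """Pick x uniformly from a union of disjoint (lo, hi) intervals."""
    lengths = [hi - lo for lo, hi in intervals]
    # discrete selection: pick an interval with probability
    # proportional to its length
    target = random.random() * sum(lengths)
    for (lo, hi), length in zip(intervals, lengths):
        if target < length:
            break
        target -= length
    # uniform selection inside the chosen interval
    return lo + random.random() * (hi - lo)


xs = [choose_from_intervals([(0.0, 1.0), (2.0, 3.0)]) for _ in range(1000)]
```

<p>Note that this uses two calls to random.random() per sample, just like the bell-shaped case.</p>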
<p>The answer to the second question is a bit more complicated. The solution described above for bell shaped curves solves the problem with just two calls to random(). To beat it, a solution with just one call is required. This seems daunting at first, but it is possible, even if complicated. What we want is a function that will convert the uniform density of random()'s output to our distribution's density.<br />
Consider the graph of a function as a path. If we select a point on the path by picking its y-coordinate uniformly, and check its x-coordinate, we'll see that the probability density of the x's is proportional to the absolute slope of the function at that point. Since the slope is the derivative, if we apply this process to the integral of our distribution function, we'll get the required distribution.</p>
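<p>This idea is usually known as inverse transform sampling. Here is a minimal sketch in Python 3, under an assumption of my own: the example distribution p(x) = sin(x) in [0,pi] is not from the challenge, I chose it because its integral can be inverted by hand. The normalized integral is F(x) = (1 - cos(x))/2, so solving F(x) = u gives x = acos(1 - 2u). The name sample_sin and the rand parameter are hypothetical.</p>

```python
import math
import random

def sample_sin(rand=random.random):
    """Draw x in [0, pi] with density proportional to sin(x), using one call."""
    u = rand()  # uniform height on the graph of F
    # Invert F(x) = (1 - cos(x)) / 2 to find the matching x.
    return math.acos(1.0 - 2.0 * u)
```

<p>For a distribution whose integral has no closed-form inverse, the same single call to random() could be combined with a numeric root finder, at the cost of more computation per sample.</p>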
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/python/solution-for-the-random-selection-challenge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Small Python Challenge No. 3 &#8211; Random Selection</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-3-random-selection/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-3-random-selection/#comments</comments>
		<pubDate>Sun, 10 Feb 2008 17:58:35 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Math]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[probability distribution]]></category>
		<category><![CDATA[random selection]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-3-random-selection/</guid>
		<description><![CDATA[This time I'll give two related problems, both not too hard.
Let's warm up with the first:
You have a mapping between items and probabilities. You need to choose each item with its probability.
For example, consider the items ['good', 'bad', 'ugly'], with probabilities of [0.5, 0.3, 0.2] respectively. Your solution should choose good with probability 50%, bad [...]]]></description>
			<content:encoded><![CDATA[<p>This time I'll give two related problems, both not too hard.</p>
<p>Let's warm up with the first:</p>
<p>You have a mapping between items and probabilities. You need to choose each item with its probability.</p>
<p>For example, consider the items ['good', 'bad', 'ugly'], with probabilities of [0.5, 0.3, 0.2] respectively. Your solution should choose good with probability 50%, bad with 30% and ugly with 20%.</p>
<p>I came to this challenge because just today I had to solve it, and it seems like a common problem. Hence, it makes sense to ask 'what is the best way?'.</p>
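<p>For comparison after you have tried it yourself, here is one straightforward approach, not necessarily the best one: build the running sum of the probabilities and look up a single random number in it. This is a sketch in Python 3; the name choose and the rand parameter are hypothetical.</p>

```python
import random
from bisect import bisect_right
from itertools import accumulate

def choose(items, probabilities, rand=random.random):
    """Return one item, picked with the probability given for it."""
    cumulative = list(accumulate(probabilities))
    # Scale by the total so small rounding errors in the sum do not matter.
    r = rand() * cumulative[-1]
    idx = bisect_right(cumulative, r)
    # Guard against r landing exactly on the last boundary.
    return items[min(idx, len(items) - 1)]
```

<p>With the example above, a random value below 0.5 yields 'good', a value between 0.5 and 0.8 yields 'bad', and anything higher yields 'ugly'.</p>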
<p>The second problem is slightly harder:</p>
<p>Assume a bell shaped function p(x) that you can 'solve'. This means that given a value y, you can get all x such that p(x)=y. For example, sin(x)^2 in [0,pi] is such a function. Given a function such as Python's random.random() that yields a uniform distribution of values in [0,1), write a function that yields a distribution proportional to p(x) in the appropriate interval.</p>
<p>For example, consider the function p(x) = e^(-x^2) in [-1,1]. Since p(0) = 1, and p(0.5)~0.779, the value 0 should be p(0)/p(0.5)~1.28 times more common than 0.5.</p>
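<p>The numbers in this example are easy to check. A short sketch in Python 3, where the helper name p simply mirrors the function above:</p>

```python
import math

def p(x):
    """The example function p(x) = e^(-x^2)."""
    return math.exp(-x ** 2)

ratio = p(0) / p(0.5)
print(round(p(0.5), 3), round(ratio, 2))  # prints: 0.779 1.28
```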
<p>As usual, the preferred solutions are the elegant ones. Go!</p>
<p><strong>note:</strong> please post your solutions in the comments, using [ python]...[ /python] tags (but without the spaces in the tags).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-3-random-selection/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Small Python Challenge No. 2 &#8211; LRU Cache</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-2-lru-cache/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-2-lru-cache/#comments</comments>
		<pubDate>Fri, 28 Dec 2007 10:20:52 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Utility Functions]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[LRU-cache]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-2-lru-cache/</guid>
		<description><![CDATA[Caching is easy. Consider the cache I used to optimize the recursive spring:


class _NotInDict&#40;object&#41;:
&#160; &#160; pass
_NotInDict = _NotInDict&#40;&#41;
def cached&#40;func&#41;:
&#160; &#160; cache = &#123;&#125;
&#160; &#160; def wrapper_func&#40;*args&#41;:
&#160; &#160; &#160; &#160; prev_result = cache.get&#40;args, _NotInDict&#41;
&#160; &#160; &#160; &#160; if prev_result is _NotInDict:
&#160; &#160; &#160; &#160; &#160; &#160; result = func&#40;*args&#41;
&#160; &#160; &#160; &#160; &#160; &#160; cache&#91;args&#93; = [...]]]></description>
			<content:encoded><![CDATA[<p>Caching is easy. Consider the cache I used to <a href="http://www.algorithm.co.il/blogs/index.php/programming/python/optimizing-the-recursive-spring/">optimize the recursive spring</a>:</p>
<div class="syntax_hilite">
<div id="python-10">
<div class="python"><span style="color: #00007f;font-weight:bold;">class</span> _NotInDict<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">pass</span><br />
_NotInDict = _NotInDict<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
<span style="color: #00007f;font-weight:bold;">def</span> cached<span style="color: black;">&#40;</span>func<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; cache = <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span><br />
&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">def</span> wrapper_func<span style="color: black;">&#40;</span>*args<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; prev_result = cache.<span style="color: #000000;">get</span><span style="color: black;">&#40;</span>args, _NotInDict<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">if</span> prev_result <span style="color: #00007f;font-weight:bold;">is</span> _NotInDict:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result = func<span style="color: black;">&#40;</span>*args<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cache<span style="color: black;">&#91;</span>args<span style="color: black;">&#93;</span> = result<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">return</span> result<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">return</span> prev_result<br />
&nbsp; &nbsp; <span style="color: #00007f;font-weight:bold;">return</span> wrapper_func</div>
</div>
</div>
<p></p>
<p><img src="http://www.algorithm.co.il/sitecode/treasure_chest_color.png" align="right" height="240" width="228" /></p>
<p>This kind of cache is simple and effective (especially for recursions), and may be used in all sorts of situations. However, sometimes you want a size limited cache. In that case you have to choose the criterion that determines which items to throw away. There are many kinds of criteria used; for further reading check out <a href="http://en.wikipedia.org/wiki/Cache_algorithms">Wikipedia</a>.</p>
<p>For now I'd like to discuss the LRU cache though. LRU stands for Least Recently Used, which means that you throw away the items you didn't use for a long time. Time in this case is measured by actions. I thought of this type of cache when I worked on the recursive spring. Since each step in the 'recursivation' used two samples of the previous step, caching was an obvious choice, and if I had to size limit my cache, LRU would be the type of cache to use, as you could be certain that the older samples would not have to be used until the next drawing.</p>
<p>The challenge for the weekend is to write an LRU cache in Python. The cache has to be general - support hashable keys and any cache size required. It has to be efficient - in the size of the cache and the time it takes for a lookup and an update. Once the standard requirements have been met, the big competition should be on elegance. I wrote a benchmark implementation, which is efficient, but not fast. Once I see some solutions, I'll talk a bit about mine, which is an interesting case-study.</p>
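<p>To see what such a cache buys you on a recursion, here is a small sketch in Python 3. The fib function and the stats counter are hypothetical examples of mine, and the decorator is restated (with a plain object() as the sentinel) only so that the block runs on its own.</p>

```python
_NOT_IN_DICT = object()  # sentinel, so that None can be a cached result

def cached(func):
    cache = {}
    def wrapper_func(*args):
        prev_result = cache.get(args, _NOT_IN_DICT)
        if prev_result is _NOT_IN_DICT:
            result = func(*args)
            cache[args] = result
            return result
        return prev_result
    return wrapper_func

stats = {"calls": 0}  # counts how often the real function body runs

@cached
def fib(n):
    stats["calls"] += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30), stats["calls"])  # prints: 832040 31
```

<p>Without the decorator the same call would run the function body more than a million times; with it, each argument from 0 to 30 is computed exactly once.</p>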
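<p>For readers who want a baseline to compare against, here is a minimal sketch of one possible LRU cache. It is not the benchmark implementation mentioned above. It assumes Python 3, where collections.OrderedDict offers move_to_end and popitem; the class name LRUCache and its get/put methods are hypothetical.</p>

```python
from collections import OrderedDict

class LRUCache:
    """Size limited mapping that evicts the least recently used key."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._data = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # a lookup counts as a use
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)  # an update counts as a use
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # drop the least recently used

    def __len__(self):
        return len(self._data)
```

<p>Lookups and updates take constant time on average, and the memory used is proportional to the cache size, since the ordered dictionary keeps both the hash table and the usage order.</p>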
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-2-lru-cache/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A classic programming challenge, in Python</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/python/a-classic-programming-challenge-in-python/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/python/a-classic-programming-challenge-in-python/#comments</comments>
		<pubDate>Wed, 26 Dec 2007 12:15:45 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Programming Philosophy]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[computer science]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[Quine]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/index.php/programming/python/a-classic-programming-challenge-in-python/</guid>
		<description><![CDATA[It has become a tradition for computer scientists to create various self-referential 'strange loops'. Traditions such as writing a compiler in the language it compiles are actually quite useful - and also very interesting. This tradition also branched to another one (also mentioned in the linked article) of writing programs that output their own [...]]]></description>
			<content:encoded><![CDATA[<p>It has become a tradition for computer scientists to create various self-referential 'strange loops'. Traditions such as writing a compiler in the language it compiles are actually quite useful - and also <a href="http://cm.bell-labs.com/who/ken/trust.html">very interesting</a>. This tradition also branched into another one (also mentioned in the linked article) of writing programs that output their own source (without disk access and other dirty tricks).</p>
<p>The challenge is obviously to write such a program in Python, in as few lines as possible. Here is <a href="http://www.algorithm.co.il/sitecode/self_printer.py" title="A Python script that prints its own source">my solution</a>, which is two lines long. I urge you to try it for yourself before looking; it is a very educational challenge. I'll be very much interested in seeing a one-liner for this problem, or a proof that such a one-liner does not exist.</p>
<p>If you are interested in the bigger challenge, of writing an interpreter for Python in Python itself, go check out <a href="http://codespeak.net/pypy/dist/pypy/doc/news.html">PyPy</a> first.</p>
<p>For those interested in other 'strange loops', find a copy of <a href="http://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach">'Godel Escher Bach'</a>. If you happen to live in Israel, and can come to Haifa, I might even lend you my copy (once I get it back :)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/python/a-classic-programming-challenge-in-python/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Small Python Challenge No. 1</title>
		<link>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-1/</link>
		<comments>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-1/#comments</comments>
		<pubDate>Sun, 26 Aug 2007 23:14:16 +0000</pubDate>
		<dc:creator>lorg</dc:creator>
				<category><![CDATA[Challenges]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[challenge]]></category>

		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-1/</guid>
		<description><![CDATA[I bet I already wrote that sometime earlier, but I found myself today writing the following function:
def blocks(seq, block_len):
    """blocks(range(5),2) -&#62; [[0, 1], [2, 3], [4]]"""
    seq_len = len(seq)
    if seq_len%block_len == 0:
        num_blocks = seq_len/block_len
    [...]]]></description>
			<content:encoded><![CDATA[<p>I bet I already wrote that sometime earlier, but I found myself today writing the following function:</p>
<pre><code>def blocks(seq, block_len):
    """blocks(range(5),2) -&gt; [[0, 1], [2, 3], [4]]"""
    seq_len = len(seq)
    if seq_len%block_len == 0:
        num_blocks = seq_len // block_len
    else:
        num_blocks = 1 + (seq_len // block_len)
    result = [[] for i in range(num_blocks)]
    for idx, obj in enumerate(seq):
        result[idx // block_len].append(obj)
    return result

</code></pre>
<p>I am not satisfied with this implementation.<br />
The challenge is to write it in a shorter, more elegant manner. Functions from Python's standard modules are considered valid solutions.</p>
<p>Ready? GO!</p>
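<p>For comparison once you have had a go, here is one possible shorter version: slice the sequence in steps of block_len. This is a sketch in Python 3. Converting the input to a list first is my own assumption, made so that iterables that do not support slicing also work.</p>

```python
def blocks(seq, block_len):
    """blocks(range(5), 2) -> [[0, 1], [2, 3], [4]]"""
    seq = list(seq)
    # Each slice starts where the previous one ended; the last may be short.
    return [seq[i:i + block_len] for i in range(0, len(seq), block_len)]
```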
]]></content:encoded>
			<wfw:commentRss>http://www.algorithm.co.il/blogs/index.php/programming/python/small-python-challenge-no-1/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>
