<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Small Python Challenge No. 4 &#8211; Counting Sets</title>
	<atom:link href="http://www.algorithm.co.il/blogs/computer-science/small-python-challenge-no-4-counting-sets/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/</link>
	<description>Algorithms, for the heck of it</description>
	<lastBuildDate>Tue, 21 Jun 2011 21:07:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
	<item>
		<title>By: admin</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-217</link>
		<dc:creator>admin</dc:creator>
		<pubDate>Sat, 17 Jul 2010 15:10:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-217</guid>
		<description>Anonymous:
1. Each computation of D[i] may be considered equivalent to a comparison.
2. Your solution does O(N) work for each given a.</description>
		<content:encoded><![CDATA[<p>Anonymous:<br />
1. Each computation of D[i] may be considered equivalent to a comparison.<br />
2. Your solution does O(N) work for each given a.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous...</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-216</link>
		<dc:creator>Anonymous...</dc:creator>
		<pubDate>Tue, 15 Jun 2010 15:05:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-216</guid>
		<description>Yeah I think I must be misunderstanding something. Are set comparisons the only thing that costs time? If so, here&#039;s my solution...

1. compute D[i] = a - S[i] for each S[i]
2. count = 0
3. for each D[i], if len(D[i]) == 0, count = count + 1
4. print count
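
A minimal Python sketch of steps 1 to 4, assuming S is any iterable of sets and a is any iterable of hashable items; the name count_supersets is made up for illustration. It still computes one difference per set in S, so the work grows with the number of sets for every query.

```python
def count_supersets(sets, a):
    """Count the sets in `sets` that contain every element of `a`."""
    a = frozenset(a)
    count = 0
    for s in sets:
        # D = a - s is empty exactly when a is a subset of s.
        if len(a.difference(s)) == 0:
            count += 1
    return count


# Illustrative data only.
sets = [set(range(10)), set(range(10, 20)), set(range(20))]
print(count_supersets(sets, range(5)))  # prints 2
```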

Yeah that doesn&#039;t really do any set comparisons. Probably not what you were going for. Posted for the lulz.</description>
		<content:encoded><![CDATA[<p>Yeah I think I must be misunderstanding something. Are set comparisons the only thing that costs time? If so, here&#8217;s my solution&#8230;</p>
<p>1. compute D[i] = a &#8211; S[i] for each S[i]<br />
2. count = 0<br />
3. for each D[i], if len(D[i]) == 0, count = count + 1<br />
4. print count</p>
<p>Yeah that doesn&#8217;t really do any set comparisons. Probably not what you were going for. Posted for the lulz.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: eric</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-215</link>
		<dc:creator>eric</dc:creator>
		<pubDate>Mon, 31 Aug 2009 16:30:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-215</guid>
		<description>I&#039;m looking forward to seeing your solution!</description>
		<content:encoded><![CDATA[<p>I&#8217;m looking forward to seeing your solution!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rani</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-214</link>
		<dc:creator>Rani</dc:creator>
		<pubDate>Mon, 31 Aug 2009 09:26:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-214</guid>
		<description>True, but the notion of preprocessing the set S to get a chain data structure is one of the basic, recurring ideas, and it is indeed a powerful &quot;trick&quot;.

BTW, the paper I linked to has a practical algorithm that computes the height of *all* sets in S if this is what the original application required.</description>
		<content:encoded><![CDATA[<p>True, but the notion of preprocessing the set S to get a chain data structure is one of the basic, recurring ideas, and it is indeed a powerful &#8220;trick&#8221;.</p>
<p>BTW, the paper I linked to has a practical algorithm that computes the height of *all* sets in S if this is what the original application required.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-213</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Mon, 31 Aug 2009 07:20:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-213</guid>
		<description>Rani:
That reminds me of a very nice story - how to differentiate &quot;mathematical programmers&quot; from &quot;practical programmers&quot;. A few years back I had to sort a set of sets, and articles on ordering posets didn&#039;t seem to help much.
It turns out that when you ask &quot;mathematical programmers&quot; about this problem, they will also look at posets.

The amusing part comes when you figure out a &quot;cheating solution&quot; - for each set s compute (len(s), hash(s)), and sort according to that. If you do, it is guaranteed that if s1 is a proper subset of s2, then s1 comes before s2 in the ordering. The hash is there to keep the ordering well-defined for sets with the same cardinality.
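
As a rough sketch of that trick (the function name is hypothetical, and the input is assumed to be convertible to frozenset, since a plain Python set is not hashable):

```python
def subset_friendly_order(sets):
    """Sort sets so that every proper subset precedes its supersets."""
    frozen = [frozenset(s) for s in sets]
    # Cardinality is the primary key: a proper subset is always smaller.
    # The hash only decides the order among sets of the same size.
    return sorted(frozen, key=lambda s: (len(s), hash(s)))


# Illustrative data only.
ordered = subset_friendly_order([{1, 2, 3}, {1}, {1, 2}, set(), {4, 5}])
print([len(s) for s in ordered])  # prints [0, 1, 2, 2, 3]
```

Sets of equal size end up in hash order, which is arbitrary but consistent within a single run.
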
It turns out that less mathematically inclined programmers think of this solution much faster :)</description>
		<content:encoded><![CDATA[<p>Rani:<br />
That reminds me of a very nice story &#8211; how to differentiate &#8220;mathematical programmers&#8221; from &#8220;practical programmers&#8221;. A few years back I had to sort a set of sets, and articles on ordering posets didn&#8217;t seem to help much.<br />
It turns out that when you ask &#8220;mathematical programmers&#8221; about this problem, they will also look at posets.</p>
<p>The amusing part comes when you figure out a &#8220;cheating solution&#8221; &#8211; for each set s compute (len(s), hash(s)), and sort according to that. If you do, it is guaranteed that if s1 is a proper subset of s2, then s1 comes before s2 in the ordering. The hash is there to keep the ordering well-defined for sets with the same cardinality.<br />
It turns out that less mathematically inclined programmers think of this solution much faster :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Rani</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-212</link>
		<dc:creator>Rani</dc:creator>
		<pubDate>Sun, 30 Aug 2009 22:07:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-212</guid>
		<description>Here&#039;s an academic viewpoint of closely related problems.
http://arxiv.org/abs/0707.1532

Main theme: the complexity heavily depends on the width of the poset, that is, the size of the largest antichain. An antichain is a subset A of S such that no set in A is a subset of another set in A.</description>
		<content:encoded><![CDATA[<p>Here&#8217;s an academic viewpoint of closely related problems.<br />
<a href="http://arxiv.org/abs/0707.1532" rel="nofollow">http://arxiv.org/abs/0707.1532</a></p>
<p>Main theme: the complexity heavily depends on the width of the poset, that is, the size of the largest antichain. An antichain is a subset A of S such that no set in A is a subset of another set in A.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-211</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Sun, 30 Aug 2009 20:19:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-211</guid>
		<description>By the way, to all of you:
Thank you for your solutions!</description>
		<content:encoded><![CDATA[<p>By the way, to all of you:<br />
Thank you for your solutions!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-210</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Sun, 30 Aug 2009 20:04:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-210</guid>
		<description>eric &amp; Shenberg:

Both of your solutions are equivalent, and they both work best when the set to test is small. This means that they will probably work best when the set to test is common. When it is rare, they will take more time.

The solution I originally thought of has it the other way around, although it has its drawbacks. I need to do a more thorough complexity analysis of both algorithms to be able to compare them correctly.

Another note regarding the problem where I needed this - in that problem I had to score a set according to its rarity. Since very common sets score low, a possible improvement for the real-life problem might be to return TOO_MANY (equivalent to zero score) for sets that appear more times than some threshold.
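
A hedged sketch of that thresholding idea, assuming a plain linear scan over S; count_supersets_capped and its threshold parameter are illustrative names, and TOO_MANY is modelled here as a sentinel object:

```python
TOO_MANY = object()  # sentinel standing in for a zero score


def count_supersets_capped(sets, a, threshold):
    """Count sets containing `a`, giving up once the count exceeds `threshold`."""
    a = frozenset(a)
    count = 0
    for s in sets:
        if a.issubset(s):
            count += 1
            if count > threshold:
                # Stop early: the exact count no longer matters.
                return TOO_MANY
    return count


# Illustrative data only.
sets = [set(range(10)), set(range(10, 20)), set(range(20))]
print(count_supersets_capped(sets, range(5), 5))  # prints 2
```
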
Regardless, I&#039;m still very curious to see more solutions to the full problem as stated in the post.</description>
		<content:encoded><![CDATA[<p>eric &#038; Shenberg:</p>
<p>Both of your solutions are equivalent, and they both work best when the set to test is small. This means that they will probably work best when the set to test is common. When it is rare, they will take more time.</p>
<p>The solution I originally thought of has it the other way around, although it has its drawbacks. I need to do a more thorough complexity analysis of both algorithms to be able to compare them correctly.</p>
<p>Another note regarding the problem where I needed this &#8211; in that problem I had to score a set according to its rarity. Since very common sets score low, a possible improvement for the real-life problem might be to return TOO_MANY (equivalent to zero score) for sets that appear more times than some threshold.<br />
Regardless, I&#8217;m still very curious to see more solutions to the full problem as stated in the post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Roee Shenberg</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-209</link>
		<dc:creator>Roee Shenberg</dc:creator>
		<pubDate>Sun, 30 Aug 2009 08:36:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-209</guid>
		<description>Well, I didn&#039;t read anyone else&#039;s comments except to make sure I&#039;m the first to write some code, but the solution I chose is this:

Create a dictionary that maps each integer appearing in any of the sets in S to the sets it belongs to.
Create a copy of S, let&#039;s call it B.
Then, for every integer in the set a, intersect B with the set of sets to which that integer belongs (from the dictionary).
Return the length of the resulting B.

I hope my code makes this clearer. This is almost entirely pasted out of my interpreter, so sorry about the tab issues.

[python]
def build_setdict(set_seq):
    # Map each value to the set of (hashable) sets that contain it.
    set_dict = {}
    for some_set in set_seq:
        for value in some_set:
            set_dict.setdefault(value, set()).add(some_set)
    return set_dict

def get_containing_sets(sets, set_to_test):
    # Freeze the sets so they can be stored inside other sets;
    # note that identical sets collapse into one here.
    sets = frozenset(map(frozenset, sets))
    # integer-&gt;containing-sets mapping
    setdict = build_setdict(sets)
    remaining_sets = sets
    try:
        for i in set_to_test:
            remaining_sets = remaining_sets.intersection(setdict[i])
    except KeyError:
        # Integer in no set
        return frozenset()
    return remaining_sets

#example test case
&gt;&gt;&gt; sets = [set(range(10)), set(range(10, 20)), set(range(20))]
&gt;&gt;&gt; len(get_containing_sets(sets, range(5)))
2
&gt;&gt;&gt; len(get_containing_sets(sets, range(5,15)))
1
&gt;&gt;&gt; len(get_containing_sets(sets, range(5,25)))
0
&gt;&gt;&gt; len(get_containing_sets(sets, range(11,16)))
2
[/python]</description>
		<content:encoded><![CDATA[<p>Well, I didn&#8217;t read anyone else&#8217;s comments except to make sure I&#8217;m the first to write some code, but the solution I chose is this:</p>
<p>Create a dictionary that maps each integer appearing in any of the sets in S to the sets it belongs to.<br />
Create a copy of S, let&#8217;s call it B.<br />
Then, for every integer in the set a, intersect B with the set of sets to which that integer belongs (from the dictionary).<br />
Return the length of the resulting B.</p>
<p>I hope my code makes this clearer. This is almost entirely pasted out of my interpreter, so sorry about the tab issues.</p>
<p>[python]<br />
def build_setdict(set_seq):<br />
    # Map each value to the set of (hashable) sets that contain it.<br />
    set_dict = {}<br />
    for some_set in set_seq:<br />
        for value in some_set:<br />
            set_dict.setdefault(value, set()).add(some_set)<br />
    return set_dict</p>
<p>def get_containing_sets(sets, set_to_test):<br />
    # Freeze the sets so they can be stored inside other sets;<br />
    # note that identical sets collapse into one here.<br />
    sets = frozenset(map(frozenset, sets))<br />
    # integer-&gt;containing-sets mapping<br />
    setdict = build_setdict(sets)<br />
    remaining_sets = sets<br />
    try:<br />
        for i in set_to_test:<br />
            remaining_sets = remaining_sets.intersection(setdict[i])<br />
    except KeyError:<br />
        # Integer in no set<br />
        return frozenset()<br />
    return remaining_sets</p>
<p>#example test case<br />
&gt;&gt;&gt; sets = [set(range(10)), set(range(10, 20)), set(range(20))]<br />
&gt;&gt;&gt; len(get_containing_sets(sets, range(5)))<br />
2<br />
&gt;&gt;&gt; len(get_containing_sets(sets, range(5,15)))<br />
1<br />
&gt;&gt;&gt; len(get_containing_sets(sets, range(5,25)))<br />
0<br />
&gt;&gt;&gt; len(get_containing_sets(sets, range(11,16)))<br />
2<br />
[/python]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://www.algorithm.co.il/blogs/challenges/small-python-challenge-no-4-counting-sets/#comment-208</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Sat, 29 Aug 2009 18:08:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=315#comment-208</guid>
		<description>eric:
That&#039;s a very nice solution. I like how the complexity is dependent on the cardinality of a.</description>
		<content:encoded><![CDATA[<p>eric:<br />
That&#8217;s a very nice solution. I like how the complexity is dependent on the cardinality of a.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

