<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Database Design Problem</title>
	<atom:link href="http://www.algorithm.co.il/blogs/programming/design/database-design-problem/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.algorithm.co.il/blogs/startup/database-design-problem/</link>
	<description>Algorithms, for the heck of it</description>
	<lastBuildDate>Tue, 21 Jun 2011 21:07:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
	<item>
		<title>By: Algorithm Blogs &#187; Blog Archive &#187; Actual Data Always Needs To Be Explicit</title>
		<link>http://www.algorithm.co.il/blogs/startup/database-design-problem/#comment-166</link>
		<dc:creator>Algorithm Blogs &#187; Blog Archive &#187; Actual Data Always Needs To Be Explicit</dc:creator>
		<pubDate>Fri, 10 Apr 2009 21:01:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=178#comment-166</guid>
		<description>[...] encountered this issue when I implemented my multiple source db-design. A reminder: I had data collected from various sources, and then combined together to a final [...]</description>
		<content:encoded><![CDATA[<p>[...] encountered this issue when I implemented my multiple source db-design. A reminder: I had data collected from various sources, and then combined together to a final [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://www.algorithm.co.il/blogs/startup/database-design-problem/#comment-165</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Wed, 25 Feb 2009 23:47:08 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=178#comment-165</guid>
		<description>@Jay:
Actually, before I started working on the database, I considered using couchdb, and rejected it. I did so because:
1. It didn&#039;t seem &quot;ready enough&quot; for me.
2. I did not find anyone I know who already worked with it (or knew about it, for that matter. I was the one telling them about it!).
3. My hosting doesn&#039;t support it.
4. Relational databases are good enough.
Now, under other circumstances, being the first one &quot;in my neighborhood&quot; to use it could be really fun, but I thought that for my startup it isn&#039;t the right way to go. It is an unnecessary risk, with not much to gain (See point no. 4).
So I decided that for this startup I&#039;d keep to sql, and for my next fun project I would acquaint myself with couchdb.

As for your actual suggestion: I&#039;m not well versed in couchdb usage, but I&#039;m not sure I like it, since when querying for records I usually don&#039;t want the extra baggage of source records/data. I know I might be optimizing prematurely, but I think this solution is a bit too heavy. Also, it seems to me this solution would keep source data and final data in different forms, which would make them harder to process.</description>
		<content:encoded><![CDATA[<p>@Jay:<br />
Actually, before I started working on the database, I considered using couchdb, and rejected it. I did so because:<br />
1. It didn&#8217;t seem &#8220;ready enough&#8221; for me.<br />
2. I did not find anyone I know who already worked with it (or knew about it, for that matter. I was the one telling them about it!).<br />
3. My hosting doesn&#8217;t support it.<br />
4. Relational databases are good enough.<br />
Now, under other circumstances, being the first one &#8220;in my neighborhood&#8221; to use it could be really fun, but I thought that for my startup it isn&#8217;t the right way to go. It is an unnecessary risk, with not much to gain (See point no. 4).<br />
So I decided that for this startup I&#8217;d keep to sql, and for my next fun project I would acquaint myself with couchdb.</p>
<p>As for your actual suggestion: I&#8217;m not well versed in couchdb usage, but I&#8217;m not sure I like it, since when querying for records I usually don&#8217;t want the extra baggage of source records/data. I know I might be optimizing prematurely, but I think this solution is a bit too heavy. Also, it seems to me this solution would keep source data and final data in different forms, which would make them harder to process.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jay</title>
		<link>http://www.algorithm.co.il/blogs/startup/database-design-problem/#comment-164</link>
		<dc:creator>Jay</dc:creator>
		<pubDate>Wed, 25 Feb 2009 17:43:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=178#comment-164</guid>
		<description>Just something to consider, but this type of &quot;non-normal&quot; data structure is something that couchdb would be great at.
You could have a &quot;generic entity&quot; entry and then programmatically check for the existence of other document attributes and expand your behavior if they exist.</description>
		<content:encoded><![CDATA[<p>Just something to consider, but this type of &#8220;non-normal&#8221; data structure is something that couchdb would be great at.<br />
You could have a &#8220;generic entity&#8221; entry and then programmatically check for the existence of other document attributes and expand your behavior if they exist.</p>
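<p>A minimal sketch of the idea above, using plain Python dicts as stand-ins for documents rather than any real couchdb API. The attribute names (name, price, sources) and the helper describe are hypothetical and chosen only for illustration.</p>

```python
def describe(doc):
    """Build a description from whichever optional attributes are present."""
    parts = [doc["name"]]  # assumed to exist on every "generic entity"
    # Expand behavior only when the optional attribute exists on this document.
    if "price" in doc:
        parts.append(f"price={doc['price']}")
    if "sources" in doc:
        parts.append(f"sources={','.join(doc['sources'])}")
    return " ".join(parts)


# Two hypothetical documents: a bare entity and one with extra attributes.
bare = {"name": "widget"}
rich = {"name": "widget", "price": 5, "sources": ["site-a", "site-b"]}

print(describe(bare))
print(describe(rich))
```

<p>The same function handles both documents, because it checks for each extra attribute before using it.</p>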
]]></content:encoded>
	</item>
	<item>
		<title>By: lorg</title>
		<link>http://www.algorithm.co.il/blogs/startup/database-design-problem/#comment-163</link>
		<dc:creator>lorg</dc:creator>
		<pubDate>Mon, 23 Feb 2009 07:49:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=178#comment-163</guid>
		<description>@jack9:
First, thank you for your comment and proposed solution.

Regarding normalization, I just didn&#039;t describe it, to keep the actual problem and the proposed solutions clear.

Regarding your solution, since I ended up using solution 1**, I had a one-to-many mapping between items and sources, not many-to-many as you suggest. If I understand your idea correctly, you suggest keeping only the combined (final) data, and using a join table to keep track of all sources involved in creating a combined record.
There are two problems with this: (a) You don&#039;t keep the source data for later analysis. If you do, I don&#039;t see a way to differentiate between source records and final records. (b) For a final record, you can&#039;t tell which source a single field came from. You could change your solution by adding a field_name column to itemfromsource, but I don&#039;t think the resulting solution is very elegant.</description>
		<content:encoded><![CDATA[<p>@jack9:<br />
First, thank you for your comment and proposed solution.</p>
<p>Regarding normalization, I just didn&#8217;t describe it, to keep the actual problem and the proposed solutions clear.</p>
<p>Regarding your solution, since I ended up using solution 1**, I had a one-to-many mapping between items and sources, not many-to-many as you suggest. If I understand your idea correctly, you suggest keeping only the combined (final) data, and using a join table to keep track of all sources involved in creating a combined record.<br />
There are two problems with this: (a) You don&#8217;t keep the source data for later analysis. If you do, I don&#8217;t see a way to differentiate between source records and final records. (b) For a final record, you can&#8217;t tell which source a single field came from. You could change your solution by adding a field_name column to itemfromsource, but I don&#8217;t think the resulting solution is very elegant.</p>
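<p>A rough sketch of the field_name variation mentioned in (b), assuming SQLite through the sqlite3 module from the Python standard library. The column types, the price column, the sample rows and the helper field_source are all hypothetical; only the table names and field_name come from the discussion.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, price TEXT);
-- field_name is the extra column: one row per (item, field) pair.
CREATE TABLE itemfromsource (
    id         INTEGER PRIMARY KEY,
    source_id  INTEGER NOT NULL REFERENCES source(id),
    item_id    INTEGER NOT NULL REFERENCES item(id),
    field_name TEXT NOT NULL,
    UNIQUE (item_id, field_name)
);
""")


def field_source(conn, item_id, field_name):
    """Return the name of the source that supplied one field, or None."""
    row = conn.execute(
        "SELECT s.name FROM itemfromsource AS ifs "
        "JOIN source AS s ON s.id = ifs.source_id "
        "WHERE ifs.item_id = ? AND ifs.field_name = ?",
        (item_id, field_name),
    ).fetchone()
    return row[0] if row else None


# Hypothetical data: the final record takes its name and price from different sources.
conn.executemany("INSERT INTO source (id, name) VALUES (?, ?)",
                 [(1, "site-a"), (2, "site-b")])
conn.execute("INSERT INTO item (id, name, price) VALUES (1, 'widget', '5')")
conn.executemany(
    "INSERT INTO itemfromsource (source_id, item_id, field_name) VALUES (?, ?, ?)",
    [(1, 1, "name"), (2, 1, "price")],
)

print(field_source(conn, 1, "price"))
```

<p>This answers the per-field question, but it still stores only the final values, so problem (a) is untouched.</p>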
]]></content:encoded>
	</item>
	<item>
		<title>By: jack9</title>
		<link>http://www.algorithm.co.il/blogs/startup/database-design-problem/#comment-162</link>
		<dc:creator>jack9</dc:creator>
		<pubDate>Mon, 23 Feb 2009 03:33:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.algorithm.co.il/blogs/?p=178#comment-162</guid>
		<description>Why not -

source
{
id
type (website or ... can be normalized to type_id)
name
... (timestamp or whatever else is source appropriate)
}

item
{
id
name
...(whatever is unique across all sources)
}

itemfromsource
{
id
source_id
item_id
}</description>
		<content:encoded><![CDATA[<p>Why not -</p>
<p>source<br />
{<br />
id<br />
type (website or &#8230; can be normalized to type_id)<br />
name<br />
&#8230; (timestamp or whatever else is source appropriate)<br />
}</p>
<p>item<br />
{<br />
id<br />
name<br />
&#8230;(whatever is unique across all sources)<br />
}</p>
<p>itemfromsource<br />
{<br />
id<br />
source_id<br />
item_id<br />
}</p>
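<p>A minimal sketch of the three tables above, assuming SQLite through the sqlite3 module from the Python standard library. The column types, the sample rows and the helper sources_for_item are illustrative assumptions; the comment itself only names the tables and columns.</p>

```python
import sqlite3

# Hypothetical column types; only the table and column names are from the comment.
SCHEMA = """
CREATE TABLE source (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL,
    name TEXT NOT NULL
);
CREATE TABLE item (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE itemfromsource (
    id        INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES source(id),
    item_id   INTEGER NOT NULL REFERENCES item(id)
);
"""


def make_db():
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def sources_for_item(conn, item_id):
    """Return the names of all sources linked to an item, sorted by name."""
    rows = conn.execute(
        "SELECT s.name FROM source AS s "
        "JOIN itemfromsource AS ifs ON ifs.source_id = s.id "
        "WHERE ifs.item_id = ? ORDER BY s.name",
        (item_id,),
    ).fetchall()
    return [name for (name,) in rows]


conn = make_db()
conn.executemany("INSERT INTO source (id, type, name) VALUES (?, ?, ?)",
                 [(1, "website", "site-a"), (2, "website", "site-b")])
conn.execute("INSERT INTO item (id, name) VALUES (1, 'widget')")
conn.executemany("INSERT INTO itemfromsource (source_id, item_id) VALUES (?, ?)",
                 [(1, 1), (2, 1)])

print(sources_for_item(conn, 1))
```

<p>The join table makes the item-to-source relation many-to-many: one item can list several sources, and one source can feed several items.</p>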
]]></content:encoded>
	</item>
</channel>
</rss>

