<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: A better way to index text data</title>
	<atom:link href="http://itknowledgeexchange.techtarget.com/sql-server/a-better-way-to-index-text-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://itknowledgeexchange.techtarget.com/sql-server/a-better-way-to-index-text-data/</link>
	<description></description>
	<pubDate>Tue, 24 Nov 2009 07:48:43 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: Mrdenny</title>
		<link>http://itknowledgeexchange.techtarget.com/sql-server/a-better-way-to-index-text-data/#comment-78</link>
		<dc:creator>Mrdenny</dc:creator>
		<pubDate>Thu, 31 Jul 2008 02:54:17 +0000</pubDate>
		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/sql-server/a-better-way-to-index-text-data/#comment-78</guid>
		<description>If there were a requirement to break the data out, that could be an excellent solution.  If your system requires a single column, for example where the email address is the username, then breaking the column apart may not be the best option.

Thanks for showing an additional method which can be used.

If anyone else has other options please post them.</description>
		<content:encoded><![CDATA[<p>If there were a requirement to break the data out, that could be an excellent solution.  If your system requires a single column, for example where the email address is the username, then breaking the column apart may not be the best option.</p>
<p>Thanks for showing an additional method which can be used.</p>
<p>If anyone else has other options please post them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: DMUNSEEN</title>
		<link>http://itknowledgeexchange.techtarget.com/sql-server/a-better-way-to-index-text-data/#comment-75</link>
		<dc:creator>DMUNSEEN</dc:creator>
		<pubDate>Wed, 30 Jul 2008 07:39:22 +0000</pubDate>
		<guid isPermaLink="false">http://itknowledgeexchange.techtarget.com/sql-server/a-better-way-to-index-text-data/#comment-75</guid>
		<description>After thinking about this for a while, I suspect that for email addresses there is probably a more complete solution (where you can also use other types of matching than just equality through a hash column) that is also scalable:
Just (partly) normalize the different parts of the email address (e.g. create one or more tables like email_prefix, email_domain or email_extension). For a large set of email addresses this would give you a considerable size reduction, since you would store each part of the email address only once and would not need to store the @ and dot characters.
Since you can now index the individual email address parts more easily, it would also make partial matches on parts of an email address possible.

My suggestion is not practical for small sets, since it would not only increase the total size of the data but also make some matching queries needlessly complex.</description>
		<content:encoded><![CDATA[<p>After thinking about this for a while, I suspect that for email addresses there is probably a more complete solution (where you can also use other types of matching than just equality through a hash column) that is also scalable:<br />
Just (partly) normalize the different parts of the email address (e.g. create one or more tables like email_prefix, email_domain or email_extension). For a large set of email addresses this would give you a considerable size reduction, since you would store each part of the email address only once and would not need to store the @ and dot characters.<br />
Since you can now index the individual email address parts more easily, it would also make partial matches on parts of an email address possible.</p>
<p>My suggestion is not practical for small sets, since it would not only increase the total size of the data but also make some matching queries needlessly complex.</p>
]]></content:encoded>
	</item>
</channel>
</rss>