<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Finding keywords using Python</title>
	<atom:link href="http://uswaretech.com/blog/2009/03/finding-keywords-using-python/feed/" rel="self" type="application/rss+xml" />
	<link>http://uswaretech.com/blog/2009/03/finding-keywords-using-python/</link>
	<description>Building Amazing Webapps</description>
	<lastBuildDate>Sat, 13 Mar 2010 00:26:09 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: buy_vigrxplus</title>
		<link>http://uswaretech.com/blog/2009/03/finding-keywords-using-python/comment-page-1/#comment-1931</link>
		<dc:creator>buy_vigrxplus</dc:creator>
		<pubDate>Tue, 14 Jul 2009 12:47:38 +0000</pubDate>
		<guid isPermaLink="false">http://uswaretech.com/blog/?p=214#comment-1931</guid>
		<description>&lt;p&gt;Great post! I’ll subscribe right now with my feedreader software!&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Great post! I’ll subscribe right now with my feedreader software!</p>]]></content:encoded>
	</item>
	<item>
		<title>By: thecapacity &#187; Blog Archive &#187; CouchDB Performance or Use a File</title>
		<link>http://uswaretech.com/blog/2009/03/finding-keywords-using-python/comment-page-1/#comment-610</link>
		<dc:creator>thecapacity &#187; Blog Archive &#187; CouchDB Performance or Use a File</dc:creator>
		<pubDate>Fri, 20 Mar 2009 02:49:44 +0000</pubDate>
		<guid isPermaLink="false">http://uswaretech.com/blog/?p=214#comment-610</guid>
		<description>&lt;p&gt;[...] samples and feeling like I&#8217;d gotten my legs underneath me I decided to &#8220;port&#8221; a nice little example over to couchDB. If you want to play along at home then you&#8217;ll want to check out the article [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] samples and feeling like I&#8217;d gotten my legs underneath me I decided to &#8220;port&#8221; a nice little example over to couchDB. If you want to play along at home then you&#8217;ll want to check out the article [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Joe Ganley</title>
		<link>http://uswaretech.com/blog/2009/03/finding-keywords-using-python/comment-page-1/#comment-314</link>
		<dc:creator>Joe Ganley</dc:creator>
		<pubDate>Wed, 11 Mar 2009 15:50:26 +0000</pubDate>
		<guid isPermaLink="false">http://uswaretech.com/blog/?p=214#comment-314</guid>
		<description>&lt;p&gt;You could do this in just a few lines of code by using a more Pythonic, functional style instead of this very imperative style; see, e.g., http://digitalhistoryhacks.blogspot.com/2006/08/easy-pieces-in-python-word-frequencies.html&lt;/p&gt;

&lt;p&gt;The sort_func is totally unnecessary; just do test_word_ba_list.sort(key = operator.itemgetter(1))&lt;/p&gt;

&lt;p&gt;Also, you probably want to remove stop words (a, and, the, etc.).&lt;/p&gt;

&lt;p&gt;Finally, raw term frequency isn&#039;t the best measure to identify the most important keywords; instead, for example, you might use the term&#039;s frequency in the document divided by its frequency in English overall (or across a larger corpus of documents).&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>You could do this in just a few lines of code by using a more Pythonic, functional style instead of this very imperative style; see, e.g., <a href="http://digitalhistoryhacks.blogspot.com/2006/08/easy-pieces-in-python-word-frequencies.html" rel="nofollow">http://digitalhistoryhacks.blogspot.com/2006/08/easy-pieces-in-python-word-frequencies.html</a></p>

<p>The sort_func is totally unnecessary; just do test_word_ba_list.sort(key = operator.itemgetter(1))</p>
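
<p>A minimal sketch of that sort, assuming test_word_ba_list holds (word, count) tuples as in the post. The sample data and the helper name sort_by_count are made up for illustration.</p>

```python
import operator

# Hypothetical sample data: (word, count) pairs, assumed to match
# the shape of test_word_ba_list in the original post.
word_counts = [("python", 7), ("keyword", 3), ("the", 12), ("parse", 1)]


def sort_by_count(pairs, descending=False):
    """Return a new list of (word, count) pairs ordered by count."""
    # itemgetter(1) picks the second element of each tuple (the count),
    # so no hand-written comparison function is needed.
    return sorted(pairs, key=operator.itemgetter(1), reverse=descending)


ascending = sort_by_count(word_counts)
most_common_first = sort_by_count(word_counts, descending=True)
```

<p>Calling list.sort(key=operator.itemgetter(1)) on the list itself does the same thing in place; sorted() is used here only so the original list stays untouched.</p>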

<p>Also, you probably want to remove stop words (a, and, the, etc.).</p>
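
<p>Something along these lines would drop the stop words before counting. This is only a sketch: the STOP_WORDS set is a tiny hand-picked sample rather than a real stop-word list, and keyword_counts is a hypothetical helper, not code from the post.</p>

```python
import re
from collections import Counter

# A deliberately small, hand-picked sample; a real list would be much longer.
STOP_WORDS = frozenset(["a", "an", "and", "the", "of", "to", "in", "is", "it"])


def keyword_counts(text, stop_words=STOP_WORDS):
    """Count the words in text, skipping anything listed in stop_words."""
    # Lowercase first so "The" and "the" are treated as the same word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


counts = keyword_counts("The cat and the hat: a cat in a hat is a cat.")
```

<p>With the stop words gone, counts.most_common() gives the remaining words ordered by frequency.</p>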

<p>Finally, raw term frequency isn&#8217;t the best measure to identify the most important keywords; instead, for example, you might use the term&#8217;s frequency in the document divided by its frequency in English overall (or across a larger corpus of documents).</p>]]></content:encoded>
	</item>
</channel>
</rss>
