
b8

With b8 0.6, the database format has changed. If you update from a previous version, be sure to read "Update from prior versions" in the documentation!

From the readme: What is b8?

b8 is a statistical ("Bayesian" [1]) spam filter implemented in PHP 5. It is intended to keep your weblog or guestbook spam-free. The filter can be used anywhere in your PHP code and tells you whether a text is spam or not, using statistical text analysis. You give b8 a text and it returns a value between 0 and 1: a value near 0 means the text is probably ham, a value near 1 means it is probably spam. See "How does it work?" for details.
In principle, it's a program like Bogofilter or SpamBayes, but it is not intended to classify emails. Therefore, the way b8 works is slightly different from email spam filters. See "What's different?" if you're interested in the details.

To be able to distinguish spam and ham (non-spam), b8 first has to learn some spam and some ham texts. If it makes mistakes when classifying unknown texts or the result is not distinct enough, b8 can be told what the text actually is, getting better with each learned text.
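
To make the learn-then-classify idea concrete, here is a minimal sketch in Python. It is not b8's code and not its API: the class `ToyFilter`, its method names, the tokenizer and the default values for `strength` and `assumed` are all assumptions made up for this illustration. The math loosely follows the per-token rating and the combining step described in Gary Robinson's articles, as far as I understand them.

```python
import math
import re
from collections import Counter


class ToyFilter:
    """Illustrative statistical text filter; NOT b8's actual implementation."""

    def __init__(self, strength=1.0, assumed=0.5):
        self.strength = strength  # weight of the prior (assumed default)
        self.assumed = assumed    # rating assumed for a token never seen before
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.texts = {"ham": 0, "spam": 0}

    @staticmethod
    def tokens(text):
        # Hypothetical tokenizer: lowercase alphanumeric words, each counted once.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def learn(self, text, category):
        """Remember the tokens of a text known to be 'ham' or 'spam'."""
        self.counts[category].update(self.tokens(text))
        self.texts[category] += 1

    def token_rating(self, token):
        """Rate one token between 0 (ham) and 1 (spam)."""
        ham = self.counts["ham"][token]
        spam = self.counts["spam"][token]
        seen = ham + spam
        if seen == 0:
            return self.assumed
        # Relative frequency of the token in each category.
        ham_rate = ham / max(self.texts["ham"], 1)
        spam_rate = spam / max(self.texts["spam"], 1)
        raw = spam_rate / (ham_rate + spam_rate)
        # Pull rarely seen tokens towards the assumed value.
        return (self.strength * self.assumed + seen * raw) / (self.strength + seen)

    def classify(self, text):
        """Return a value between 0 (ham) and 1 (spam)."""
        ratings = [self.token_rating(t) for t in self.tokens(text)]
        if not ratings:
            return 0.5
        n = len(ratings)
        # Geometric means, computed in log space to avoid underflow.
        spamminess = 1 - math.exp(sum(math.log(1 - r) for r in ratings) / n)
        hamminess = 1 - math.exp(sum(math.log(r) for r in ratings) / n)
        return (1 + (spamminess - hamminess) / (spamminess + hamminess)) / 2
```

After learning a few texts of each kind, `classify()` returns a value above 0.5 for a text made of words seen mostly in spam and below 0.5 for a text made of words seen mostly in ham. A text consisting only of unknown words gets 0.5, which is the "not distinct enough" case where teaching the filter the correct answer helps.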

The whole documentation can be found in the readme.

Big thanks go to Gary Robinson, as his articles "A Statistical Approach to the Spam Problem" and "Spam Detection" describe the foundation for the math and algorithms used in b8.

[1] I'm not a mathematician, but as far as I can grasp it, the math used in b8 does not have much to do with Bayes' theorem itself. So I call it a statistical spam filter, not a Bayesian spam filter.

Statistical Analysis

If you're interested in the performance of b8 and a discussion about the best settings for the filter, see the Statistical Discussion.

Getting involved

The current state of the code can be checked out from git://l3u.de/b8.git. Please always patch against git master!

Download

b8-0.6.1.tar.gz
57.3 KB, Last change: 2014-03-12

Older versions:
b8-0.6.tar.gz (53.6 KB)
b8-0.5.2.tar.gz (48.4 KB)
b8-0.5.1.tar.gz (48.4 KB)
b8-0.5-r1.tar.gz (43.8 KB)
b8-0.5.tar.gz (41.8 KB)

Older versions

The PHP 4 branch of b8

The PHP 4 compatible version of b8 can be found in branch_0.4.x (including version 0.3.3, the first version named "b8"). This branch is not maintained anymore.

bayes-php

The first b8 releases (named "bayes-php") are still available in old_releases for historical reasons.

A note for Windows users: you may not be familiar with the ".tar.gz" file format, which is widely used on UNIX systems. If you don't have an archive program that can handle it, I recommend 7-zip. Your archive program may extract only a single file with the extension ".tar" from the archive. If so, unpack that file as well to get the actual contents.
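
The two-step unpacking exists because a ".tar.gz" file has two layers: a ".tar" container that bundles the files, wrapped in gzip compression. If Python happens to be installed, the following sketch performs both steps explicitly using only the standard library. The function name `unpack_tar_gz` and its parameters are made up for this example. Only use it on archives from a source you trust.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_tar_gz(archive, destination):
    """Unpack a .tar.gz in two explicit steps and return the extracted paths."""
    archive = Path(archive)
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)

    # Step 1: undo the gzip compression; this yields a single ".tar" file.
    tar_path = destination / archive.with_suffix("").name
    with gzip.open(archive, "rb") as packed, open(tar_path, "wb") as plain:
        shutil.copyfileobj(packed, plain)

    # Step 2: unpack the ".tar" container to get the actual files.
    with tarfile.open(tar_path, "r:") as tar:
        names = tar.getnames()
        tar.extractall(destination)

    # The intermediate ".tar" file is no longer needed.
    tar_path.unlink()
    return [destination / name for name in names]
```

Calling `unpack_tar_gz("b8-0.6.1.tar.gz", "b8")` would place the archive's contents in a folder named `b8`. In everyday use, `tarfile.open()` can read a ".tar.gz" file directly in one step; the two steps are spelled out here only to show what is going on.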

nasauber.de © 2016 by Tobias Leupold