With b8 0.7, the code base has been refactored fundamentally. If you update from a previous version, be sure to read Update from prior versions in the documentation!

b8 is a statistical ("Bayesian" [1]) spam filter implemented in PHP. It is intended to keep your weblog or guestbook spam-free. The filter can be used anywhere in your PHP code and tells you whether a text is spam or not, using statistical text analysis. You give b8 a text and it returns a value between 0 and 1: a value near 0 means the text is ham, a value near 1 means it is spam. See How does it work? for details.
In principle, it is a program like Bogofilter or SpamBayes, but it is not intended to classify emails. Therefore, the way b8 works differs slightly from email spam filters. See What's different? if you're interested in the details.

To be able to distinguish spam and ham (non-spam), b8 first has to learn some spam and some ham texts. If it makes mistakes when classifying unknown texts or the result is not distinct enough, b8 can be told what the text actually is, getting better with each learned text.
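
To make the learn-then-classify idea concrete, here is a minimal sketch in Python. It is not b8's code and does not mirror its PHP API: the class name ToyFilter, the tokenizer, and the parameters strength and prior are assumptions made up for illustration. The scoring loosely follows the token weighting and combining scheme described in Gary Robinson's articles; b8's real tokenizer, storage and settings differ.

```python
import math
import re
from collections import Counter


class ToyFilter:
    """Toy statistical text filter (hypothetical; not b8's implementation)."""

    def __init__(self, strength=1.0, prior=0.5):
        self.strength = strength  # weight of the prior (Robinson's "s")
        self.prior = prior        # assumed score of unknown tokens (Robinson's "x")
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.spam_texts = 0
        self.ham_texts = 0

    @staticmethod
    def tokens(text):
        # Assumed tokenizer: distinct lowercase words of a text.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, is_spam):
        # Count, per token, in how many spam or ham texts it appeared.
        if is_spam:
            self.spam_counts.update(self.tokens(text))
            self.spam_texts += 1
        else:
            self.ham_counts.update(self.tokens(text))
            self.ham_texts += 1

    def token_score(self, token):
        n = self.spam_counts[token] + self.ham_counts[token]
        if n == 0:
            return self.prior
        spam_freq = self.spam_counts[token] / max(self.spam_texts, 1)
        ham_freq = self.ham_counts[token] / max(self.ham_texts, 1)
        p = spam_freq / (spam_freq + ham_freq)
        # Pull rarely seen tokens towards the prior.
        return (self.strength * self.prior + n * p) / (self.strength + n)

    def classify(self, text):
        scores = [self.token_score(t) for t in self.tokens(text)]
        if not scores:
            return self.prior
        n = len(scores)
        # Geometric means, computed via logarithms.
        spamminess = 1 - math.exp(sum(math.log(1 - f) for f in scores) / n)
        hamminess = 1 - math.exp(sum(math.log(f) for f in scores) / n)
        s = (spamminess - hamminess) / (spamminess + hamminess)
        return (1 + s) / 2  # scaled to 0 (ham) .. 1 (spam)


toy = ToyFilter()
toy.learn("buy cheap pills now", is_spam=True)
toy.learn("buy cheap watches", is_spam=True)
toy.learn("thanks for the great post", is_spam=False)
toy.learn("great article, thanks", is_spam=False)
print(toy.classify("buy cheap pills"))  # above 0.5: leans spam
print(toy.classify("great post"))       # below 0.5: leans ham
```

A text made up only of unknown tokens scores exactly the prior (0.5 here), which corresponds to a result that is "not distinct enough": learning that text as spam or ham moves later ratings of similar texts in the right direction.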

Big thanks go to Gary Robinson, as his articles A Statistical Approach to the Spam Problem and Spam Detection describe the foundation for the math and algorithms used in b8.

[1] I'm not a mathematician, but as far as I can grasp it, the math used in b8 does not have much to do with Bayes' theorem itself. So I call it a statistical spam filter, not a Bayesian spam filter.

If you're interested in the performance of b8 and a discussion about the best settings for the filter, see the Statistical Discussion.

The code is managed using Git and can be found on GitLab at


Current release: b8-0.7.tar.xz
40.1 KB, released: 2020-03-18

Signature file: b8-0.7.tar.xz.asc

b8-0.6.2.tar.xz | 43.5 KB | 2019-02-08 | Signature file | Checksums
