nasauber.de

b8 0.7 out now

My oldest still-maintained project, b8, the statistical PHP spam filter, got an overall code refactoring and modernization. After all these years, this really was necessary!

Here's what has changed, as described under "Update from prior versions" in the readme:

Overall code rework

The code has been modernized a lot since the last release. Most notably, namespaces have been added. So, you have to instantiate b8 e. g. like this now:

$b8 = new b8\b8(...);

To use the constants, please also add the namespace, e. g. b8\b8::HAM.

Due to the namespace introduction, the default degenerator and lexer can't be called default anymore (default is a reserved word in PHP and can't be used as a class name). The name is now standard (e. g. b8\lexer\standard).

Storage backend approach change

The storage backends now leave the connection to a database to the user (where it belongs). The Berkeley DB (DBA) storage backend remains the reference one. The other remaining one shows how to store b8's wordlist in a MySQL table, more as an example of how to implement a proper storage backend. The base storage class now declares all required functions as abstract methods, so that everybody can easily implement the backend they need. Also, some function names have been changed to more meaningful ones.

The DBA backend now simply expects a working DBA resource as its only parameter. So if you use it, you would do e. g.:

$db = dba_open('wordlist.db', 'w', 'db4');
$config_dba = [ 'resource' => $db ];

and pass this to b8.
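
Since opening the database is now your job, it may be worth checking that the handle is actually usable before handing it over. Here is a minimal sketch, assuming PHP's dba extension with the db4 handler; open_wordlist() is a hypothetical helper of mine, not something b8 ships:

```php
<?php
// Hypothetical helper (not part of b8): open the wordlist or fail loudly.
function open_wordlist(string $path, string $handler = 'db4')
{
    if (!function_exists('dba_open')) {
        throw new RuntimeException('The PHP dba extension is not loaded.');
    }
    if (!in_array($handler, dba_handlers(), true)) {
        throw new RuntimeException("DBA handler '$handler' is not available.");
    }
    // Mode 'w' opens an existing database for reading and writing.
    // dba_open() returns false on failure, so check for that explicitly.
    $db = @dba_open($path, 'w', $handler);
    if ($db === false) {
        throw new RuntimeException("Could not open '$path' for writing.");
    }
    return $db;
}

// Usage, matching the snippet above:
// $config_dba = [ 'resource' => open_wordlist('wordlist.db') ];
```

This way, a missing extension or an unreadable file shows up as an exception right where the database is opened.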

The (example) MySQL backend takes a mysqli object and a table name as config keys. Simply look at the backends themselves to see the changes.

If you implemented your own backend, you will have to update it. But this should be quite straightforward.

Please note the newly added start_transaction() function. Actually, with MySQL's MyISAM engine, which was the default back then, transactions didn't even exist (man, this project is actually quite old ;-)!
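
To give an idea of what a backend with transaction support could look like, here is a toy sketch that keeps the wordlist in a plain PHP array. Everything in it is an assumption for illustration: example_storage_base only stands in for b8's real base storage class, and the method names (apart from start_transaction()), the signatures and the ham/spam count layout are made up. Please check the shipped backends for the real interface.

```php
<?php
// Hypothetical stand-in for b8's base storage class. The real abstract
// methods and their signatures may differ.
abstract class example_storage_base
{
    abstract public function start_transaction(): void;
    abstract public function finish_transaction(): void;
    abstract public function fetch_token(string $token): ?array;
    abstract public function store_token(string $token, array $count): void;
}

// Toy backend: the "database" is just an array in memory.
class example_storage_array extends example_storage_base
{
    private array $wordlist = [];
    private ?array $pending = null;

    public function start_transaction(): void
    {
        // Work on a copy until the transaction is finished.
        $this->pending = $this->wordlist;
    }

    public function finish_transaction(): void
    {
        // Make all changes visible at once.
        if ($this->pending !== null) {
            $this->wordlist = $this->pending;
            $this->pending = null;
        }
    }

    public function fetch_token(string $token): ?array
    {
        // Only finished (committed) data is returned.
        return $this->wordlist[$token] ?? null;
    }

    public function store_token(string $token, array $count): void
    {
        if ($this->pending !== null) {
            $this->pending[$token] = $count;
        } else {
            $this->wordlist[$token] = $count;
        }
    }
}
```

A real backend would map the two transaction functions to whatever the database offers, and could leave them empty where it offers nothing.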

Additionally, the PostgreSQL backend and the original MySQL backend (using the long-deprecated mysql functions, not the mysqli ones) have been removed.

New default configuration

The default configuration of the lexer and the degenerator has also been changed.

The degenerator now uses multibyte operations by default. This needs PHP's mbstring module. If you don't have it, set multibyte to false in the config array.
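
If you don't know in advance whether mbstring will be there, the setting could be derived at runtime. A small sketch; the variable name $config_degenerator is just my choice here, only the multibyte key comes from the readme:

```php
<?php
// Enable multibyte operations only when the mbstring extension is loaded.
$config_degenerator = [
    'multibyte' => extension_loaded('mbstring'),
];
```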

Speaking of the lexer, the legacy HTML extractor has been removed, along with its old_get_html config option.

Please update your configuration arrays!

Have a lot of fun with b8 :-)