Wirebrush Architecture
The Wirebrush design is based on a highly extensible plugin architecture. A Wirebrush core plugin coordinates all the parsers, filtering functions and listeners provided by Wirebrush plugins. To develop your own plugins, please refer to our WB4Spam developers guide.
Rule format
A rule in Wirebrush for Spam is as simple as in SpamAssassin:
<parser_type> <rulename> <ruledefinition>
score <rulename> <rulescore>
[describe <rulename> <ruledescription>]
For instance, we can define a rule as:
body CONTAINS_VIAGRA eval('viagra')
score CONTAINS_VIAGRA 20
describe CONTAINS_VIAGRA The body of the e-mail contains the word 'viagra'
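To make the rule layout concrete, the following Python sketch splits a rule into its fields. It is only an illustration and not Wirebrush's own rule loader: the names RULE_RE, Rule and parse_rule are hypothetical, and the sketch assumes that fields are separated by any whitespace (spaces or line breaks) and that scores are plain decimal numbers.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for the rule layout described above.
# Assumption: fields are separated by whitespace (spaces or line breaks).
RULE_RE = re.compile(
    r"(?P<parser>\w+)\s+(?P<name>\w+)\s+(?P<definition>.+?)\s+"
    r"score\s+(?P=name)\s+(?P<score>-?\d+(?:\.\d+)?)"
    r"(?:\s+describe\s+(?P=name)\s+(?P<description>.+))?",
    re.DOTALL,
)


@dataclass
class Rule:
    parser: str
    name: str
    definition: str
    score: float
    description: Optional[str] = None  # the describe part is optional


def parse_rule(text: str) -> Rule:
    """Split one rule into parser type, name, definition, score and description."""
    match = RULE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid rule: {text!r}")
    return Rule(
        parser=match["parser"],
        name=match["name"],
        definition=match["definition"],
        score=float(match["score"]),
        description=match["description"],
    )


rule = parse_rule(
    "body CONTAINS_VIAGRA eval('viagra') "
    "score CONTAINS_VIAGRA 20 "
    "describe CONTAINS_VIAGRA The body of the e-mail contains the word 'viagra'"
)
print(rule.parser, rule.name, rule.definition, rule.score)
```

The pattern requires the same rule name after score and describe, so a rule whose score line names a different rule is rejected.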
Differences between Wirebrush rules and SpamAssassin rules
Wirebrush rules differ from SpamAssassin rules in only a few respects:
- Each rule has a single score (SpamAssassin supports four scores per rule to cover different plugin activation schemes).
- Every Wirebrush rule includes a function call. Rules based on regular expressions are also function calls in Wirebrush (eval('<regex>'), for instance).
- Wirebrush supports both PCRE and POSIX regular expressions through the functions eval, eval_header, pcre_eval and pcre_eval_header.
- We are evaluating the removal of the parser-name field from rules. If we implement this, each function implementation will specify the parser it requires to execute, and there will be only two kinds of rules: common rules and meta rules.
Plugins and functions
Wirebrush includes three types of plugins and functions:
- Parser plugins and parsers: Parser plugins are sets of parsers. Parsers extract from an RFC 2822 message the data that filtering functions need. For instance, we developed an eml_structure_parser plugin that includes the parsers header, body and full.
- Filtering plugins and functions: Filtering plugins are sets of filtering functions. Filtering functions identify spam messages using features that a parser has extracted from the original RFC 2822 content.
- Event listeners: Filtering plugins can include listeners. Listeners are callbacks used to notify filtering plugins about the outcome of filtering a message. Every time a message is classified, Wirebrush for Spam calls all registered callbacks to inform them of the decision. This feature is useful for filtering schemes such as AWL (Auto White List) or continuously updated Bayes schemes.
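The interplay of the three kinds of components can be sketched in a few lines of Python. This is a simplified model and not the actual Wirebrush core plugin: the Core class, its method names and the rule tuple layout are all hypothetical, and the sketch assumes that a message is spam when its accumulated score reaches the required score.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical rule tuple: (parser name, function name, function arguments, score)
RuleSpec = Tuple[str, str, Sequence[str], float]


class Core:
    """Hypothetical stand-in for the core plugin that coordinates everything."""

    def __init__(self) -> None:
        self.parsers: Dict[str, Callable[[str], str]] = {}
        self.functions: Dict[str, Callable[..., bool]] = {}
        self.listeners: List[Callable[[str, bool], None]] = []

    def register_parser(self, name: str, parser: Callable[[str], str]) -> None:
        self.parsers[name] = parser

    def register_function(self, name: str, function: Callable[..., bool]) -> None:
        self.functions[name] = function

    def register_listener(self, listener: Callable[[str, bool], None]) -> None:
        self.listeners.append(listener)

    def classify(self, message: str, rules: Sequence[RuleSpec],
                 required_score: float) -> bool:
        total = 0.0
        for parser_name, function_name, args, score in rules:
            # The parser extracts the data; the filtering function inspects it.
            data = self.parsers[parser_name](message)
            if self.functions[function_name](data, *args):
                total += score
        is_spam = total >= required_score  # assumption: threshold is inclusive
        # Every listener is told about the decision (useful for AWL or Bayes updates).
        for listener in self.listeners:
            listener(message, is_spam)
        return is_spam


core = Core()
core.register_parser("body", lambda msg: msg.split("\n\n", 1)[-1])
core.register_function("contains", lambda data, word: word in data.lower())

decisions: List[bool] = []
core.register_listener(lambda msg, is_spam: decisions.append(is_spam))

message = "Subject: offer\n\nCheap Viagra here"
rules = [("body", "contains", ("viagra",), 20.0)]
result = core.classify(message, rules, required_score=5.0)
print(result, decisions)
```

Keeping parsers, functions and listeners in separate registries means a new plugin only has to register its components; the classification loop itself does not change.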
Wirebrush includes the following parser plugins:
- eml_structure_parser plugin: includes the parsers body, header and full, which provide the body of the message, the header and the full message respectively. We are considering replacing the parsers in this plugin with a single rfc2822 parser to reduce the computational cost of parsing. The parsers currently included (body, header and full) are really one parsing scheme for the whole e-mail content, executed three times to extract the body, the header and the full message.
- url_parser plugin: includes the url parser, which extracts all URLs from the entire message.
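The cost concern mentioned for eml_structure_parser can be illustrated with a single-pass parser that produces the header, body and full views at once, plus a simple URL extractor. This is a sketch under stated assumptions, not the Wirebrush implementation: parse_once, extract_urls and URL_RE are hypothetical names, the header/body split simply uses the first empty line as RFC 2822 prescribes, and the URL pattern only recognises http and https links.

```python
import re
from typing import Dict, List

# Assumption: only http(s) URLs are of interest; real URL detection is broader.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def parse_once(raw: str) -> Dict[str, str]:
    """Split a raw message a single time and return all three views."""
    normalized = raw.replace("\r\n", "\n")
    # RFC 2822: the header section ends at the first empty line.
    header, _, body = normalized.partition("\n\n")
    return {"header": header, "body": body, "full": normalized}


def extract_urls(full_message: str) -> List[str]:
    """Return every URL found anywhere in the message, in order of appearance."""
    return URL_RE.findall(full_message)


raw = "From: a@example.com\r\nSubject: Hi\r\n\r\nVisit http://example.com/offer now\r\n"
views = parse_once(raw)
urls = extract_urls(views["full"])
print(views["header"])
print(urls)
```

Because all three views come from one pass, rules that use different parsers on the same message could share the result instead of parsing it again.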
The following filtering plugins and functions are currently included in Wirebrush for SPAM:
- regex: includes the functions eval(<regex>), available for the body parser, and eval_header(<header>,<regex>), available for the header parser. Please note that string literals must be enclosed in quotes.
- pcreregex: includes the functions pcre_eval(<regex>), available for the body parser, and pcre_eval_header(<header>,<regex>), available for the header parser. Please note that string literals must be enclosed in quotes.
- bayes: includes the function bayes_eval(<min>, <max>), which checks whether the probability of the message being spam falls within the specified interval. This function should only be used with the body parser.
- spf: includes the functions spf_fail([received_header_number]), spf_pass([received_header_number]), spf_softfail([received_header_number]), spf_none([received_header_number]) and spf_neutral([received_header_number]), each of which returns true when the SPF records are in the corresponding state. The argument is optional: by default the SPF plugin checks the first Received header in the e-mail. To check a different Received header, pass its number as the parameter (counted from the beginning). These functions should only be used with the header parser.
- rxl: includes the functions rxl_check(<list_suffix>[, <number_received_header>]) and rxl_check(<list_suffix>, <octet number>, <octet value>[, <number_received_header>]).
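As an illustration of how such filtering functions behave, the sketch below models bayes_eval and the spf_* family in Python. It is not the plugin code: the Python signatures, the make_spf_check helper and the lookup callable are hypothetical, the interval bounds are assumed to be inclusive, and Received headers are assumed to be numbered from 1.

```python
from typing import Callable, Optional, Sequence


def bayes_eval(spam_probability: float, minimum: float, maximum: float) -> bool:
    """True when the spam probability lies in [minimum, maximum] (bounds assumed inclusive)."""
    return minimum <= spam_probability <= maximum


def make_spf_check(expected: str,
                   lookup: Callable[[str], Optional[str]]) -> Callable[..., bool]:
    """Build one spf_* style function for a given SPF state.

    `lookup` is a hypothetical callable that maps a Received header
    to its SPF state ("pass", "fail", "softfail", "none" or "neutral").
    """
    def check(received_headers: Sequence[str], received_header_number: int = 1) -> bool:
        # Default is the first Received header; numbering assumed to start at 1.
        header = received_headers[received_header_number - 1]
        return lookup(header) == expected
    return check


# Canned SPF states standing in for a real SPF lookup.
states = {"Received: from a": "pass", "Received: from b": "fail"}
spf_pass = make_spf_check("pass", states.get)
spf_fail = make_spf_check("fail", states.get)

headers = ["Received: from a", "Received: from b"]
print(spf_pass(headers), spf_fail(headers), spf_fail(headers, 2))
print(bayes_eval(0.97, 0.9, 1.0))
```

Generating the five SPF functions from one factory keeps their behaviour identical apart from the state they test.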
Finally, remember that you can develop your own plugins for parsing and filtering. To do so, please refer to our Wirebrush4SPAM developers guide.
Defining a filter
To define a filter, place your rules in *.cf files in the filter directory. A filter must also define the score required to classify a message as spam (required_score). Smart filter evaluation (SFE) can also be activated in order to skip the execution of some rules.
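The text above does not say how SFE chooses which rules to skip. The Python sketch below shows one possible early-exit strategy, assumed purely for illustration: evaluation stops as soon as the rules still pending can no longer change the verdict. The evaluate function and the (score, predicate) rule layout are hypothetical, and a message is assumed to be spam when its total score reaches required_score.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical rule layout: (score, predicate over the message text)
ScoredRule = Tuple[float, Callable[[str], bool]]


def evaluate(rules: Sequence[ScoredRule], message: str,
             required_score: float, sfe: bool = True) -> Tuple[bool, int]:
    """Return (is_spam, number_of_rules_executed)."""
    pending_positive = sum(score for score, _ in rules if score > 0)
    pending_negative = sum(score for score, _ in rules if score < 0)
    total = 0.0
    executed = 0
    for score, predicate in rules:
        if sfe:
            # Even if every pending negative rule fired, the message stays spam.
            if total + pending_negative >= required_score:
                break
            # Even if every pending positive rule fired, the threshold is out of reach.
            if total + pending_positive < required_score:
                break
        executed += 1
        if score > 0:
            pending_positive -= score
        else:
            pending_negative -= score
        if predicate(message):
            total += score
    return total >= required_score, executed


rules = [
    (20.0, lambda text: "viagra" in text),
    (3.0, lambda text: "free" in text),
    (-2.0, lambda text: "unsubscribe" in text),
]

with_sfe = evaluate(rules, "cheap viagra", required_score=5.0)
without_sfe = evaluate(rules, "cheap viagra", required_score=5.0, sfe=False)
print(with_sfe, without_sfe)
```

In this sketch the verdict with SFE always matches the verdict of a full evaluation; only the number of executed rules differs.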