Imagine a filter that contains more than one call to the check_bayes function provided by the bayes plugin (for instance, 9 rules similar to those included in the Debian or Ubuntu SpamAssassin filters). Every time check_bayes is called, the probability of the email being spam is computed, which involves several lookups in a Berkeley database. That is a lot of unneeded computation: why compute the same probability 9 times? To solve this problem, we have implemented a small configurable cache. A cache able to store a single element is enough to avoid the repeated work. Moreover, the cache is also useful when the same email is delivered to more than one address in the same domain (and is therefore classified with the same filter).
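To make the idea concrete, here is a minimal sketch of such a cache in Python. It is not our actual implementation: the names BayesCache, probability and compute are hypothetical, and keying the cache on a SHA-256 digest of the message is an assumption made only for this illustration. The compute argument stands in for the expensive database-backed calculation.

```python
import hashlib
from collections import OrderedDict


class BayesCache:
    """Tiny LRU cache for per-message Bayes probabilities (illustrative only)."""

    def __init__(self, capacity=1):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity          # configurable number of cached messages
        self._entries = OrderedDict()     # digest -> probability, oldest first
        self.misses = 0                   # how many times the real computation ran

    def probability(self, message, compute):
        """Return the spam probability for message (bytes), computing it at most
        once while the message stays in the cache."""
        key = hashlib.sha256(message).hexdigest()  # assumption: key on message digest
        if key in self._entries:
            self._entries.move_to_end(key)         # mark as most recently used
            return self._entries[key]
        self.misses += 1
        value = compute(message)                   # the expensive lookup happens here
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)      # evict the least recently used entry
        return value
```

With capacity=1, nine consecutive rules that ask for the probability of the same message trigger one computation and eight cache hits. A larger capacity would additionally keep the result around while copies of the same email addressed to other recipients are being filtered.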
Now a filter with 9 Bayes rules runs in less than 35 milliseconds. Really fast. I want to congratulate us :).
Bayes Cache