I am in the middle of developing my PHP Bayesian Naive Filter (a little more than 60% done), a learning filter against spam in comments, guestbooks, and postings in general, for Joomla/Mambo. Along the way I have read some papers found on the internet:

  • On Attacking Statistical Spam Filters, Gregory L. Wittel and S. Felix Wu, Department of Computer Science, University of California, Davis, One Shields Avenue, Davis, CA 95616 USA
  • A Naive Bayes Spam Filter, Kai Wei, CS281A Project, Fall 2003
  • But there is more...

I decided that my project would certainly fail if I rewrote or reused a Bayesian filter engine that is not accurate or does not use the latest countermeasures. Since I do not want to spend 3 years developing an effective filter (would I ever be able to do it?), I came up with the idea of implementing the component com_bayesiannaivefilter in such a way that I can abstract the core engine and reuse the work done by the best open source projects.
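
To make the idea of an abstracted core engine concrete, here is a minimal sketch, written in Python for brevity (the component itself is PHP). Everything in it is an assumption for illustration: the names FilterEngine, KeywordEngine, ENGINES and load_engine are hypothetical and are not taken from com_bayesiannaivefilter, and the toy engine is only a stand-in for a real Bayesian core.

```python
from abc import ABC, abstractmethod


class FilterEngine(ABC):
    """Hypothetical interface that every core engine plugin would implement."""

    @abstractmethod
    def train(self, message, is_spam):
        """Learn from one labelled message."""

    @abstractmethod
    def spam_probability(self, message):
        """Return a score between 0.0 (ham) and 1.0 (spam)."""


class KeywordEngine(FilterEngine):
    """Toy stand-in engine: flags words that were only ever seen in spam."""

    def __init__(self):
        self.spam_words = set()
        self.ham_words = set()

    def train(self, message, is_spam):
        words = set(message.lower().split())
        (self.spam_words if is_spam else self.ham_words).update(words)

    def spam_probability(self, message):
        words = set(message.lower().split())
        if not words:
            return 0.0
        # Share of the message's distinct words seen in spam but never in ham
        suspicious = words & (self.spam_words - self.ham_words)
        return len(suspicious) / len(words)


# Hypothetical registry: maps a plugin name to an engine class
ENGINES = {"keyword": KeywordEngine}


def load_engine(name):
    """Instantiate the engine registered under the given plugin name."""
    return ENGINES[name]()
```

The component would only ever talk to the FilterEngine interface, so a PHP, Java, Perl, CGI-BIN or web service core could be swapped in by registering another class under a new name.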

It has also been clear to me from the beginning that a spam filter must be trained on a very large volume of data (more than 1000 messages, the more the better) in order to categorize messages accurately. Web services have my preference, since an internet entity with the required CPU horsepower and data store should be able to offer the best message categorization efficiency.
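
To show why the amount of training data matters, here is a minimal sketch of a naive Bayes classifier, again in Python for brevity. It is not the engine of my component: the tokenizer, the class name NaiveBayes and the add-one (Laplace) smoothing are assumptions chosen for illustration. Every probability is estimated from token counts, so with only a handful of messages the estimates are very rough.

```python
import math
import re
from collections import Counter


def tokenize(message):
    """Very simple tokenization: lowercase runs of letters, digits, apostrophes."""
    return re.findall(r"[a-z0-9']+", message.lower())


class NaiveBayes:
    """Illustrative two-class (spam/ham) naive Bayes with add-one smoothing."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, message, label):
        self.token_counts[label].update(tokenize(message))
        self.message_counts[label] += 1

    def _log_score(self, tokens, label):
        counts = self.token_counts[label]
        total = sum(counts.values())
        vocabulary = len(set(self.token_counts["spam"]) | set(self.token_counts["ham"]))
        # Smoothed log prior, so an untrained class never produces log(0)
        n_messages = sum(self.message_counts.values())
        score = math.log((self.message_counts[label] + 1) / (n_messages + 2))
        for token in tokens:
            # Add-one smoothing; the extra +1 reserves room for unseen tokens
            score += math.log((counts[token] + 1) / (total + vocabulary + 1))
        return score

    def spam_probability(self, message):
        tokens = tokenize(message)
        diff = self._log_score(tokens, "ham") - self._log_score(tokens, "spam")
        # Clamp to avoid overflow in exp() on very long messages
        diff = max(min(diff, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Trained on two spam messages ("cheap pills buy now", "buy cheap watches now") and two ham messages ("meeting at noon tomorrow", "lunch tomorrow at noon"), this sketch already rates "buy cheap pills" as likely spam, but any word it has never seen contributes nothing useful, which is exactly why a large training volume is needed.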

My component will be able to use the following Bayesian Naive Filter cores (planned but not done; if it is technically possible, I will do it):

Plugin      | Remarks                                                                            | Possible open source project
JAVA        | I am a J2EE developer. Back to the roots :-)                                       | Something to propose? Contact me!
PHP         | Core done, but very simple tokenization and hashing of messages; small data volume | Something to propose? Contact me!
PERL        | Can PHP call Perl code?                                                            | Something to propose? Contact me!
CGI-BIN     | Should be easy to do                                                               | Something to propose? Contact me!
WEBSERVICES | Should be easy as soon as we find a WS provider; data volume?                      | Something to propose? Contact me!

Each technology may contain many core engines, or different versions of them. I will fill this table with possible candidates (you can help me by suggesting some or by speeding up development).

Core requirements:

  • Use MySQL,
  • Most internet providers allow the use of CGI-BIN, Perl, and Java

This project will soon be committed to the Joomla forge!
