My not so uninteresting notes


Thursday 4 September 2008

Fighting spam part 1: Spamtrap

Why do we need to train the filter

Bayesian filters use a statistical approach to classify emails. For this to work, you need to train the filter at the beginning with both known spam and known non-spam (ham) emails, so that the filter learns which features are statistically present in spam emails and which are not. The administrator often does this initial training (otherwise the Bayesian part is not activated in most filters), but the day-to-day training is done far less often and less carefully, which reduces the filter's efficiency as time goes by.

Yet it is very important that the filter stays up to date with new spam messages, so that it can pick up new hints of spam and stay on top. If filters are usually not fed continuously with new spam messages, it is because the task is not so easy.
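To make the training idea concrete, here is a minimal sketch of a naive Bayes token filter in Python. It is not how spamassassin implements its Bayes engine; the class name TinyBayes, the tokenizer and the sample messages are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy naive Bayes classifier (hypothetical, not spamassassin's algorithm)."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase words and numbers; a real filter uses far richer tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record the tokens of one known message under 'spam' or 'ham'."""
        self.tokens[label].update(self.tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that text is spam."""
        vocab = set(self.tokens["spam"]) | set(self.tokens["ham"])
        total = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Class prior with add-one smoothing.
            score = math.log((self.messages[label] + 1) / (total + 2))
            n_tokens = sum(self.tokens[label].values())
            for tok in self.tokenize(text):
                count = self.tokens[label][tok]
                # Add-one smoothing so unseen tokens never give log(0).
                score += math.log((count + 1) / (n_tokens + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a normalised probability.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])


if __name__ == "__main__":
    f = TinyBayes()
    f.train("cheap pills buy now", "spam")
    f.train("meeting agenda for monday", "ham")
    new_spam = "refinance your mortgage today"
    print(f.spam_probability(new_spam))  # the filter has no hint yet
    f.train(new_spam, "spam")            # feed it the new kind of spam
    print(f.spam_probability(new_spam))  # the score moves towards spam
```

The point of the demo is the last three lines: a filter that has never seen a new kind of spam cannot tell it apart, and only continued training changes that.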


Tuesday 2 September 2008

Fighting spam part 0: Introduction

I am about to write a few articles about not-so-bad techniques for fighting spam efficiently. Over the past years I have developed several of them, and the latest ones seem to be highly efficient, meaning a large quantity of spam caught and almost no false positives. I started developing this for my own personal domain and, because of my current job, expanded and enhanced it for the company I work for.

At the beginning it was quite simple because, for my personal use, I work with Thunderbird, which has long included a very good spam filter that requires little training before reaching very good filtering quality. So I didn't worry much about the quality of the filtering done on the server by the spam filter.

But, alas, Thunderbird (like many other open source projects, by the way) is not corporate enough, and we are stuck with Outlook ... The junk filter of the latter is rather complicated and rather useless. So if you want to reduce the users' complaints about spam, you have to find a good solution on the server.

The techniques that I'll present are built around spamassassin and Bayesian filtering. These are not revolutionary technologies, but with fairly good, quick and uncomplicated tuning you can achieve a very good result.

It might seem illogical (and it is, a little bit), but I'll start this series with an article on how to automatically train an already running spam filter based on Bayesian filtering; an article about how to set it up will follow a bit later. My reason is that there are tons of guides on the Internet on how to set up Bayes in spamassassin, whereas articles on how to train it (without relying on feedback from standard users) are rare.

Part 1: setting a spamtrap

Monday 23 June 2008

Microsoft my worst nightmare part 1.

Intro

It is going to be a long story with a high number of sequels, it seems.

I must confess that I do not hold Microsoft products in high esteem in general, but my day job forces me to use them, or at least to support users who do, and more often than not I face real stupidity in the product.

Right now my main target is Outlook from Office 2003.

Using Outlook, so remove Outlook Express?

Well, it might seem logical that if you know for sure that you will use Outlook, then you won't need Outlook Express. If, like me, you were ready to remove this component with an image creation tool like nLite, I do not recommend that you do so!

Why? Because if you do so and then try to access an IMAP/POP3 server, it will not work: you need registered DLLs that come with Outlook Express and are not provided by Outlook. Sounds good! I spent a few hours on this last week and found no real way out (copying the DLLs failed because they need to be registered; copying them and then trying to register them with regsvr32 failed at registration; reinstalling Outlook failed). Don't get me wrong: I am not saying that you can never manage to register the DLLs or install Outlook in some situations (especially if you have not installed the security fixes), but you might run into trouble.

There must be a good reason for this, and I can understand that the Microsoft guys may have wanted to share code between the two products: cool, good idea. But when you install Office and one or more DLLs are missing, the setup should install them and do whatever is needed so that those DLLs are installed and registered!

Sunday 22 June 2008

L4SUS update

My first public release included some bugs.

I have just released version 0.02; it fixes lots of typos from my first release.

The whole system is pretty stable now, but it relies on automation on both the client and the server, so now it's time to move on to something else:

  • Create an installer for Windows using something like NSIS, so that a simple configuration and the scheduled tasks can be created automatically
  • See with OCS NG what can be done to go further in the automation
  • Use the offline database wsusscn2.cab to save bandwidth when checking for updates (at least as an option)
  • Dig deeper into the documentation of the WSUS API in order to get more information about updates.

The latest version is here

To be continued ...

Sunday 1 June 2008

L4SUS

From theory to practice there is a huge gap, my teachers used to say.

Well, I faced that gap when trying for real the script described here. In short, everything mostly worked, but getting something really usable needed more effort.

Now it's done, and I have packaged everything in the zip file attached to this post. It is mostly 3 scripts (and a few subscripts) bundled together, and it requires a samba server. I called it L4SUS, which stands for Linux For Server Update Service.

Using L4SUS should be quite simple:

  • Extract all the .vbs scripts into a folder on each computer whose updates you want to manage
  • Rename updatelist.conf.example to updatelist.conf and adapt the configuration (i.e. the name of the samba server and the root path serving the updates)
  • Install the perl script on the samba server, make it executable (chmod a+x) and adapt the paths at the top of the script ($dest_base_dir and $update_file_dir)
  • Verify that Windows Update is configured to search (and only search) for updates
  • On the samba server, create a directory called files in the directory pointed to by $dest_base_dir, and for each computer a directory named after the computer, also in the directory pointed to by $dest_base_dir

The trickiest part is that $dest_base_dir (in download_winupdate) must be exported as the value of filePath (in updatelist.conf).
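The directory layout from the last step can be sketched in a few lines of Python. This is only an illustration of the layout, not part of L4SUS: the function name create_l4sus_layout and the computer names are made up, and the sketch does not touch the samba configuration, so exporting the base directory still has to be done by hand.

```python
import tempfile
from pathlib import Path


def create_l4sus_layout(dest_base_dir, computer_names):
    """Create 'files' plus one directory per computer under dest_base_dir.

    Hypothetical helper: dest_base_dir plays the role of $dest_base_dir
    from the perl script; computer_names are the Windows computer names.
    """
    base = Path(dest_base_dir)
    created = []
    for name in ["files", *computer_names]:
        target = base / name
        # exist_ok makes the helper safe to run again later.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory; use the real base directory in practice.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_l4sus_layout(tmp, ["PC-ACCOUNTING", "PC-FRONTDESK"]):
            print(path.name)
```

Running it again after adding a computer name only creates the missing directory.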

The main components of this system are:

  • getupdatelist.vbs: this script searches for applicable updates on the computer where it is running. It creates a file called yyyymmddproposedupdate.log in its own folder, containing a list of all the updates, their ids and their files' URLs.
  • download_winupdate: give this script a computer name and it will parse the latest proposedupdate.log file, download the missing update files and create a command list, updatelist, which tells doupdates.vbs how to do the updates
  • doupdates.vbs: this script runs the updates listed in updatelist as quietly as possible

You can get a nearly automatic system by scheduling the scripts via cron and the Windows scheduler. Of course, in this case it does mostly the same as using Windows Update directly (well, it should use less bandwidth, but it seems not as clever as Windows Update when it comes to doing all the updates quietly ...).

Expect more updates soon, because it is still a bit rough and needs more polish.
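Picking "the latest proposedupdate.log" relies only on the yyyymmdd prefix of the file name. The sketch below shows that single step in Python; it is not the code of download_winupdate (which is a perl script), the function name is hypothetical, and it assumes the file names follow the yyyymmddproposedupdate.log pattern exactly.

```python
import re
import tempfile
from pathlib import Path

# Eight digits (yyyymmdd) followed by the fixed suffix.
LOG_PATTERN = re.compile(r"^(\d{8})proposedupdate\.log$")


def latest_proposed_update_log(folder):
    """Return the newest yyyymmddproposedupdate.log in folder, or None."""
    dated = []
    for entry in Path(folder).iterdir():
        match = LOG_PATTERN.match(entry.name)
        if match and entry.is_file():
            dated.append((match.group(1), entry))
    if not dated:
        return None
    # yyyymmdd strings sort chronologically when compared as text.
    return max(dated, key=lambda item: item[0])[1]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("20080518proposedupdate.log", "20080601proposedupdate.log"):
            (Path(tmp) / name).write_text("")
        print(latest_proposed_update_log(tmp).name)
```

Using the date in the name rather than the file's modification time keeps the choice stable even if the logs are copied around on the samba share.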

Sunday 18 May 2008

Listing Windows Updates for fun ... and profit

A couple of months ago I was searching for a solution for managing Windows updates (and maybe more).

Out of the box you've got two solutions:

  • Standard Windows Update mechanisms
  • WSUS (Windows Server Update Services)

Neither was OK for my needs in a small company; here are the reasons:

Standard Windows Update

When you have non-IT users (which is the case in nearly every company), you must enable automatic updates.

The main drawbacks of this method are that you don't control which updates are installed and which are not, and that each computer downloads its own copy from the Internet, which is inefficient, a pure waste of bandwidth, and could even become a big problem when the company grows beyond a few tens of users.

WSUS

WSUS is a good solution from Microsoft to address the problems of standard update.

You set up the service and it finds the available updates; then you select the ones you want and they are downloaded. On the client, you just have to change the address of the update server to point to your own update server and voilà, everything is working!

But it obliges you to have a Windows 2000 or 2003 server, and I really hate Microsoft's lock-in strategy.

As neither solution suited my needs, I started looking for others. I found LSUS, which is available in Samba-edu. As it is an open source project, I am pretty sure that it would be quite easy to extract the LSUS part, but I decided not to investigate further in this direction.

At that point I decided to investigate a different solution and, through the Windows Update API, managed to get something working, even if not complete. The script listupdates.vbs is the result. For the moment this script just outputs the name and the URL of the different updates, but it should not be very difficult to add the missing parts.
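For readers who prefer Python to VBScript, the same idea can be sketched through the Windows Update Agent COM interface. This is not listupdates.vbs: it is a rough equivalent under assumptions, namely that the pywin32 package is installed, that the code runs on Windows, and that the search criteria shown are the ones you want. The function names are made up.

```python
def collect_urls(update):
    """Gather the download URLs of an update and of its bundled updates."""
    urls = []
    contents = update.DownloadContents
    for i in range(contents.Count):
        urls.append(contents.Item(i).DownloadUrl)
    # Many updates keep their real files in bundled child updates.
    bundled = update.BundledUpdates
    for i in range(bundled.Count):
        urls.extend(collect_urls(bundled.Item(i)))
    return urls


def list_updates(criteria="IsInstalled=0 and Type='Software'"):
    """Return (title, [urls]) for each applicable update. Windows only."""
    # Imported here so the module still loads where pywin32 is absent.
    import win32com.client

    session = win32com.client.Dispatch("Microsoft.Update.Session")
    searcher = session.CreateUpdateSearcher()
    found = searcher.Search(criteria).Updates
    results = []
    for i in range(found.Count):
        update = found.Item(i)
        results.append((update.Title, collect_urls(update)))
    return results
```

On a Windows machine, looping over the result of list_updates() and printing each title with its URLs gives the same kind of output as described above. The search can take a while, since it queries the configured update source.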