5. Junk Mail - Autumn 2006
The global volume of spam is increasing, so even though we have maintained a constant accuracy level, the number of spam emails reaching users' inboxes has undoubtedly increased. One of the more conservative estimates (from Symantec) puts the increase in spam emails at about 30% over the last two months (September and October 2006). See:
Fighting spam is an arms race; when spammers change tactics there will inevitably be a delay before the anti-spam community responds effectively. There are four aspects to the problem (and the way that we are tackling the solutions):
- Outright rejections based on malware and basic SMTP errors
- For details of outright rejections see http://www.oucs.ox.ac.uk/network/smtp/relay/ We currently reject a staggering 3 out of every 4 emails before they enter the Oxford mail system! The main reasons are invalid address information and critical breaches of the email sending protocols. There is also a useful real-time information page, which you can reach from the 'rejections' link (section 5) on the web page at http://www.oucs.ox.ac.uk/network/smtp/relay/stats/
- Spam scoring (enabling basic spam filtering)
- Spam scoring is part of the 'arms race' referred to above. We use a system called SpamAssassin (SA) to add the spam scores upon which many users rely. We have introduced a further set of SA rules to try to cope with some of the newer, cleverer tactics. This is an ongoing process, but over the last few days these rules have been giving higher scores to many spam emails that might previously have escaped them.
- Client-side spam filtering that 'learns'
- This is also known as Bayesian filtering. It is remarkably effective, as the set of rules is constantly tailored by the user. It's probably the best method, but it also relies upon spam scoring. It is usually done by the client software, but can be done on a group basis (as close colleagues may wish to have similar rules). Good email clients are able to do it, and it is simple enough that most users should be able to cope with it.
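To illustrate the 'learning' filter described in the last item above, here is a minimal sketch of a naive Bayes classifier trained on messages that a user has marked as spam or not spam. It is a toy model under stated assumptions, not the method used by any particular email client: the class name `TinyBayesFilter`, the tokenizer and the training messages are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyBayesFilter:
    """Toy naive Bayes filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one message that the user has labelled 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that the message is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Prior: smoothed fraction of training messages with this label.
            log_score = math.log(
                (self.message_counts[label] + 1) / (total_messages + 2)
            )
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                # Add-one smoothing so an unseen word does not zero the result.
                count = self.word_counts[label][word] + 1
                log_score += math.log(count / (total_words + len(vocabulary) + 1))
            log_scores[label] = log_score
        # Normalise the two log scores into a probability of spam.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)


# Invented training data standing in for a user's own decisions.
spam_filter = TinyBayesFilter()
spam_filter.train("hot stock tip buy shares now", "spam")
spam_filter.train("cheap shares stock alert", "spam")
spam_filter.train("minutes of the college committee meeting", "ham")
spam_filter.train("seminar room booking for friday", "ham")
```

Each time the user marks another message, `train` is called again, which is how the filter becomes tailored to that user's mail. A real client would use a far larger training set and usually a more careful tokenizer.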
We hope these explanations help. A balance needs to be struck between heavy-handed filtering and rejection at the server on the one hand, and filtering so light that users are inundated with spam on the other. The recent improvements in spam scoring should help right now, but it is a constant battle. Hopefully, many of the more 'clever' spam emails will now be detected (e.g. the stocks and shares emails for which it is so difficult to compose rules).
The latest rules draw on the knowledge base at the SpamAssassin Rules Emporium (http://www.rulesemporium.com/), which is a repository of custom rules.
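As a rough illustration of how rule-based spam scoring works in general, the sketch below adds up the scores of every rule that matches a message and compares the total with a threshold. This is only a simplified model: the rule names, patterns, scores and threshold are invented for the example and are not actual SpamAssassin rules or syntax.

```python
import re

# Hypothetical rules as (name, pattern, score). These are NOT real
# SpamAssassin rules; they only show the idea of additive scoring.
RULES = [
    ("STOCK_TIP", re.compile(r"\bstock (tip|alert|pick)\b", re.I), 2.5),
    ("URGENT_BUY", re.compile(r"\bbuy now\b", re.I), 1.5),
    ("SHOUTING_SUBJECT", re.compile(r"^Subject: [A-Z0-9 !]{10,}$", re.M), 1.0),
]

# Assumed threshold above which a message is treated as probable spam.
THRESHOLD = 3.0


def score_message(raw_message):
    """Return the total score and the names of the rules that matched."""
    hits = [
        (name, score)
        for name, pattern, score in RULES
        if pattern.search(raw_message)
    ]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_probable_spam(raw_message, threshold=THRESHOLD):
    """Compare the message's total score with the threshold."""
    total, _ = score_message(raw_message)
    return total >= threshold


# Invented sample messages.
spam_example = "Subject: HOT STOCK ALERT!!!\n\nThis stock tip will double. Buy now!"
ham_example = "Subject: Committee minutes\n\nPlease find the minutes attached."
```

Adding a new rule to the list raises the score of any message it matches without changing the other rules, which is why new rule sets can be introduced as spammers change tactics.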