What Is Anti-Spam | Part 1

What Is Anti-spam?

Anti-spam is the set of technical and legal systems and measures used to fight spam (unsolicited advertising email).

The relevance of anti-spam solutions:

Before the year 2000, spam was seemingly harmless mail. Most spammers used this medium to promote products of all kinds (pharmaceuticals, fake degrees, pirated software, pornography, etc.). But with the ever-increasing volume of spam crossing the Internet (over 75% of messages), and with the arrival of more pernicious types of spam such as phishing, where an individual's financial security is put at risk, it has become very important to guard against this nuisance.

The importance of having an up-to-date system

At first glance, spammers may seem rather foolish to spend their time sending millions of emails, day after day. They are not. The spam industry is flourishing, and it is easy to make a lot of money in a short time: the cost of an email is virtually zero, and a low, even very low, rate of return is enough to make the operation profitable. Spammers are also ingenious at finding new spamming techniques and foiling existing anti-spam systems. In fact, spammers and their opponents compete in ingenuity to thwart each other's techniques, hence the importance of keeping one's protection system up to date: the more regularly the system is updated, the better it protects, and the fewer false negatives get through.

Anti-spam methods

Although they often differ in use, implementation and cost, anti-spam solutions rely on much the same techniques to distinguish spam from legitimate mail. These techniques can be implemented either at the level of Internet service providers, who protect their mail service, or at the user level through appropriate tools (anti-spam filters). The filter is usually located at the MTA (Mail Transfer Agent) receiving the email.

These techniques can be either preventive (marking the mail to indicate that it is spam) or curative (blocking the offending messages, or even returning them to the sender). The latter approach has drawbacks, because the recipient should remain in control of the mail he wishes to receive. Returning a message can also make things worse by adding a little more load to the network, with a high probability that the spammer has disguised his real address or used the address of an (entirely innocent) third party as the return address. Moreover, it tells the spammer that the targeted address is active, which usually increases the volume of spam sent to it.

Several techniques for fighting spam are possible and can be combined: statistical (Bayesian) analysis, filtering by keyword or by author, whitelists (people or machines authorized to publish in certain places), blacklists (people or machines forbidden to publish in certain places), and real-time queries to servers dedicated to the fight against spam.
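How list-based and keyword-based checks might be combined can be sketched in a few lines of Python. This is only an illustration, not a production filter: the addresses, the keywords and the order of the checks (whitelist first, then blacklist, then keywords) are all assumptions made for the example.

```python
import re

# Illustrative data only; every entry here is a made-up example.
WHITELIST = {"alice@example.org"}
BLACKLIST = {"bulk@spam.example"}
SPAM_KEYWORDS = {"viagra", "diploma", "lottery"}


def classify(sender: str, subject: str, body: str) -> str:
    """Return "ham" or "spam" by applying the checks in a fixed order."""
    sender = sender.strip().lower()
    # 1. A whitelisted author is always accepted.
    if sender in WHITELIST:
        return "ham"
    # 2. A blacklisted author is always refused.
    if sender in BLACKLIST:
        return "spam"
    # 3. Otherwise, look for any known spam keyword in the text.
    words = set(re.findall(r"[a-z0-9]+", (subject + " " + body).lower()))
    if words & SPAM_KEYWORDS:
        return "spam"
    return "ham"
```

Checking the whitelist first means a trusted sender is never blocked by a keyword, which is one possible design choice among others.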

These techniques, like antivirus software, must constantly adapt, because new types of spam keep finding ways around existing defenses.

These tools can be divided into two groups: envelope filtering and content filtering. The header of an email holds its basic information: sender, recipient, carbon copy, blind carbon copy, date, source server, subject. The content is the message itself: text, images, HTML, etc.
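The distinction between header and content can be made concrete with Python's standard `email` package. The sketch below is only an illustration; the sample message and the helper name `split_message` are invented for the example.

```python
from email import message_from_string

# A made-up message: header lines, a blank line, then the content.
RAW = (
    "From: alice@example.org\n"
    "To: bob@example.com\n"
    "Cc: carol@example.com\n"
    "Date: Mon, 01 Jan 2024 10:00:00 +0000\n"
    "Subject: Lunch\n"
    "\n"
    "See you at noon.\n"
)


def split_message(raw: str):
    """Return (headers, body) for a simple, non-multipart message."""
    msg = message_from_string(raw)
    # Note: a dict keeps only one value per header name, which is
    # enough here but would lose repeated headers such as Received.
    headers = dict(msg.items())
    body = msg.get_payload()
    return headers, body


headers, body = split_message(RAW)
```

A filter working on the first group of information would look only at `headers`, while a content filter would examine `body`.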

Envelope filtering

Envelope filtering is about 50% effective [citation needed]. This type of filtering applies only to the header of the message, which often contains enough information to identify spam. It does not look at the content of the email.

This technique has the advantage of being able to block emails before their body is even sent, which greatly reduces traffic on the SMTP gateway (since the message data is sent only after the sender and recipient information has been received and accepted). Moreover, the false-positive rate of this type of filter is virtually zero: when an envelope filter identifies an email as spam, it is seldom wrong.
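In SMTP, the sender and recipient addresses are announced (with the MAIL FROM and RCPT TO commands) before the message data is transmitted, so a receiving server can refuse a message at that stage. The following Python sketch shows what such an early decision might look like. The blocked network and domain are hypothetical, and the function name `check_envelope` is invented for the example; a real MTA would call comparable logic from its own policy hooks.

```python
import ipaddress
from typing import Tuple

# Hypothetical policy data (documentation-only address range and domain).
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_SENDER_DOMAINS = {"spam.example"}


def check_envelope(client_ip: str, mail_from: str) -> Tuple[int, str]:
    """Decide from envelope information alone, before any body is sent.

    Returns an SMTP-style (reply code, text) pair: 250 to accept,
    550 to reject.
    """
    ip = ipaddress.ip_address(client_ip)
    if any(ip in net for net in BLOCKED_NETWORKS):
        return 550, "rejected: client address is blocked"
    # Everything after the last "@" is treated as the sender's domain.
    domain = mail_from.rpartition("@")[2].lower()
    if domain in BLOCKED_SENDER_DOMAINS:
        return 550, "rejected: sender domain is blocked"
    return 250, "OK"
```

Because the decision is taken before the message data arrives, a rejected message costs the gateway only a few short commands.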

Content filtering

Content filters scan the contents of messages and detect spam that managed to pass through the envelope filter. Content filtering is somewhat more delicate than envelope filtering: after all, the information conveyed by a message is subjective, and what looks like spam to the content filter may be a perfectly legitimate email (this is called a false positive), and the reverse is also true (a false negative). Content filtering can be built up in several layers. For example, the filter can call on antivirus software, analyze archived files if necessary, use a Bayesian analyzer (see below), and so on.

Bayesian filtering

The Bayesian spam filter (named after the mathematician Thomas Bayes) is a system that relies on large quantities of spam and legitimate emails to determine whether an email is legitimate or not. To work properly, the corpus of spam (junk) and the corpus of ham (legitimate mail) should ideally each contain several thousand "specimens".

The message to be identified is cut into pieces, which are compared against the whole corpus of emails (spam or not) to determine how frequently the different pieces occur in each category. A statistical formula is then used to calculate the probability that the message is spam. When the probability is high enough, the Bayesian system classifies the message as spam. Otherwise, it lets the message through. The probability threshold can be set by the system administrator, whose task is to find the most effective value. The Bayesian approach is also used for other kinds of automatic mail classification, notably in Lotus Notes.
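The steps described above (cutting the message into pieces, counting frequencies in each category, computing a probability, comparing it to a threshold) can be sketched in Python. The text does not say which statistical formula is used, so this example assumes a naive Bayes classifier with add-one smoothing; the class name `BayesFilter`, the tokenizer and the default threshold of 0.9 are likewise choices made for illustration, and a real filter would be trained on thousands of messages rather than a handful.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Cut a message into lowercase word-like pieces."""
    return re.findall(r"[a-z0-9']+", text.lower())


class BayesFilter:
    def __init__(self, threshold=0.9):
        self.threshold = threshold  # set by the administrator
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one message to the "spam" or "ham" corpus."""
        self.counts[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        if not self.messages["spam"] or not self.messages["ham"]:
            raise ValueError("train on both spam and ham first")
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_msgs = sum(self.messages.values())
        # Pieces never seen in training carry no information; skip them.
        tokens = [t for t in tokenize(text) if t in vocab]
        log_score = {}
        for label in ("spam", "ham"):
            total = sum(self.counts[label].values())
            score = math.log(self.messages[label] / total_msgs)
            for tok in tokens:
                # Add-one smoothing avoids a zero probability.
                score += math.log(
                    (self.counts[label][tok] + 1) / (total + len(vocab))
                )
            log_score[label] = score
        # P(spam | text) = 1 / (1 + exp(log_ham - log_spam)), with the
        # exponent clamped so math.exp cannot overflow.
        diff = log_score["ham"] - log_score["spam"]
        diff = max(-700.0, min(700.0, diff))
        return 1.0 / (1.0 + math.exp(diff))

    def is_spam(self, text):
        return self.spam_probability(text) >= self.threshold
```

Raising the threshold reduces false positives at the cost of letting more spam through, which is why finding the most effective value is left to the administrator.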


