Email Munging


This page illustrates a simple yet effective technique to hide an email address from spam webrobots.

The problem of spam

Spam - the unsolicited, mass-mailed email messages such as commercial ads, Make Money Fast letters, scams, and all other kinds of junk - has become a real plague of the Internet: it is estimated that the vast majority of all email sent nowadays is spam.

Spammers have several ways to collect valid email addresses, and one of their sources is Web pages. They use spamrobots that crawl over all the web pages they can reach, scanning them for email addresses, just like webspiders do in order to collect data for search engines. Any email address published on a webpage risks being collected and used as a target for mass mailing.

This page explains a simple way to prevent this from happening.

Address obfuscation to confuse spam robots

Sometimes webmasters obfuscate email addresses in order to make them invisible to spambots, or to have the spambot pick up an invalid address. The most commonly used techniques are:

1. Simple CAPTCHA

Examples:
user[at]domain.com
user@NOSPAMdomain.com
user@REMOVE_THISdomain.com
This technique has several disadvantages: it is not standard-compliant (the advertised email address is effectively invalid), it is unaesthetic, it is annoying for the visitor, who needs to edit the address by hand before sending a mail, it is error-prone (a less tech-savvy visitor might send mail to the wrong address), and it may be ineffective, as some spambots are now able to parse such addresses correctly.

2. Advanced CAPTCHA

Examples:
user@11domain22.com (please remove all numbers)
user.domain.com (please replace first dot with @)
Unlike the previous technique, this protection is very effective but still bears all the other disadvantages.

3. Image

Example:

This solution is effective but still non-standard, as it makes the address invisible if the visitor is using a text-only browser or has chosen not to load remote images. Also, it makes it impossible to add a mailto link, as that would thwart the protection.

4. JavaScript

Example:

In this case, JavaScript code generates the email address text or mailto link on the fly. The code is obfuscated so that the real email address is not visible in the page source. In fact, the above link is generated by this code:
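<script type="text/javascript">
<!--
// Assemble the address from fragments so it never appears whole in the source
var link = 'user@d' + 'omain.com';
// Write a mailto link into the page at the point where this script runs
document.write("<a href='mailto:" + link + "'>" + link + "</a>");
//--></script>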
This solution is effective, but the visitor's browser needs to have JavaScript enabled in order to display the address.

However, there is an elegant, effective, and fully standard technique that relies on the HTML standard itself:

5. HTML encoding

Any HTML character can be expressed by its numeric reference or entity. That is, any character can be specified as &#n;, where n is the character's Unicode value in decimal (for the first 128 characters this value coincides with the 7-bit ASCII value). Therefore &#65; in the source of an HTML page shows an A when the page is displayed in a browser.

Hence, to hide an email address, one can just replace one or more letters with their numeric reference, e.g. user@domain.com may be transformed into &#117;ser@do&#109;&#97;in.com. From the point of view of human visitors this makes no difference, because the browser displays the original address. However, because spambots parse the HTML source of the page, they pick up the obfuscated (munged) address, which is invalid.
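
The replacement described above can be sketched in a few lines of JavaScript. This is only a minimal illustration, not the Email Munger script itself: the function name mungAddress is hypothetical, and encoding every character (rather than just a few) is an assumption made to keep the example short.

```javascript
// Hypothetical helper (not the original Email Munger code): replaces each
// character of an address with its decimal numeric character reference.
function mungAddress(address) {
  return Array.from(address)                       // iterate by code point
    .map((ch) => "&#" + ch.codePointAt(0) + ";")   // e.g. "u" becomes "&#117;"
    .join("");
}

// Print the munged form, ready to be pasted into an HTML source.
console.log(mungAddress("user@domain.com"));
```

Encoding only some of the characters, as in the example address above, should work just as well, since the browser decodes each reference independently.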

This idea was suggested by Liam Quinn of HTML Help, who posted it on comp.infosystems.www.authoring.html. As it is based on the HTML language specification (RFC 1866), it works with any browser and any OS.

It has been argued that it would be easy to program a spam robot to convert the encodings back into text, hence thwarting the protection offered by munging. This is true in theory, but in practice spam robots never do this (not yet anyway). Taking the time to parse the text and interpret all the encodings would considerably slow down the spam robot; as the majority of email addresses published on the Internet are in plaintext format, from the spammer's point of view it's simply not worth the trouble.
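
The argument above can be made concrete with a small sketch. The pattern below is an assumed, simplified stand-in for what a harvester might scan for (real spambots vary), and decodeNumericRefs is a hypothetical helper that handles decimal references only.

```javascript
// Assumed, simplified address pattern; real harvesters may use something else.
const ADDRESS_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Hypothetical helper: turns decimal references such as "&#117;" back into text.
function decodeNumericRefs(html) {
  return html.replace(/&#(\d+);/g, (_, code) => String.fromCodePoint(Number(code)));
}

const plainSource = "Contact: user@domain.com";
const mungedSource = "Contact: &#117;ser@do&#109;&#97;in.com";

// Scanning the raw source: only the plaintext address matches the pattern.
const foundPlain = plainSource.match(ADDRESS_PATTERN) || [];
const foundMunged = mungedSource.match(ADDRESS_PATTERN) || [];

// The extra decoding pass is the work a harvester would have to add
// in order to recover the munged address.
const foundDecoded = decodeNumericRefs(mungedSource).match(ADDRESS_PATTERN) || [];

console.log(foundPlain, foundMunged, foundDecoded);
```

With this particular pattern the munged source yields no match at all; a different pattern might instead pick up an invalid fragment.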

A research report ("A CDT Report on Origins of Spam") published in 2003 by the U.S. Center for Democracy and Technology confirmed at that time that this technique was effective.

Testing

To verify the effectiveness of HTML munging, I ran an informal empirical test over 15 years.

Two email accounts (mailing lists) were opened on the Yahoo! Groups platform and published here (on November 8, 2004):
emailmungingtestrec_plain@yahoogroups.com
emailmungingtestrec_munged@yahoogroups.com
Although they look the same, the first one is in plaintext and the second one is obfuscated by munging; you can check that easily via the command View Page Source of your browser. The two login names were chosen long enough so the odds that they could be found by a generator of random email addresses (a common spammer tool) were low.

No message was ever sent from these addresses. However, just because they were publicly visible on the World Wide Web, they started (on November 27, 2004) receiving spam.

Here are the figures for the spam received per month over the whole 15-year period, from 2005 to 2019. (The test ended in December 2019 following the changes to the Yahoo! Groups platform.)
The spam received by the plaintext address is marked in red, and the spam received by the obfuscated address is marked in green:

Monthly figures are available in this PDF.

The test showed that munging greatly reduces the amount of spam received: by 90% to 100%, depending on the year.

The tool

Nowadays there are several websites devoted to email munging (much more complete and detailed than this page) and offering scripts that automatically obfuscate an email address.
However, when I first wrote this web page in 2000, there were none; therefore, I came up with this Email Munger script (first written as a Java applet, and later in JavaScript). To use it, go to the top of the page and:
  1. Enter your email address in the field, then click the Mung button. If the Link checkbox is selected, a mailto link to your email address will be included within the code.
  2. Select and copy the code that appears in the field below: this is your munged email address.
  3. Paste the code in the HTML source of your webpages, wherever it is needed.
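
As a rough picture of what the steps above produce, here is a sketch of a munging function with an optional link. It is not the actual Email Munger source: mungWithLink and toNumericRefs are hypothetical names, the withLink parameter merely stands in for the Link checkbox, and every character is encoded for simplicity.

```javascript
// Hypothetical sketch, not the original script.
// Encodes every character of the text as a decimal numeric reference.
function toNumericRefs(text) {
  return Array.from(text)
    .map((ch) => "&#" + ch.codePointAt(0) + ";")
    .join("");
}

// withLink mirrors the Link checkbox: when true, the munged address is
// wrapped in a mailto link; the href uses the munged address as well.
function mungWithLink(address, withLink) {
  const munged = toNumericRefs(address);
  if (!withLink) {
    return munged;
  }
  return '<a href="mailto:' + munged + '">' + munged + "</a>";
}

// Print the code that would be pasted into the HTML source of a page.
console.log(mungWithLink("user@domain.com", true));
```

Numeric references are decoded inside attribute values too, so the href still resolves to the real address when the visitor clicks the link.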

This script runs client-side; all it does is convert your email address on your local machine. Your email address is not transmitted to anyone, recorded, or used in any other way.

This tool is released as Open Source under the GNU General Public License, and is available as a JavaScript snippet (the same included at the top of this page), a Java applet, and a Perl script; you can download it from here. A Python script is also available.
To run the tool on your local machine or your web site, unzip the archive in a directory of your choice and use the HTML files included.

Acknowledgements

This page was inspired by Michael Fleming's webpage MAILTO: Munging (dead link).

The Email Munger has been cited by: "L'acchiappavirus" by Paolo Attivissimo (p. 189), WebJuice, Web Design Guide, The Center For Civic Engagement at UTB/TSC, Nadeau software consulting, the documentation for the mungeMailAddress() function in the Seagull PHP framework, IS Web Designs, and the University of East Anglia IT FAQs.





by Daniele Raffo         page created on 17 July 2000         page updated on 30 January 2020