John Graham-Cumming is about 666,666 clicks away from a new weapon that could help kill spam -- that's unsolicited e-mail,
not the salty canned meat -- for good.
Graham-Cumming, an Englishman who lives in Toulouse, France, is a seasoned spam fighter who wrote Popfile, an open source
e-mail classification tool. He also wrote Polymail, an antispam library licensed by other companies for use in spam filters.
Spam still accounts for about 80% of all e-mail, although much-improved filtering has made it less of an annoyance.
But spammers persevere, finding technical ways of slipping e-mail through, and the race to develop sharper filters continues.
"I don't think spam is going to go away," Graham-Cumming said. "Clearly spammers are still making money or they wouldn't be
sending lots of spam."
Graham-Cumming's new project asks people to donate their time to classify a "corpus" of 100,000 e-mail messages used to test
the accuracy of spam filters. He's set up a site, www.spamorham.org, where people are shown messages at random and sort
each one as either spam or ham, the term for legitimate e-mail.
The e-mail messages make up the Text Retrieval Conference 2005 Public Spam Corpus. The conference is affiliated with the
U.S. National Institute of Standards and Technology.
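As a rough illustration of how a hand-labeled corpus can be used to test a filter, the sketch below compares a filter's verdicts with human labels and reports how often they agree. It is only a minimal example under stated assumptions: the function name score_filter, the label strings "spam" and "ham", and the five sample messages are all hypothetical and are not taken from Graham-Cumming's project or from the corpus itself.

```python
from collections import Counter


def score_filter(human_labels, filter_labels):
    """Compare a filter's verdicts with human labels, message by message.

    Both arguments are equal-length lists of "spam" or "ham" strings
    (hypothetical label names chosen for this illustration).
    """
    if not human_labels or len(human_labels) != len(filter_labels):
        raise ValueError("need two non-empty label lists of equal length")

    # Count each (human label, filter verdict) pair.
    counts = Counter(zip(human_labels, filter_labels))
    total = len(human_labels)
    ham_total = counts[("ham", "ham")] + counts[("ham", "spam")]
    spam_total = counts[("spam", "spam")] + counts[("spam", "ham")]

    return {
        # Share of messages where the filter agrees with the human label.
        "accuracy": (counts[("ham", "ham")] + counts[("spam", "spam")]) / total,
        # Good mail wrongly flagged as spam.
        "false_positive_rate": counts[("ham", "spam")] / ham_total if ham_total else 0.0,
        # Spam that slipped through as good mail.
        "false_negative_rate": counts[("spam", "ham")] / spam_total if spam_total else 0.0,
    }


# Made-up sample: five messages labeled by people and by a filter.
human = ["spam", "ham", "spam", "ham", "spam"]
verdicts = ["spam", "ham", "ham", "ham", "spam"]
result = score_filter(human, verdicts)
print(result)
```

Reporting false positives separately from false negatives reflects a common concern with spam filters: losing a legitimate message is usually considered worse than letting a piece of spam through.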
An unlikely major donor of the e-mail was Enron, the U.S. energy company whose errant accounting practices led to bankruptcy
in 2001. The e-mail of dozens of Enron employees was subpoenaed and eventually released to the public.
The Enron e-mail messages are a hot commodity for spam research -- a rich trove of private e-mail and spam that's hard to
come by, Graham-Cumming said.
The IDG News Service is a Network World affiliate.