In his latest column on ClickZ, Stefan Pollard explains how a panel of humans tells Microsoft whether they believe an email is spam or not:
We usually think of spam filtering as a highly automated function that fends off the millions of spam messages trying to force their way into inboxes. Yet even the most sophisticated spam filters have a human component, often a network of e-mail users who click the "this is spam" button in their e-mail interfaces or vote on a message's spamminess through services such as Cloudmark.
Microsoft's Hotmail takes the people factor one step further with a little-known but highly valued panel of humans who tell Microsoft whether they believe an e-mail message is spam or not.
This panel's aggregated responses make up a data pool called the Windows Live Sender Reputation Data, which is folded into the decision-making process to classify more e-mail messages correctly.
A bit more information about the panel:
- Members are active MSN Premium and Windows Live/MSN Hotmail customers who agreed to participate in the Feedback Loop Program after being contacted by the e-mail service.
- The program asks participants to rank a random piece of e-mail as "junk" or "not junk." This e-mail is a message that was addressed to them but that Microsoft plucked out of the stream and reassigned with the subject line "Junk E-mail Classification." It could be spam, permission commercial e-mail, or personal e-mail.
- The users' feedback is aggregated into a giant pool of data and fed to reputation or spam-filter programs, such as Hotmail's SmartScreen and Return Path's Sender Score Certified program. It's then applied to automated e-mail programs to improve their ability to properly classify e-mail.
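To make the aggregation step concrete, here is a minimal sketch of how "junk" / "not junk" votes might be pooled into a per-sender score. The column does not describe how Microsoft actually does this, so everything here is an assumption for illustration: the function name `junk_rates`, the `(sender, label)` input shape, and the idea of summarizing the pool as a simple junk fraction are all hypothetical.

```python
from collections import defaultdict


def junk_rates(votes):
    """Pool panel votes into a junk fraction per sender.

    `votes` is an iterable of (sender, label) pairs, where label is
    "junk" or "not junk". This input shape is a hypothetical stand-in,
    not the real Sender Reputation Data format.
    """
    counts = defaultdict(lambda: {"junk": 0, "not junk": 0})
    for sender, label in votes:
        if label not in ("junk", "not junk"):
            raise ValueError(f"unexpected label: {label!r}")
        counts[sender][label] += 1

    # Fraction of panel votes that called each sender's mail junk.
    return {
        sender: c["junk"] / (c["junk"] + c["not junk"])
        for sender, c in counts.items()
    }


if __name__ == "__main__":
    # Made-up sample votes, for illustration only.
    sample = [
        ("newsletter.example", "junk"),
        ("newsletter.example", "not junk"),
        ("friend.example", "not junk"),
    ]
    for sender, rate in junk_rates(sample).items():
        print(f"{sender}: {rate:.0%} of panel votes said junk")
```

A real reputation system would presumably weigh far more than a raw fraction (vote volume, recency, other signals), but the sketch shows the basic idea: many individual human judgments reduced to one number per sender that a filter can consume.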
Sound like a panel you'd like to serve on? Unfortunately, you can't volunteer. You might get invited if you have had a qualifying account for at least six months and respond to Microsoft's random invitation.
The feedback loop includes users in 200 countries, 60 percent of whom use a Hotmail interface in a language other than English. This diverse background, coupled with an invitation structure rather than a volunteer program, helps reduce pro- and anti-spam bias in the decision making. It's the same methodology pollsters use to find survey participants who represent the polling population as closely as possible, rather than rely on volunteers who may have a bias one way or another.
This human factor adds an important element to Hotmail's spam-scoring systems, which already include Sender ID for reputation scoring and IP reputation scoring, among other tactics.
Here's the impact Microsoft's panel could have on your own e-mail:
- How likely is it someone in this feedback loop ruled on a message you sent? Fairly likely, if you're a large-scale sender who sends regular e-mail.
- Remember, too, these are Hotmail's own customers reporting which e-mail messages they feel are spam and which aren't. If your permission e-mail message doesn't clearly convey that it belongs in their inboxes, they'll more likely classify it as junk mail. Are you doing all you can to demonstrate your trustworthiness and value by sending relevant, identifiable e-mail?
If you'd like to know more about Windows Live Sender Reputation Data or the feedback loop that generates it, see the Sender Score Certified Web site. This service, through Return Path, incorporates the data into its own reputation-scoring program.