(keitai-l) Re: iAppli IMAP proxying service?

From: Curt Sampson <cjs_at_cynic.net>
Date: 07/08/03
Message-ID: <Pine.NEB.4.51.0307081400120.14418@angelic-vtfw.cvpn.cynic.net>
On Tue, 8 Jul 2003, Juergen Specht wrote:

> Today I agree with Curt that the Bayesian method is the most
> effective, but about 3 weeks ago I saw the first very interesting
> attempts of flooding Bayesian filters with a 90% to 10% ratio of
> harmless words like "family, child, kindergarten, etc"....

Getting really off-topic now, but this is not going to help the spam
get through at all, and will, if your filter is continuously training,
even decrease the chance that the spam will get through.

A Bayesian filter scans all of the words in the e-mail and picks out
only the most "interesting" ones, ignoring the rest. The interesting
words are those that are highly likely to mark a message as spam or
not-spam. So if you don't normally use the word "kindergarten," and
have never seen it in spam, it's going to be a fairly neutral word, and
will probably be ignored.
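
As a rough illustration of that selection step, here is a minimal
Python sketch. It is not the code of any particular filter: the
per-word probabilities, the neutral score of 0.5 for unseen words, and
the cutoff of three "interesting" words are all assumptions made up for
the example, and the combining rule is the common naive-Bayes product.

```python
from math import prod

NEUTRAL = 0.5  # assumed score for words the filter has never seen
TOP_N = 3      # assumed number of "interesting" words to keep


def interesting(words, spam_prob, top_n=TOP_N):
    """Return the top_n (word, p) pairs furthest from the neutral score."""
    scored = {w: spam_prob.get(w, NEUTRAL) for w in set(words)}
    ranked = sorted(scored.items(),
                    key=lambda kv: abs(kv[1] - NEUTRAL),
                    reverse=True)
    return ranked[:top_n]


def spam_score(words, spam_prob, top_n=TOP_N):
    """Combine only the interesting words into one spam probability."""
    picks = [p for _, p in interesting(words, spam_prob, top_n)]
    spam = prod(picks)
    ham = prod(1 - p for p in picks)
    return spam / (spam + ham)


# Hypothetical per-word spam probabilities learned from earlier mail.
spam_prob = {"viagra": 0.99, "mortgage": 0.95,
             "unsubscribe": 0.90, "meeting": 0.05}

# A spam padded with nine copies of each harmless, never-seen word.
payload = ["viagra", "mortgage", "unsubscribe"]
message = payload + ["family", "child", "kindergarten"] * 9

print(spam_score(payload, spam_prob))
print(spam_score(message, spam_prob))
```

Both calls print the same score: the padding words sit exactly at the
neutral value, so they never make it into the set that gets combined.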

Worse yet for the spammer, if you're teaching the filter on a continuous
basis, the message, once marked as spam, will have all of those other
innocent words added to the database as having appeared in a spam.
If you never use the word "kindergarten" in your normal day-to-day
correspondence, and it starts appearing in spam with any frequency, it's
going to be turned into a spam marker.
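
The effect of continuous training can be sketched the same way. The
class below is a hypothetical toy, not any real filter's implementation:
it counts how many spam and non-spam messages each word has appeared
in, and the 0.01/0.99 clamp and the 0.5 score for unseen words are
assumptions chosen for the example.

```python
from collections import Counter


class Trainer:
    """Toy per-user word statistics, updated on every marked message."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, words, is_spam):
        # Count each word once per message, in the matching table.
        counts = self.spam_counts if is_spam else self.ham_counts
        counts.update(set(words))
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1

    def spam_prob(self, word):
        # Fraction of spam vs. non-spam messages containing the word.
        s = self.spam_counts[word] / max(self.n_spam, 1)
        h = self.ham_counts[word] / max(self.n_ham, 1)
        if s + h == 0:
            return 0.5  # never seen: neutral (assumed)
        return min(0.99, max(0.01, s / (s + h)))


f = Trainer()
for _ in range(10):
    f.train(["meeting", "lunch", "schedule"], is_spam=False)

before = f.spam_prob("kindergarten")

# Five padded spams arrive and the user marks each one as spam.
for _ in range(5):
    f.train(["viagra", "family", "child", "kindergarten"], is_spam=True)

after = f.spam_prob("kindergarten")
print(before, after)
```

Since this user never writes "kindergarten" in ordinary mail, the word
goes from neutral to a strong spam indicator after a few marked
messages.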

But this of course also demonstrates why it's so important to train your
filter, and why different people need to use different filters.

Now this is an area where Docomo has enough control over the system that
they could do this sort of filtering quite well. All you'd need to do
would be to add a "this is spam / not spam" button to the interface.
People could train it for a month or two (or a week or two, for those
who send a lot of e-mail) and then start blocking the messages that are
obviously spam. This is one of the nice things about this filtering,
too; it's easy to set up so that the false positive rate is extremely
low, once you've done enough training.

But this would be a big project, and require lots of customization
of mail servers and so on, so I don't find it particularly likely to
happen. Given that, and that there's nothing you can do once your e-mail
address has been "discovered" by spammers, I think your only real hope is
to do your spam filtering somewhere where you can react a lot faster
than Docomo can.

The unfortunate side effect of this, of course, is that some content
providers make it difficult or impossible to use a non-Docomo address
when you want to use their services. (Not mentioning any names here!)
I find this particularly odd because these providers end up hurting
themselves when their mail stops going through because they're not using
my correct, guaranteed-to-be-working e-mail address.

cjs
-- 
Curt Sampson  <cjs_at_cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC
Received on Tue Jul 8 08:35:54 2003