Mandy, my wife, started a new blog, Mandy’s Minutes. Welcome to the blogosphere, honey!
I’ve been using Google’s Gmail service since late April. In that time I’ve come to really like it. [I have also accumulated 10 invitations to the service. Contact me at kjdyck at gmail.com if you’d like one; first come, first served.] I like that I can keep all my messages forever. I like that I can search easily. I like the flat structure and the idea of tagging messages with labels. I can imagine only one feature that would remarkably improve my experience with Gmail.
Perhaps I’m unusual in the way that I use email, but I am often lazy about tagging my messages with labels. I have a few rules set up to handle some of the brain-dead cases, but most of my messages arrive in my inbox without labels and remain untagged until my inbox grows so large that I feel compelled to archive its contents. Before I archive all these messages, though, I like to tag them so I can later view them with other messages in the same category. Depending on the number of messages in my inbox, this tagging step can take 10 to 30 minutes every other month or so. If there were some way to automate this process, I could use this time for something more productive.
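Since Paul Graham wrote A Plan for Spam, Bayesian filtering has quickly become the standard method of classifying messages as spam or ham. The same technique could be used to determine whether or not to tag a message with a label: train one classifier per label on the messages I have already tagged by hand, then let each classifier decide whether a new message deserves its label. Voilà! Intelligent automated tagging, the end of manual tagging.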
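To make the idea a little more concrete, here is a minimal sketch of what per-label Bayesian tagging might look like. It is only an illustration under my own assumptions: it uses a plain naive Bayes model with add-one smoothing rather than the exact formula from Graham's essay, and the names LabelClassifier, tokenize and suggest_labels are hypothetical, not part of Gmail or any real mail client.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class LabelClassifier:
    """Hypothetical binary naive Bayes model for ONE label (tagged vs. not)."""

    def __init__(self):
        # Word and message counts, kept separately for each class.
        self.word_counts = {True: Counter(), False: Counter()}
        self.message_counts = {True: 0, False: 0}

    def train(self, text, has_label):
        """Record one hand-tagged message as a training example."""
        self.word_counts[has_label].update(tokenize(text))
        self.message_counts[has_label] += 1

    def probability(self, text):
        """Estimate P(label | text) using add-one (Laplace) smoothing."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_messages = sum(self.message_counts.values())
        words = tokenize(text)
        log_scores = {}
        for cls in (True, False):
            # Smoothed prior, so a class with no examples never scores zero.
            prior = (self.message_counts[cls] + 1) / (total_messages + 2)
            total_words = sum(self.word_counts[cls].values())
            # The extra +1 reserves probability mass for unseen words.
            denom = total_words + len(vocab) + 1
            score = math.log(prior)
            for word in words:
                score += math.log((self.word_counts[cls][word] + 1) / denom)
            log_scores[cls] = score
        # Normalize the two log scores into a probability.
        top = max(log_scores.values())
        exp_true = math.exp(log_scores[True] - top)
        exp_false = math.exp(log_scores[False] - top)
        return exp_true / (exp_true + exp_false)


def suggest_labels(classifiers, text, threshold=0.9):
    """Return every label whose classifier is confident enough."""
    return sorted(
        label
        for label, clf in classifiers.items()
        if clf.probability(text) >= threshold
    )
```

Because each label gets its own yes-or-no classifier, a message can end up with several labels or none at all, which fits the way labels work better than forcing each message into a single folder. The threshold of 0.9 is an arbitrary choice of mine; a real system would presumably tune it, or show low-confidence guesses as suggestions instead of applying them.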
This is such an obvious extension to Graham’s work that I’m somewhat surprised that it hasn’t already been done (at Google, or as a plug-in for popular email clients like Outlook or Notes). With all the smart people at Google, I would assume that somebody there has thought of automating tagging with Bayesian classifiers. Perhaps it requires too much computation time or storage space. Perhaps there are user interface problems that I haven’t considered. Perhaps the differences between non-spam categories are too subtle for Bayesian classification to do a good job. Whatever the reason, I look forward to the day when my email messages are classified automatically.