Bayesian filtering for blog posts
09:15 on Monday

As you might remember, my feed reader of choice lately has been Shrook. Despite its slowness and its ability to turn my laptop into a heating element hot enough to warm a small apartment, I’ve welcomed the help of smart groups that use Bayesian filtering to find the posts I want to read. That’s the theory, anyway.

In practice I’ve started to doubt how well the Bayesian logic matches the fuzzy logic I use to pick the posts I want to read. Without having much of a clue about the technical details of the Bayesian algorithm, I have a hunch it doesn’t work well for filtering these cases:

  • A post is too long: Even though the topic might be interesting, the post should be filtered out because I will never have the time, and even less the attention span, to read it through.
  • Satire: This single word can describe content that is otherwise on topic, and I don’t need to read it.
  • Speculation: There’s a time for rumors and speculation, but it’s usually not when I’m reading my feeds.
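
To make the hunch a bit more concrete, here is a rough Python sketch. I don't know how Shrook actually implements its smart groups, so this assumes a plain word-count naive Bayes classifier, which only looks at which words appear and has no notion of how long a post is. The names `NaiveBayes`, `wanted`, `MAX_WORDS` and `SKIP_WORDS`, and the values they hold, are made up for illustration. The idea is to put a couple of hard rules in front of the Bayesian score rather than hoping it learns them.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Minimal word-count naive Bayes with two labels: 'want' and 'skip'."""

    def __init__(self):
        self.counts = {"want": Counter(), "skip": Counter()}
        self.docs = {"want": 0, "skip": 0}

    def train(self, label, text):
        self.counts[label].update(tokens(text))
        self.docs[label] += 1

    def score(self, label, text):
        # Log prior plus log likelihood of each word, with add-one smoothing.
        vocab = len(set(self.counts["want"]) | set(self.counts["skip"]))
        total = sum(self.counts[label].values())
        logp = math.log(self.docs[label] / sum(self.docs.values()))
        for t in tokens(text):
            logp += math.log((self.counts[label][t] + 1) / (total + vocab))
        return logp

    def classify(self, text):
        return max(("want", "skip"), key=lambda label: self.score(label, text))


# Hypothetical hard rules; the threshold and word list are guesses to tune.
MAX_WORDS = 800
SKIP_WORDS = {"satire", "rumor", "rumors", "speculation"}


def wanted(nb, text):
    """Apply the length and keyword rules first, then ask the classifier."""
    words = tokens(text)
    if len(words) > MAX_WORDS:
        return False  # too long to read, whatever the topic
    if SKIP_WORDS & set(words):
        return False  # one telling word is enough to skip the post
    return nb.classify(text) == "want"
```

With a classifier trained on a few posts, `wanted(nb, post_text)` drops an overly long post even when `nb.classify(post_text)` alone would still say "want". The keyword rule is crude: it also skips posts that merely mention one of the words.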

Anyone have a better idea for a feed filter?

