Tagging, experts and the foolishness of crowds
17:08 on Sunday

Pandora classifies music with keywords such as:

  • Electronica roots
  • Ambient soundscapes
  • Downtempo influences
  • Use of tonal harmonies
  • Use of trippy soundscapes

Call me a skeptic, but this very telling categorization would never have happened using “wisdom of the crowds” methods like tagging. Experts can simply categorize content in more meaningful ways than the masses can.

The Wisdom of Crowds, originally uploaded by Kim E. Leon.

The music experts at Pandora might have started with all kinds of keywords, but in the end they have restricted themselves to what they call genes. The genes are chosen to represent specific attributes of the music itself. That leaves no room for the singer’s curly hair, the band’s chart success, the expert’s personal tastes, well-accepted genre boundaries, or other non-auditory things to affect the keywords or tags that Pandora relies on when finding new music for its users. It’s the power of professional editing applied to tags.
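To make the idea concrete, here is a minimal sketch of how a small, controlled set of attributes could be used to find similar music. This is not Pandora's actual method, which I don't know. The catalog, the song titles, and the choice of Jaccard similarity (shared attributes divided by all attributes of both songs) are my own assumptions for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of attributes two songs have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(seed: str, catalog: dict[str, set[str]], top_n: int = 3) -> list[tuple[str, float]]:
    """Rank every other song in the catalog by attribute overlap with the seed."""
    seed_genes = catalog[seed]
    scored = [
        (title, jaccard(seed_genes, genes))
        for title, genes in catalog.items()
        if title != seed
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical catalog: the titles and attribute assignments are made up.
CATALOG = {
    "Song A": {"electronica roots", "ambient soundscapes", "downtempo influences"},
    "Song B": {"electronica roots", "ambient soundscapes", "use of tonal harmonies"},
    "Song C": {"use of tonal harmonies"},
}

if __name__ == "__main__":
    for title, score in most_similar("Song A", CATALOG):
        print(f"{title}: {score:.2f}")
```

Because every song is described with the same small vocabulary, a simple overlap measure already gives a usable ranking. With free-form tags, the same comparison would first have to reconcile spelling variants and synonyms.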

Problems with experts I can think of:

  • Experts rarely work without monetary benefit. Crowds do.
  • Experts can also categorize content in ways that are not meaningful at all to the masses. Sometimes (too often?) experts can’t explain their expertise in any sort of useful manner.

6 Responses to “Tagging, experts and the foolishness of crowds”

    Links from my other posts:

  1. /personal » Blog Archive » Social networking and shopping for music

  3. v Says:

    The more sophisticated the classification system is, the more rigid it is. I.e., the harder it becomes to change or extend it when the corpus being classified changes or scales up.

    In the case of Pandora, it might work, as music is a pretty well-defined “box”, which is thus easy to label. Also, it is worth noting that this kind of classification works only in a system like Pandora: would you be able to categorize music that you produce yourself like that, or does it require a third party?

    So, who’s the expert who can categorize the specific content? My personal paper drafts are classified by me and the system works well enough. In that case, I am the expert. In Pandora’s case, the classification works partly because it is a closed system.

  4. Tim Says:

    Great post. I agree there may be some problems with the approach that Pandora uses to recommend music, but I think the benefits far outweigh the problems. I would rather hear what the experts think since I can rarely get what I’m looking for when I trust the masses. If you’re looking for a community to discuss these ideas I’d like to be able to recommend this forum: http://pandorastations.com/forum. We’re sharing our Pandora stations and ideas.

    Thanks, Tim

  5. sig Says:

    What about this: you have two parameters, “expert tags” (very precisely defined, and very confusing if not understood) and “expert use” of tags (precise use of the right tags).

    With the masses, you may fall into the trap of “expert tags” (like the tags in your post) being used in a non-expert way, and the result is a clash. A precise word with an iffy meaning (unless you’re trained in a “standard”) is worse than many iffy words that, taken as a whole, make some sense to the untrained.

    So use imprecise, “iffy” tags, and let any number of them overlap to narrow things down towards understanding or finding.

    That is what happens when you paint a “picture” with words, or tell a story or three from different angles – then (almost) everyone can understand precisely what you mean… it dawns on them as they listen. But in the end the words themselves are iffy, everyday words. You just need enough of them, with some relationship between them, as in a story.

  6. v Says:

    Freetext and natural language rules. :)

  7. Niko Says:

    v: you’re right, there is a scalability problem with the Pandora approach. Then again, it’s the experts who do the classifying, so the beneficiaries of the classification don’t mind.

    Sig: When the tag cloud gets bigger, the problem becomes: which tags should I pay attention to? And with iffy tags, the cloud does grow big quickly.

    So, I’ll expand on your fun painting analogy. :)

    In the Pandora example the experts have defined a few very “picturesque” tags — only a few of their tags or genes are needed to paint a good picture. These experts are like great painters who need only a few strokes to express their message.

    Non-experts paint with dirty finger paint without delicate strokes. They use a lot of tags, lots of iffy strokes to paint their picture. Whoever looks at a picture painted with these words must think harder to figure out what it represents.

    Herein lies the problem: in many cases “thinking harder” is not an option. People scan read. They focus on the first three words of headlines (even that’s pretty optimistic). We can’t ask them to look at a tag cloud of 50 tags and “figure it out”.