Wednesday, December 10, 2008

How tags compress semantics

I just had a simple, yet (personally) powerful revelation—a moment of grokness, if you will. While searching through my Delicious account for a bookmark to a TED talk to link to in another blog post, I came face to face with a predicament that made me really stop and think.

I began my search by using the tag "ted", with which I've tagged all TED talks I've bookmarked. I have 78 TED talks bookmarked. The bookmark entries for these posts have 280 distinct tags, 1024 words total in their bookmark title fields, and 1668 words total in the comment fields. The predicament is, do I look through 78 posts to find the one of interest, or do I instead look through the 280 tags?

Or is it? If we rephrase the question, we see I'm really asking, "Can I find what I'm looking for faster using 280 words, or 2692 words?" See, those 280 tag words actually represent a compression of the semantics (the meanings) of the 2692 descriptive words. I can quickly scan 280 tags to identify the closest to my concept, giving me a significantly more manageable subset of posts to scan in more detail.
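To make the two-step idea concrete, here is a minimal Python sketch of it: count the tags and descriptive words, then use a tag as a shortcut to a smaller set of posts. The sample bookmarks, the field names (`title`, `comment`, `tags`), and the helper functions `summarize` and `build_tag_index` are all made up for illustration; this is not the Delicious data format or the pydelicious API.

```python
from collections import defaultdict

# Hypothetical sample bookmarks; titles, comments, and field names are
# illustrative assumptions, not real Delicious or pydelicious output.
bookmarks = [
    {"title": "Schools kill creativity",
     "comment": "Talk about education reform",
     "tags": ["ted", "education", "creativity"]},
    {"title": "The best stats ever",
     "comment": "Animated charts of world health data",
     "tags": ["ted", "statistics", "health"]},
    {"title": "Stroke of insight",
     "comment": "A neuroscientist describes her own stroke",
     "tags": ["ted", "brain", "health"]},
]


def summarize(bookmarks):
    """Return (distinct tag count, total title words, total comment words)."""
    distinct_tags = {tag for b in bookmarks for tag in b["tags"]}
    title_words = sum(len(b["title"].split()) for b in bookmarks)
    comment_words = sum(len(b["comment"].split()) for b in bookmarks)
    return len(distinct_tags), title_words, comment_words


def build_tag_index(bookmarks):
    """Map each tag to the list of bookmarks that carry it."""
    index = defaultdict(list)
    for b in bookmarks:
        for tag in set(b["tags"]):  # set() guards against a tag repeated on one bookmark
            index[tag].append(b)
    return index


tag_count, title_words, comment_words = summarize(bookmarks)
index = build_tag_index(bookmarks)

# Step 1: scan the short list of tags. Step 2: read only the matching posts.
candidates = index["health"]
```

With the index built, finding a post means scanning `sorted(index)` for the closest tag and then reading only `index[tag]`, rather than every title and comment.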

Tags seemed straightforward and powerful before, for example when I read Clay Shirky's article on the power of tagging, but it took this moment for me to really understand the power behind them. It was much like the "Aha!" moment of first seeing a binary search when you've always thought of search as linear.

Two side notes:

  • I'd like to thank the developers of pydelicious for providing me the software to extract those statistics about my Delicious tags.
  • It turns out the video I was looking for had the clip of interest removed due to copyright permissions, and so the real answer to the question was to Google it. Still, it was worth it for the thought.
