Del.icio.us is Crack
March 15th, 2005
It’s taken me a while to really figure it out, but you know what?
http://del.icio.us is huge.
I figure that anyone who happens upon this blog already knows what it is, but on the intensely outside chance that you don’t, here’s an introduction.
I’ve been using del.icio.us for a long time — here’s my tag map doohickey provided by extisp.icio.us, and here’s my page on del.icio.us proper. A few stats: 1059 tags, 4695 posts.
So lately I’ve been thinking about what this thing really is. And when I thought about it, it occurred to me that it’s probably the biggest “classification” of anything I’ve ever made in my life.
Which is kind of funny, considering that I once had a job where I was officially a “technical lexicographer,” i.e., I was paid to classify terms in a hierarchy (it was for a natural language search engine — the product ended up being bought out by these guys).
What we did there all day long was decide which word should be a “subword” of which other word: “to jaunt” goes under “to run.” Except most of them weren’t that easy. Arguments were inevitable, and people would come up with pet plans for generalizing how to organize stuff (I did too). And we just weren’t fast, because we felt like we had to be right. The project didn’t start from scratch; it was based partially on Wordnet, a hierarchical lexical database that has been around for a long time. There’s a web interface where you can take a look at the sort of data structures it contains… heh, I see that they don’t call jaunting a kind of running.
So they have their stats up. Here’s the bottom line: “The total of all unique noun, verb, adjective, and adverb strings is actually 144309.” Now, if I myself have 1K tags in del.icio.us, it’s blatantly obvious that del.icio.us is bigger than Wordnet: it would take only about 145 users tagging like me to pass that total, and there are clearly more than that using del.icio.us (geez, I hate typing that).
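Here’s a rough sketch of that back-of-envelope arithmetic, under two loud assumptions: every user has about as many tags as I do, and no two users share a tag (certainly false, since popular tags overlap heavily). The function name `users_needed` is made up for illustration; the two numbers come from the stats quoted above.

```python
WORDNET_UNIQUE_STRINGS = 144309  # total quoted from the Wordnet stats
MY_TAGS = 1059                   # my del.icio.us tag count from earlier


def users_needed(total_strings: int, tags_per_user: int) -> int:
    """Smallest number of users whose combined tag count exceeds
    total_strings, ASSUMING every user has tags_per_user tags and
    no two users share a tag (a deliberately generous simplification)."""
    return total_strings // tags_per_user + 1


if __name__ == "__main__":
    # With my exact tag count, and with the rounded "1K" figure.
    print(users_needed(WORDNET_UNIQUE_STRINGS, MY_TAGS))
    print(users_needed(WORDNET_UNIQUE_STRINGS, 1000))
```

Since real users reuse each other’s tags, the true number of users needed is higher than this, so treat it as a floor rather than an estimate.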
One could argue that it’s apples and oranges: Wordnet has nothing to do with applying tags to URLs. But at least 80 percent of what I did in my job had to do with nouns. We avoided verbs as much as possible (as the example I mention shows, it’s really hard to classify verbs consistently), adjectives were just as difficult, and adverbs? Fuhgeddaboutit. If I recall correctly, the early versions of Wordnet didn’t even touch adverbs. (Hell, it’s hard to even define an adverb.)
Not to mention that those categories aren’t necessarily useful across languages.
I’m just thinking out loud here, but I think the scale of the success of delicious really makes one wonder about the wisdom of attempting to build a lexical database in any way that isn’t distributed. It’s Yahoo versus Google all over again. “Folksonomy” isn’t necessarily the most mellifluous, er, tag, for the tagging craze, but the tagging craze itself is definitely onto something.
Anyway, I’m babbling on. Joshua Schachter is a genius, and goodnight.