Folks on Folksonomy

You'll recall that last week I promised to visit the following in more detail:

Elaine Peterson, Beneath the Metadata: Some Philosophical Problems with Folksonomy
David Weinberger, Beneath the Metadata, a Reply
Tom Vander Wal, Beneath the Metadata - Replies

Well, here I am. David and Tom have dealt pretty substantially with Peterson, and I haven't been following the discussions sparked by the pieces, so forgive me if I repeat someone else. I'm not going to write a full-scale essay here, but I'd like to make a few points.

First, a little context. One of my all-time favorite essays is Derrida's "Structure, Sign, and Play in the Discourse of the Human Sciences." It's one of the more accessible pieces of Derrida's writing, and tips the attentive reader off to a number of the themes that JD would revisit throughout his career. But it's always been a fave of mine because it lays out a particular rhetorical strategy that I've since seen repeated many, many times. Although it's not the sole focus of the essay, JD distinguishes between mythomorphic and epistemic discourses. I don't have the essay beside me, or I'd serve up some quotes. Mythomorphic discourse is fuzzy, messy, vague, imprecise, while epistemic discourse is much crisper, focused, organized. Think, as I think JD does, of the bricoleur and engineer, respectively.

The rhetorical strategy that the essay calls my attention to is a two-fold one. First, epistemic discourse emerges from the mythomorphic; one analogy I've always found helpful is the way that some slang eventually finds its way into "official" accounts of our language. The second move is that this latter discourse effectively seals itself off from its mythomorphic origins. You can see this 2-step as an early version of deconstructive reading, particularly of philosophical texts--much of JD's time is spent examining textual seams to find the messiness that has been disavowed by logocentrism. There is no small advantage for defenders of a system in disavowing its emergent origins--if a given system is simply "the way things are," rather than "the way things have become," then anyone seeking to change that system has that much more inertia to overcome.

I've spent a fair shake of time on this pattern because I think that Peterson's essay provides a fairly textbook example of the relationship that Derrida is working through in that essay. Like Vander Wal, I consider folksonomies and taxonomies "co-dependent" in that both are vital. But I think he underestimates the extent to which that position, which seems common sense to him (and me), is threatening to those who are consciously invested in taxonomy. It would be child's play to look at a given taxonomy, and to examine all the ways that it emerges folksonomically--Peterson's appeals to authorial intent ("...the goal is to recognize the author's intent over others' interpretations.") overlook a vast network of classifications that emerge after "authorial intent" could play any sort of role. We don't categorize aesthetic movements, for example, prior to their instantiation by a given set of artists. But those categories ultimately become something akin to first principles, frames through which we understand the artists and works placed under their aegis, whether by "authors" themselves or by those who follow.

If Peterson's essay has one overarching blind spot, it is that it cannot conceive of folksonomy in terms other than "A is not B," what she calls "the most important philosophical underpinning of traditional classification." And so she doesn't see folksonomies and taxonomies in relation to one another; they are alternatives, from the first sentence introducing the former ("...folksonomy has emerged as an alternative to traditional classification."). All of the so-called weaknesses of folksonomy are weaknesses only if folksonomies are seen as an attempt to arrive at the goals of taxonomies through another means. Relativism is a Really Bad Idea when it comes to laying out a library, but pretty innocuous as an organizing principle for my home library, which tends to sort itself out according to how recently I've used a particular book.

I guess the point I'm working towards here is the assumption that Peterson uses to disavow the relationship of taxonomy to folksonomy: that because we can appeal to philosophical underpinnings when it comes to taxonomy, there must be corresponding underpinnings to folksonomy. The underpinnings of folksonomy, however, are rhetorical. Tags are about language-in-use, not about abstract definitional categories. They are addressed, even when the addressee is one's self at a later date. Folksonomy is bricolage, and so Peterson's conclusion that it makes for poor engineering is at once self-evident and a little inconsequential. Folksonomies are not "bad taxonomies"; rather, taxonomies are themselves folksonomies that have achieved a certain level of stability and intersubjectivity (the latter of which is mistaken by Peterson and others for objectivity). And part of the way that stability is achieved and defended is by denying the role that folksonomy plays in the origins of any taxonomy.

One more point, and I'll sign off. This is a point that I've been working at ever since my NFAIS talk last fall. There is no such thing as "search," at least not in the generic sense. The idea that all searches have the same premises and the same goals is mistaken. As you read Peterson, you'll see reference to categories, subject headings, search engines, etc. All of these references assume a uniform model of search, one that I think of as "cold search," where you have nothing, and want something, and use the tools of taxonomy to locate it. (I think of this as the equivalent of cold call telemarketing.) And we search that way, sometimes. When we do, taxonomies are important.

But there's a different kind of search, which I'll call "social search," for want of a better term (I'm open to suggestion). I also think of it as lateral search. I have something (some sources, a book, a favorite movie, song, whatever), and I want more of whatever I already have. So when I see a cover blurb on a novel that compares it favorably to something I've already read and liked, I'm more likely to buy it. When my friends with similar tastes recommend movies or music to me, I'm more likely to look into them. If I want more information about folksonomy, I can go to the Wikipedia entry on the subject, bookmark it, and then trace out the network of others who have marked and tagged it. I don't need to start back at square one, break out Google, and try to narrow my search terms sufficiently.

The problem is that 90% (and maybe more) of discussions about "search" only think about cold searching. And honestly, folksonomies don't have much to contribute to the cold search, other than chaos. But for the stuff that matters for me, the culture I consume, 90% of my searches are lateral. My tastes aren't organized by the section headings in Barnes and Noble. The innovation of sites like Amazon, iTunes, and others is their ability to aggregate folksonomy (and yes, folksonomies are the Long Tail of classification) in productive ways beyond one's immediate social network. I never search Amazon using their taxonomies. I hardly ever find sources for my academic work by cold search. Most of my life is conducted easily and efficiently via folksonomies.

Damn. Every time I start an entry claiming that I won't write a full-scale essay, I write an entry that's far longer than normal. So I guess I'll stop here, and go work on other stuff. Although I will say that this has got me thinking about expanding this into a full-length essay. Not tonight, though, or even this semester.

Isn't data "beneath the metadata"?

For the most part, this is just a placeholder for links to Elaine Peterson's Beneath the Metadata: Some Philosophical Problems with Folksonomy, to David Weinberger's reply and Tom Vander Wal's reply as well. I've got some work and at least one meeting before I can turn to them, but turn to them I will. It shouldn't take too much sleuthing to figure out where I'm going to weigh in...

For example, trending

I've seen this link crop up in several places over the last week or two, but you never know who's seen it or not. So...

Chirag Mehta's Presidential tagclouds

Chirag Mehta has generated tagclouds for Presidential documents/speeches (State of the Union Addresses and others), going back to 1776, and offered them for your perusal. (Usability note: I found it a lot easier to use the arrow keys to move the slider back and forth one at a time, but didn't figure out that I could use them until the fourth time I visited the site...)

It's an incredibly cool project, and what we're doing over at the CCCOA is obviously related, although our output is structured in different ways. Although we've got other things occupying our front burners at the moment, this site has definitely got me thinking about how we might build on our work there. More broadly, and perhaps relevantly, it's also got me thinking about how visualizations of trends in language usage might be folded into some of the work we do in our field.

So yesterday, Liz posted a request for discussion in anticipation of a session she's doing at a CSCW conference on folksonomies:

Is this a reasonable statement to make?
  • Tagging is the process of adding descriptive terms to an item, without the constraint of a controlled vocabulary
  • Folksonomy is the aggregation of tags from one or more users
Yes? No? Discuss.

I spent some of the day yesterday first figuring out what I thought, and second figuring out that it would be better expressed here than in the form of an overly long comment. And in the meantime, many of the things that I thought eventually appeared in the comments (of course). Even so, I thought I might wax a little old school here (since this conversation is actually almost 2 years old), and just write through some of the stuff I'm thinking about. I use the terms tagging and folksonomy without thinking about them much, and so reading through the comments thread and contemplating my own response has me teasing out my own terminology in a way that typically stalls me out--things get complex quickly enough that I have to turn to other, more pressing demands on my time, and this has been much the same way. We'll see if I can place the pieces into something resembling a sensible order.

So, what is a folksonomy? One of the ways that I approached this question is to think about what folksonomy's other is. In other words, what is the term (or terms) or idea that folksonomy differs from? At first blush, that's an easy enough question--most descriptions of folksonomy make note of the fact that it blends folks and taxonomy, and the latter is its point of comparison. Taxonomies are top-down, folksonomies bottom-up. Taxonomies are consistent to the point of inflexibility, while folksonomies are fluid, dynamic, etc. So far, so good.

As I was browsing around (Weinberger, Vander Wal, et al.), though, I think one of the things that's tripping me up is this apparently simple binary. Like DW, my assumption about folksonomy was that it referenced the order that emerges from the process of tagging ("I had been thinking that a folksonomy is one way order emerges from such set of tags"), as opposed to the kind of order that is imposed upon a set of data from above. In other words, I think that I've been assuming, ultimately, that taxonomies and folksonomies are two paths to the same endpoint.

But I think it's important (now) to note that the two paths lead to very different ends, a point that TVW's use of the term has slowly moved me towards. There's some overlap, to be sure. Taxonomies are forms of organization, but they are often also placed in service to other goals. DW has a nice riff on the Dewey Decimal System, for instance, that makes note of the ethnocentrisms and nationalism involved in the DDS, cultural asymmetries that are "necessary" because the DDS isn't meant as a mapping of all human knowledge but as a means to map out various libraries in this country. There is some practical value to spreading out call numbers evenly across the books on the shelves, even when those books reflect (and to be fair, construct) a partisan perspective. However, there's an extent to which taxonomies are (or strive for) an optimal representation of whatever data are being classified.

From a recent presentation, TVW explains that "People are not so much categorizing, as providing a means to connect items (placing hooks)," and that helps me get at what I'm looking for here. At the risk of trotting out another binary, there's a product/process difference here. Folksonomies involve order on at least two levels. One is the personal vocabulary of a given user; folksonomies are intrinsically social to the same degree that language is, but they need not involve multiple users (although they often do). This is TVW's distinction between broad and narrow folksonomies, I think. A narrow folksonomy is likely to express order only at this first level, in terms of a user's vocabulary. The second level of order comes from the broad, more social folksonomies, where many people are tagging the same object so that particular patterns emerge. For example, the top 10 or so tags for the article I just linked are folksonomy/folksonomies, tagging, tags,, flickr, metadata, socialsoftware, classification, taxonomy, and article. The tags across users (250 or so in this case) acquire a certain amount of stability.
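For the code-minded, here's a rough sketch of those two levels of order. It's only an illustration: the (user, item, tag) triples, the names, and the two helper functions are all invented, and I'm not claiming this is how any actual bookmarking site computes its tag lists.

```python
from collections import Counter

# Hypothetical data: (user, item, tag) triples, made up for illustration.
bookmarks = [
    ("ann", "article-1", "folksonomy"),
    ("ann", "article-1", "tagging"),
    ("bob", "article-1", "folksonomy"),
    ("bob", "article-1", "metadata"),
    ("cy", "article-1", "folksonomy"),
    ("cy", "article-1", "tagging"),
    ("cy", "article-2", "toread"),
]


def user_vocabulary(triples, user):
    """First level of order: the tags one person uses, with counts."""
    return Counter(tag for u, _, tag in triples if u == user)


def top_tags(triples, item, n=10):
    """Second level of order: tags for one item, pooled across all users."""
    counts = Counter(tag for _, i, tag in triples if i == item)
    return counts.most_common(n)


print(user_vocabulary(bookmarks, "cy"))
print(top_tags(bookmarks, "article-1", n=2))
```

The first function only ever looks at one person's tags, which is about all a narrow folksonomy has to offer; the second is where the stability across users shows up, once enough people have tagged the same thing.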

These two "levels of order" I'm describing correspond, I think, to TVW's two "triads" (slide 32 of the presentation linked above)--one grounded in "identity" and the other in "community." The trick, I think, is that for taxonomies, there is no corresponding "identity triad," and so setting the two in simple binary form predisposes me to see only the second level of order as being "properly" described by folksonomy. And thus I was initially tempted to say that folksonomies are necessarily social (in the multiple user sense).

In a sense, then, I guess I'm working around to the idea that folksonomies are activities rather than systems or products, despite the fact that things like tagclouds, as visualizations of those activities, offer a static impression of those activities. A lot of the analysis that I see in TVW's work speaks of the trending that folksonomies allow, and that works for me with defining it as an activity--there's an extent to which a single tagcloud, produced in a given moment, only has value when it can be compared synchronically (to other clouds) or diachronically (to other versions of itself).
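To make the diachronic comparison a little more concrete, here's a minimal sketch, assuming a tagcloud can be reduced to each tag's share of all the tags applied at a given moment. The snapshots and function names are hypothetical, and this is my own toy illustration rather than anything drawn from TVW's slides.

```python
from collections import Counter


def cloud(tags):
    """Reduce a list of tags to relative frequencies (a cloud's weights)."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def trend(earlier, later):
    """Change in each tag's share between two clouds, largest rise first."""
    tags = set(earlier) | set(later)
    delta = {t: later.get(t, 0.0) - earlier.get(t, 0.0) for t in tags}
    return sorted(delta.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical snapshots of the same cloud at two moments.
then = cloud(["taxonomy", "taxonomy", "metadata", "tagging"])
now = cloud(["tagging", "tagging", "folksonomy", "metadata"])

for tag, change in trend(then, now):
    print(f"{tag:12s} {change:+.2f}")
```

Neither snapshot says much by itself; it's the differences between them that register as a trend, which is part of why I lean towards treating folksonomy as an activity.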

That's probably enough for now. I've got 3 or 4 wiggling threads that I want to pull on a bit more, but it's getting late. And before you say it in my comments, yes, I realize that I'm arriving here at ideas that have been circulating for more than a year. My point here hasn't been so much to say something new as to say it for/to myself.

