Continuing with my response to Craig’s comment, his third point:
Because they’re political, the influence of our politics on our tags probably has to be (if not explicit) easy to determine by correlation. Someone who’s using tags like “Peak Oil” or “culture of life” at all, is clearly part of some group or movement. Mention of “seal hunt” as a specific or distinct topic suggests concern with it as an issue. Varying terms like “monetary reform”, “capital base”, “reserve rules”, “Bretton Woods”, “dollar hegemony”, all suggest different angles on the same problem, some of which (“reform”) suggest action should be taken. The people tagging may NOT all want to find each other, but the reader DOES want to find all the angles on these issues and therefore would PREFER that tags aggregate in certain ways. For someone “on the left”, perhaps “abortion” aggregates with “women’s rights”, while “on the right” it aggregates with “culture of life”. There must be respect for these choices, and there must be ways to keep these aggregations (“redirects” in wiki-speak) under the control of the user, or a user-chosen, user-trusted, agent. At the highest level of abstraction, I’d simply choose metaphors I wished to reinforce or move towards, and those I wished to abandon, and let aggregation occur as a function of those choices. At least, it might decide which of a long list of hits to drop off, or set some ordering choices. That would be no more insidious than what google is now doing, for its own reasons (not mine).
This point raises a number of technical issues about our approach. They are worth discussing, but realistically, we don’t yet know how we’re going to handle them, so any response at this point is speculative. But hey, speculation is fun!
First, we definitely plan to give each user the ability to make their own decisions about how to aggregate issues. That does not require them to make every decision themselves, though: we certainly plan to let them use aggregations done by others.
Second, as far as possible we’d like the system to implicitly acquire each user’s current preferences, rather than making users “explain” what they want or why they want it. The approach we’ve adopted is to let users attach tags, and then try to learn which attributes of the content are statistically characteristic of the way that user applies a given tag. In some sense this should let us describe the user’s current “rule” for applying that tag. The “rule”, of course, can change gradually or abruptly over time, and we should be able to track those changes.
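One way to picture such a learned “rule” is the sketch below. It is an illustration under stated assumptions, not a description of how the system actually works: the function name learn_rules is hypothetical, content is reduced to plain words, and a rule is simply the average relative word frequency across everything one user has given a tag.

```python
from collections import Counter, defaultdict


def learn_rules(tagged_items):
    """Build one word-frequency profile per tag for a single user.

    tagged_items: iterable of (tag, text) pairs tagged by that user.
    Returns {tag: {word: average relative frequency}}.
    This averaging scheme is an assumption made for illustration.
    """
    sums = defaultdict(Counter)   # tag -> summed relative frequencies
    counts = Counter()            # tag -> number of items seen
    for tag, text in tagged_items:
        words = text.lower().split()
        if not words:
            continue  # nothing to learn from an empty item
        for word, n in Counter(words).items():
            sums[tag][word] += n / len(words)
        counts[tag] += 1
    return {
        tag: {word: total / counts[tag] for word, total in profile.items()}
        for tag, profile in sums.items()
    }


rules = learn_rules([
    ("Peak oil", "oil supply"),
    ("Peak oil", "oil price"),
])
```

Re-running the function on a user's more recent items would yield an updated profile, which is one simple way a changing rule could be tracked.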
Now let’s consider how to achieve Craig’s design goals within this framework. First, we very likely can find the collection of people who have similar “rules”. For example, if one person uses the tag “Peak oil” and another uses the tag “Energy crisis”, and they have tagged different collections of articles for some reason, but their implicit “rules” are very similar, we can recognize that. Our ability to group people together doesn’t depend on how they spell their tags, or which specific items they have tagged, but on the statistical similarity of their tag use.
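The idea of matching on rules rather than spellings can be sketched as follows. This is a hedged example, not our actual matching method: it assumes each rule is a word-weight profile, uses cosine similarity as the measure of statistical similarity, and the names cosine and matching_rules, the 0.8 threshold, and the sample weights are all invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} profiles."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_rules(rules_by_user, threshold=0.8):
    """Find pairs of rules from different users with similar profiles.

    rules_by_user: {user: {tag: {word: weight}}}
    Tag names are carried along for reporting but never compared.
    """
    entries = [
        (user, tag, profile)
        for user, tags in rules_by_user.items()
        for tag, profile in tags.items()
    ]
    matches = []
    for i, (u1, t1, p1) in enumerate(entries):
        for u2, t2, p2 in entries[i + 1:]:
            if u1 == u2:
                continue  # only compare across users
            score = cosine(p1, p2)
            if score >= threshold:
                matches.append(((u1, t1), (u2, t2), score))
    return matches


matches = matching_rules({
    "alice": {"Peak oil": {"oil": 0.5, "supply": 0.3, "decline": 0.2}},
    "bob": {
        "Energy crisis": {"oil": 0.45, "supply": 0.35, "decline": 0.2},
        "gardening": {"soil": 0.6, "seeds": 0.4},
    },
})
```

Here the two differently named tags are paired because their profiles are close, while the unrelated tag is not.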
On the other hand, we don’t currently have any plans to analyze the terms used in the tags themselves. So if one person used the tag “women’s choice” and another used the tag “baby killers” but they had very similar patterns in using these tags, we couldn’t detect that they had opposite feelings about the material. We would just see them as having similar “rules”. I think current computational linguistics doesn’t give us any way to analyze tag names accurately enough to avoid this limitation.
Because we start with the individual user, I think we can have some confidence that how things are aggregated will remain under their control. Each user determines the interpretation of their tags. If others use tags that are spelled the same, that won’t change how your tags are interpreted. (Note that this is not at all true of existing shared tagging, which may lead to some confusion.)
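A minimal sketch of what scoping tags to the individual could look like in data-structure terms is shown below. The TagStore class and its method names are hypothetical, chosen only to illustrate the point that a tag is identified by the user together with its spelling, never by the spelling alone.

```python
from collections import defaultdict


class TagStore:
    """Toy store in which every tag lives in its owner's namespace."""

    def __init__(self):
        # Key is (user, tag), so identical spellings never collide.
        self._items = defaultdict(set)

    def tag(self, user, tag, item):
        self._items[(user, tag)].add(item)

    def items_for(self, user, tag):
        # Return a copy so callers cannot mutate the store.
        return set(self._items.get((user, tag), set()))


store = TagStore()
store.tag("alice", "abortion", "article-1")
store.tag("bob", "abortion", "article-2")
alice_items = store.items_for("alice", "abortion")
```

In this toy model, bob's use of the same spelling has no effect on what alice's tag contains.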
The harder question in our approach is how to let users group together, share tags, and influence each other’s tagging “rules”. Because we can find users with similar tagging, we can help them to group together if they want. At this point it is less clear how to show users the “landscape” of other users with similar tagging patterns, or how best to give them control over their connections to other users in that landscape. I think once we have the user base, tagging data, and technology to work on those questions, the really interesting part will begin.