Reply to Craig Hubley (4, 5 & 6)

I don’t have any comments on Craig’s point (4) about search engines, since I don’t know how our work will affect them.

Regarding our software architecture, and how it ties into existing systems (as discussed in Craig’s points 5 and 6):

We very much agree with Craig’s statement: “Whatever architecture is chosen, it has to find the path of least resistance through [existing mechanisms] and be dead simple.” Our goal is to come up with a back-end tagging library that can be shoehorned into the widest range of existing (and future) front ends with minimum disruption. Each front end will have to provide some UI to allow users to apply tags to items, and we’ll have to have a SQL database to store our tag info. Other than that, we can have a very narrow relationship to the host environment.
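
To make “very narrow” concrete, here is a minimal sketch of what the library’s surface might look like. Everything in it is an assumption for illustration: the class name `TagStore`, the method names, the single `tags` table, and the use of SQLite as a stand-in for whatever SQL database a host site provides. The real design will come out of work with the test sites.

```python
import sqlite3


class TagStore:
    """Hypothetical back-end tagging library with a deliberately narrow API."""

    def __init__(self, path=":memory:"):
        # SQLite stands in for whatever SQL database the host site provides.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tags ("
            " user_id TEXT NOT NULL, item_id TEXT NOT NULL, tag TEXT NOT NULL,"
            " PRIMARY KEY (user_id, item_id, tag))"
        )

    def add_tag(self, user_id, item_id, tag):
        # INSERT OR IGNORE makes re-applying the same tag a no-op.
        self.db.execute(
            "INSERT OR IGNORE INTO tags (user_id, item_id, tag) VALUES (?, ?, ?)",
            (user_id, item_id, tag),
        )
        self.db.commit()

    def remove_tag(self, user_id, item_id, tag):
        self.db.execute(
            "DELETE FROM tags WHERE user_id = ? AND item_id = ? AND tag = ?",
            (user_id, item_id, tag),
        )
        self.db.commit()

    def tags_for(self, user_id, item_id):
        # Tags one user has applied to one item, in alphabetical order.
        rows = self.db.execute(
            "SELECT tag FROM tags WHERE user_id = ? AND item_id = ? ORDER BY tag",
            (user_id, item_id),
        )
        return [row[0] for row in rows]

    def items_for(self, user_id, tag):
        # Items one user has marked with a given tag.
        rows = self.db.execute(
            "SELECT item_id FROM tags WHERE user_id = ? AND tag = ? ORDER BY item_id",
            (user_id, tag),
        )
        return [row[0] for row in rows]
```

In a sketch like this, the front end’s only obligations are to supply its own user and item identifiers and to call these four methods from whatever UI it builds.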

We’re planning to design the specific embedding in consultation with our first few test sites. Of course, to implement and test the library we’ll have to embed it in our own feed aggregation environment, but since we control all the pieces, that doesn’t raise the same architectural issues. However, our initial experience with our own environment will give us a basis for collaborating on the design with our test sites.

Craig mentions the possibility of standard APIs or protocols for tag manipulation. We will probably get to these, but I’m most comfortable trying to write standard APIs once we’ve built a few diverse working implementations. Otherwise, we’d be designing based on our fantasies about how people will use this, which are likely to be wrong.

There is another domain where standardization will be even more important, but will probably also take longer. When multiple sites are using this kind of individual tagging, it will be very helpful for them to have a way to operate within a shared social landscape, let individuals extend their profile across sites, and so forth. Interaction between sites requires careful design, since it raises lots of issues about user privacy, security, control by each site of how much information it shares, etc. Also, the protocol for interaction has to be a standard, since it will be used by multiple different developers, and has to be implemented consistently to provide interoperability.

While this cross-site interaction has the potential to generate a great deal of value, we have to realistically defer this design until we have enough experience and enough different developers involved to get it right.

One Response to “Reply to Craig Hubley (4, 5 & 6)”

  1. April 9th, 2006 | 8:08 pm

    My view of software and protocols differs radically here. The toughest aspect of the problem has to be tackled first, not last. Tossing in vast complexes like SQL up front is silly.

    “We’ll have to have a SQL database to store our tag info”?!? Why? Who’s “we” here? This is a ridiculous amount of overhead for what is ultimately a simple tuple storage, or trie storage, if done right. My jaw is on the floor. SQL has nothing to do with this problem. Is SQL required to implement RSS? Wiki? HTTP? If not, then, why would you need it to store tag data?

    “We can have a very narrow relationship to the host environment” without SQL.

    I also disagree that “to implement and test the library we’ll have to embed it in our own feed aggregation environment”; that doesn’t actually test anything interesting. Do it as a protocol and keep the coupling loose, I say. Treat this as a protocol on day one, not a vast tumour the size of an SQL database.

    It’s certainly true that when you “control all the pieces, that doesn’t raise the same architectural issues”; accordingly, it tells you literally nothing about how the thing would work in a loosely coupled web2 type environment.

    Given that, I don’t see how “initial experience with our own environment will give us a basis for collaborating on the design with our test sites.” I think the initial experience is just biasing your view of the problem, starting with the “SQL”.

    I think one starts with “standard APIs or protocols for tag manipulation” on day one. There are many “diverse working implementations” of commercial tagging services, categories, third-party tagging, bookmark sharing, etc. There’s enough. These are the constraints. There’s no issue of “fantasies about how people will use this, which are likely to be wrong” since they’re using Wikipedia, bookmarks/favourites, alexa.com, technorati, del.icio.us and eBay reputations already. These are all forms of tagging. If they are all to be supported in a common way, then the first implementation must try to do that.

    “When multiple sites are using this kind of individual tagging, it will be very helpful for them to have a way to operate within a shared social landscape” which means necessarily solving the technical problems so they can see the social ones.

    To “let individuals extend their profile across sites, and so forth” doesn’t require “careful design” since design can’t solve problems that weren’t solved at the architectural and ontological level. The “issues about user privacy, security, control by each site of how much information it shares, etc.” simply cannot be solved if the terminology isn’t correct from day one. For instance, if you allow “name” to mean “real name” or “body name”, you’re sunk. If you allow “account” to mean “person”, you’re sunk. If you allow “privacy” to mean “privacy from other users but not from systems administrators”, you’re sunk. The Titanic has five chambers that can fill with water and she can still float. Fill six and she will sink for certain. In privacy issues I think the number may be just two.

    Since “the protocol for interaction has to be a standard, since it will be used by multiple different developers, and has to be implemented consistently to provide interoperability” there’s really no other place to start than the protocol between service providers. A personal tagging interface is just the minimum implementation, like a blog is the minimum RSS feeder.

    If this thing is to fit into a service oriented architecture then it has to think, breathe, and even stink like a REST protocol. Not like SQL.
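
As a rough illustration of the alternative the comment argues for, the sketch below keeps tag data as plain (user, item, tag) tuples and exposes it through REST-style verbs. The URL layout, the names `handle` and `tag_path`, and the in-memory set are all hypothetical choices made for this example; no real service or existing protocol is implied.

```python
from urllib.parse import quote, unquote

# Tag data as plain (user, item, tag) tuples; no SQL involved.
TAGS = set()


def tag_path(user, item, tag=None):
    """Build a request path, percent-encoding each segment."""
    segments = ["users", user, "items", item, "tags"]
    if tag is not None:
        segments.append(tag)
    return "/" + "/".join(quote(s, safe="") for s in segments)


def handle(method, path):
    """Map a REST-style request onto the tuple set.

    Assumed URL layout: /users/<user>/items/<item>/tags[/<tag>]
    Returns an (HTTP status code, body) pair.
    """
    parts = [unquote(p) for p in path.strip("/").split("/")]
    if len(parts) not in (5, 6):
        return 404, None
    if (parts[0], parts[2], parts[4]) != ("users", "items", "tags"):
        return 404, None
    user, item = parts[1], parts[3]
    tag = parts[5] if len(parts) == 6 else None

    if method == "GET" and tag is None:
        # List the tags this user has applied to this item.
        return 200, sorted(t for (u, i, t) in TAGS if (u, i) == (user, item))
    if method == "PUT" and tag is not None:
        TAGS.add((user, item, tag))  # repeating the request changes nothing
        return 204, None
    if method == "DELETE" and tag is not None:
        TAGS.discard((user, item, tag))  # deleting a missing tag is harmless
        return 204, None
    return 405, None
```

Because PUT and DELETE are idempotent here, a loosely coupled client can safely retry a request. For example, `handle("PUT", tag_path("alice", "http://example.org/post/1", "rss"))` records one tuple no matter how many times it is sent.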
