Paying by choosing — learning user interests to enhance site value

In his blog EconoMeta, Adam Marsh examines the “deal” between websites and readers: content is free and advertising funds the site. Marsh points out that selling advertising depends on accurately targeting an audience, and an audience forms when content serves the interests of readers. Advertising and serving content to readers are therefore tightly coupled, and that coupling depends, in both directions, on knowing what the audience is interested in.

Audience interest can be approached from two ends: group and individual. Google starts with the group; it sorts search hits by aggregating everyone’s opinions (as expressed by their links). At Peerworks we are working from the opposite end, developing technology to help each individual teach the system what he or she is personally interested in. We do this by aggregating information about what a reader has found interesting in the past.

Knowing a reader’s preferences lets a site show them interesting content, but it also lets the site show them more precisely targeted ads, presumably generating more income and quite possibly making the user happier.

A more profitable site can offer better content. But for this circle to be virtuous, a website must provide readers with ways to understand and control the information that is collected about their interests, and must assure them that information cannot be shared or sold without their permission.

Brief status

Peerworks will build a portable, open source automated tagging engine that can be used by existing systems such as Scoop, Slashcode, Drupal and Plone, and that can also form the basis for new systems.

We’ve developed a custom feed aggregator for testing, and have collected lots of feeds and feed items for moderation. We are now working on the moderation UI. This will be used by a team of moderators to create a stable body of tagged items to use for testing and tuning individual tagging algorithms under development.

Initially we expect to use an algorithm very close to the SpamBayes classifier. There are existing examples of how to do this; Ben Kamens’ analysis is helpful, and Laird Breyer’s library dbacl uses Bayesian classifiers to assign tags.
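To make the idea concrete, here is a minimal sketch in Python of a per-reader, per-tag Bayesian classifier. It is an illustration only, not the Peerworks engine and not SpamBayes itself: SpamBayes combines per-token probabilities with a chi-squared test, while this sketch swaps in plain naive Bayes with add-one smoothing. The names TagClassifier, train and probability are hypothetical, and the training items are made up for the example.

```python
import math
from collections import Counter


class TagClassifier:
    """Hypothetical binary classifier for one reader and one tag.

    Simplified naive Bayes over token presence; an assumption for
    illustration, not the SpamBayes chi-squared combining scheme.
    """

    def __init__(self):
        # Number of training items containing each token, per label.
        self.token_counts = {True: Counter(), False: Counter()}
        # Number of training items seen, per label.
        self.item_counts = {True: 0, False: 0}

    @staticmethod
    def tokenize(text):
        # Naive whitespace tokenizer; each token counts once per item.
        return set(text.lower().split())

    def train(self, text, tagged):
        """Record one item the reader did (True) or did not (False) tag."""
        self.item_counts[tagged] += 1
        self.token_counts[tagged].update(self.tokenize(text))

    def probability(self, text):
        """Estimate the probability that the reader would apply the tag."""
        total = self.item_counts[True] + self.item_counts[False]
        scores = {}
        for label in (True, False):
            n = self.item_counts[label]
            # Smoothed log prior for this label.
            score = math.log((n + 1) / (total + 2))
            for token in self.tokenize(text):
                # Smoothed log likelihood of seeing this token.
                seen = self.token_counts[label][token]
                score += math.log((seen + 1) / (n + 2))
            scores[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(scores[False] - scores[True], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


# Made-up moderation decisions for a hypothetical "programming" tag.
clf = TagClassifier()
clf.train("new python library for bayesian classifiers", True)
clf.train("python tagging engine released", True)
clf.train("football scores from the weekend", False)
clf.train("celebrity gossip and football news", False)

likely = clf.probability("python tagging")     # resembles tagged items
unlikely = clf.probability("football gossip")  # resembles untagged items
```

With this toy data, likely comes out above 0.5 and unlikely below it. A real system would need a better tokenizer, far more training items, and one such classifier for every tag a reader uses.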

Once our system can learn individual tagging styles well enough to make users comfortable, we’ll be looking for existing sites that would like to integrate it. Our further development will be guided by the needs of these initial partner sites.

If you’d like to know more, leave a comment with your questions or drop me a line: jed (at) peerworks.org
