
July 27, 2005


John "Z-Bo" Zabroski

"Active filtering" might be a poor phrase choice. The operative word "Active" is potentially confusing, because anyone can easily make the obvious literary argument that anything that is a filter is active: a filter is a "device that removes something from whatever passes through it." Filters are there for action and, because of this, activity is closely associated with filters.

I understand that the term "active filters" probably comes from the field of electrical engineering, where it is usually covered under the topic of electronic filters as a noise reduction technique. Unless there is an intentional comparison to "Q" Factors, I think "Active filters" may be an awkward phrase for typifying a type of filter as such.

Electrical engineering and electronic filters are also a unique form of filtering that is separate from what you have listed. For instance, both MPEG Audio Layer-3 (MP3) and Ogg Vorbis (OGG) are lossy formats for compressing audio data into storable files. However, in the process of compression, MP3s throw out treble noise while OGGs throw out bass noise. This manipulates the electronic filters used in acoustic engineering. The point I am making is that "Software" as a family of filters seems to exclude "Hardware" based upon how "Software" is generally defined. Of course, any electronics guru will understand that the hardware is both electrical components and embedded-systems software. At the same time, the guru knows that electrical engineering side-effects in the hardware can manipulate the presentation of the software. These side-effects are pronounced and powerfully demonstrated by the software itself; acoustics hardware engineered without sufficient bass boost will distort the quality in the presentation of OGG files. Likewise, insufficient treble will distort the quality in the presentation of MP3 files.

Additionally, being part of a "family of filters" means very little, doesn't it? It doesn't seem as though you have them grouped by technique, although your initial grouping does suggest you might feel it's incomplete as well. Grouping filters by technique would probably be the preferred way, on account of the point made earlier that a filter is a "device that removes something from whatever passes through it."

Another phrase I have to examine closely is "tastemakers." Do they really make the tastes or simply provide taste samples? I would think that the number of people who actually make new tastes is actually quite few for any given taste: for instance, how many people invented Ogg Vorbis audio data compression? A critic does not make the taste but rather aggregates it through narrowcasting and broadcasting. In broadcasting, there is a difference between creating a fad and creating the product that makes the fad possible.

Am I being too obtuse? I hope not. These comments are intended to provide pedagogical/andragogical issues for when you are writing your book for the greatest clarity.

John "Z-Bo" Zabroski


I think closer attention should be paid to Mark Sigal's complaint (emphasis mine):

John, I think my comment is somewhat in line with what you are saying. All the better if the filtering is "smart enough" to recognize past actions, frequencies of favorite spots, amplify the good stuff, filter out the noise and the like.

If anything, Amazon's "passive" filtering has gotten way too noisy for me. Rather than operating from a thesis that says there are three different ways we can filter for you. If you only use one of them, over time we won't bombard you with the others. Instead, Amazon keeps slinging all paths because in aggregate (of the network - but not ME) people use all three paths. My advocacy is just that if you really want to empower the customer, you need to provide them a means to actively grab control of the filters when they know they want THIS and not THAT. It's an AND not an OR. Make systems smarter but don't forget that to really delight the customer enable them to self serve the truly personal touch.

Mark is discussing a far more important issue than the classification of filters; he is mentioning the impact of filters. It is great to have a concept or a theory, or even an execution strategy. However, those things mean little if the impact is not discussed. I think Mark was suggesting the eventual desire from at least some consumers for sub-filtering or re-filtering and custom filtering. Let us address each of these desires separately:

(1) Custom filtering comes from a desire, which I want to say is a desire for censoring (http://www.google.com/search?hl=en&lr=&safe=off&q=define%3A+censorship&btnG=Search): "deleting parts of publications or correspondence or theatrical performances." This is different from normal filtering, because the intention is to block something. Adblock is a form of custom filtering (following emphasis mine): "Adblock allows the user to specify filters, which remove unwanted content based on the source-address." An entire source of content is being blocked at the individual user level.

(2) Sub-filtering or re-filtering comes from another (and perhaps different) desire, which I want to say is a desire for current filters to become easier to use. I also suggest this desire comes primarily from "Experienced" usability zealots and not the "Beginner" user. Jef Raskin, the creator of the Apple Macintosh project and its interface, was quoted in an interview in Dr. Dobb's Journal as saying, "Imagine if every Thursday your shoes exploded if you tied them the usual way. This happens to us all the time with computers, and nobody thinks of complaining." I think most "Beginner" users have a hard enough time understanding how to get a product to work, let alone knowing how to effectively criticize its design. Examples of sub-filtering are Google's personalized search and Yahoo's MyWeb 2.0. Re-filtering, for which I cannot find a clear day-to-day example, would be something like Google's re-indexing of its PageRank, which is simultaneously the greatest strength and weakness for any search engine optimization/marketing strategy.
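
As a rough illustration of the custom filtering described in (1), here is a minimal Python sketch of blocking content by source address. It is not Adblock's actual mechanism; the function names `is_blocked` and `filter_items` and the example.com hosts are hypothetical, chosen only to show the idea of a user-specified blocklist.

```python
from urllib.parse import urlparse

def is_blocked(url, blocked_hosts):
    """Return True if the URL's host is a blocked host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)

def filter_items(urls, blocked_hosts):
    """Keep only the URLs whose source address is not on the user's blocklist."""
    return [u for u in urls if not is_blocked(u, blocked_hosts)]

# Hypothetical blocklist and page resources, for illustration only.
blocked = {"ads.example.com"}
page_resources = [
    "http://ads.example.com/banner.gif",
    "http://news.example.org/story.html",
    "http://img.ads.example.com/popup.js",
]
kept = filter_items(page_resources, blocked)
print(kept)
```

The point of the sketch is that the user, not the site, owns the list: an entire source disappears for that one person and for nobody else.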

Alexandre Rafalovitch


What about the 'clipping service' filters? These target a specific user with a known need in mind, instead of some group activity.

The prime example here is Egosearch, where one puts one's name into PubSub/Technorati/Google Alerts and gets notified automatically of all the webpages and blog entries that mention the name. Book authors love this one.

In terms of consumption opportunities, I might be interested in an emerging topic that does not yet have a dedicated news gatherer (e.g. Machinima 9 months ago).

A clipping service allows me to track the happenings, comment on and contribute to them with ease, thus speeding up the growth of the niche until it attracts enough interest to be trackable via other means.

The focus here is how easily I can set up my personal filters. If it is hard or requires conscious effort (e.g. a weekly topic search), I probably will just chase other interesting items that are more established. But if it takes 1 minute to set up a filter, I might do it for multiple very obscure, very low volume interests.
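
To make the "1 minute to set up" point concrete, here is a small Python sketch of a personal clipping filter. It does not reflect how PubSub, Technorati, or Google Alerts work internally; the item format (a dict with `title` and `body`) and the helper names `make_clipping_filter` and `clip` are assumptions made up for this example.

```python
def make_clipping_filter(*terms):
    """Build a filter that matches any item mentioning one of the terms."""
    lowered = [t.lower() for t in terms]

    def matches(item):
        text = (item["title"] + " " + item["body"]).lower()
        return any(t in text for t in lowered)

    return matches

def clip(items, filters):
    """Return {filter name: [titles of matching items]} for each named filter."""
    return {
        name: [item["title"] for item in items if matches(item)]
        for name, matches in filters.items()
    }

# Setting up a new filter is one line per interest.
filters = {
    "ego": make_clipping_filter("Alexandre Rafalovitch"),
    "machinima": make_clipping_filter("machinima"),
}

# Hypothetical feed items, for illustration only.
items = [
    {"title": "Review: a new book", "body": "Interview with Alexandre Rafalovitch."},
    {"title": "Machinima festival announced", "body": "Short films made in game engines."},
    {"title": "Weather report", "body": "Rain expected."},
]
clippings = clip(items, filters)
print(clippings)
```

Because each filter costs one line, adding a very obscure, very low volume interest is cheap, which is the property argued for above.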


Paul Morriss

last.fm falls under the ratings section above, but with a very low threshold for participation (or none if you choose not to) - you can choose to love, hate or skip a track.

As an aside, taxonomy is the oldest profession, not the one usually referred to. God's first job for Adam was to name the animals. To name them you need to differentiate species so that you give them different names.

chris anderson

[Mike Vicic (vicicm@prodigy.net) emailed an alternative taxonomy that he's allowed me to post here:]

The examples for your two categories (people, software) seem to focus on three types of data:
A. opinions/ratings;
B. behavior;
C. facts/features.

Furthermore, this data is either centralized (1) or distributed (2).

This setup gives a nice 3x2 matrix with the role of people and software
(actors) for different use cases in each box of the matrix.

A1. People provide opinions/ratings to a site.
A1. Software collects and presents (Netflix recs, eBay recs).

A2. People post reviews/opinions/ratings locally.
A2. Software finds, collects, analyzes and presents (???).

B1. People use features (bookmark, search, buy, upload) at a site.
B1. Software collects, analyzes and presents (del.icio.us, google, Amazon recs, Audioscrobbler).

B2. People use features locally (bookmark, playlists, links, etc.).
B2. Software finds, collects, analyzes and presents (some antivirus software, google, others?).

C1. People categorize, catalog and deconstruct facts and features at a site (imdb, wikipedia, tv.com).
C1. Software analyzes content and further deconstructs to find relationships & similarities that people did not (or cannot) find or use.

C2. People categorize, catalog and deconstruct facts and features locally.
C2. Software finds, collects, analyzes and presents (???).

Of course, this isn't quite that clean since _google_ is a combination of B1 (search eval at site) and B2 (links on individual pages). A single piece of software can span multiple boxes. But maybe it's that feature that made google so good at the start.
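
The 3x2 matrix can be written down as a small lookup table, which also makes the "a single piece of software can span multiple boxes" observation easy to check. This is only an illustrative Python sketch: the names `EXAMPLES`, `boxes_for`, and `empty_boxes` are invented here, and the entries are just the examples listed in the taxonomy (boxes marked "???" are left empty).

```python
# Rows: type of data. Columns: where the data lives.
DATA_TYPES = {"A": "opinions/ratings", "B": "behavior", "C": "facts/features"}
LOCATIONS = {1: "centralized", 2: "distributed"}

# Example services per box, copied from the taxonomy; "???" boxes stay empty.
EXAMPLES = {
    ("A", 1): ["Netflix recs", "eBay recs"],
    ("A", 2): [],
    ("B", 1): ["del.icio.us", "google", "Amazon recs", "Audioscrobbler"],
    ("B", 2): ["some antivirus software", "google"],
    ("C", 1): ["imdb", "wikipedia", "tv.com"],
    ("C", 2): [],
}

def boxes_for(service):
    """Return the labels of every box in which the service is listed."""
    return sorted(
        f"{row}{col}" for (row, col), names in EXAMPLES.items() if service in names
    )

def empty_boxes():
    """Return the labels of boxes that still have no example."""
    return sorted(f"{row}{col}" for (row, col), names in EXAMPLES.items() if not names)

print(boxes_for("google"))
print(empty_boxes())
```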

John "Z-Bo" Zabroski

A very impressive effort by Mike Vicic. I think it is more robust than the original, and obviously, judging by the question marks "???", there are still some issues to be worked out.

As an aside, I was making my way through various chains of links in the User Comments section and was reading Paul Morriss's blog, when something caught my interest.

The Problem Living in Thames Valley, 2005 July 13, Paul Morriss

The problem: living in Thames Valley, as I do, we get London radio stations. The reception isn't brilliant though, because we're not their target audience. This is frustrating when trying to tune the bedside radio to London based stations.

Morriss shows some of the effects of an electronic filter and its "Q" Factor.

Mike Vicic

Take a look at B2 in the above taxonomy. (The question marks for B2 and C2 were meant as placeholders for examples.) I recently read at Popgadget about StumbleUpon. StumbleUpon is software that apparently analyzes your bookmarks (behavior) and ratings (opinions) in a distributed sense. You use your own browser as you would normally--unlike del.icio.us, which requires that you visit a centralized site.

I'm surprised that StumbleUpon didn't use some other metric (time spent per visit, # of visits per week) instead of user rating so that the recommendations are completely based on behavior and are completely transparent to the user.
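
A purely behavioral score of the kind suggested above might look like the following Python sketch. It is not how StumbleUpon works; the `VisitStats` fields, the 600-second cap, and the multiplication of frequency by dwell time are all assumptions picked for illustration.

```python
from dataclasses import dataclass

@dataclass
class VisitStats:
    url: str
    visits_per_week: float
    avg_seconds_per_visit: float

def implicit_score(stats, max_seconds=600.0):
    """Combine visit frequency with dwell time (capped and scaled to 0..1)."""
    dwell = min(stats.avg_seconds_per_visit, max_seconds) / max_seconds
    return stats.visits_per_week * dwell

def rank(pages):
    """Order pages from most to least interesting, using behavior only."""
    return [p.url for p in sorted(pages, key=implicit_score, reverse=True)]

# Hypothetical browsing statistics, for illustration only.
pages = [
    VisitStats("http://a.example.com/", visits_per_week=5, avg_seconds_per_visit=120),
    VisitStats("http://b.example.com/", visits_per_week=2, avg_seconds_per_visit=900),
    VisitStats("http://c.example.com/", visits_per_week=10, avg_seconds_per_visit=6),
]
print(rank(pages))
```

Capping the dwell time keeps a single page left open overnight from dominating the ranking, and no explicit rating is ever requested from the user.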

Steven Rich

"I’d rather read my articles at Klikhir.com because sophisticated filtering offers me products I am probably looking for - in the same page! That saves me time searching for trusted brands elsewhere."

Inspired by the teachings of Chris Anderson, Klikhir.com is a Web 2.0 recommendation filter that receives a daily article feed and combines each article with related (to what the user is reading) products and services, automatically. The articles themselves are the filter.

The article feeds arrive in emails at a rate of around 150-200 per day. Klikhir’s engine scans the email and extracts the subject, content, author, URL and date. Using sophisticated word processing Klikhir scans the content for classification keywords. Klikhir currently creates accurate keywords 93% of the time.

The keywords are required for Index Classification, Google Adsense, Commission Junction, Amazon, and eBay Web 2.0 Services. The combined web services are queried simultaneously using the keywords generated by Klikhir's articles. The result is a quality article of interest wrapped with products and services directly relating to what the user is reading - in the same web page.
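
For readers curious what keyword classification can look like in its simplest form, here is a generic Python sketch based on word frequency with a stopword list. It is not Klikhir's engine and makes no claim about its accuracy; the `STOPWORDS` set, the `extract_keywords` function, and the sample text are assumptions for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with", "that"}

def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]

article = (
    "Vinyl records and vinyl turntables: "
    "the vinyl revival is driving turntables sales."
)
keywords = extract_keywords(article, top_n=2)
print(keywords)
```

The resulting keywords could then be passed as the query term to each product service, which is the step described above.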

In essence, I am trying to create a New Economy Factory that adds value to articles by using them to filter affiliate products, on autopilot.

What do you think?


