Jun. 25th, 2008

From collaborative filtering to collaborative ranking, or Amazon wish list 2.0

I rarely lay my eyes on mediocre content anymore. Collaborative filtering has become very good, but too much content comes out. The filter is not deficient, nor is the threshold too low. No, there is simply too much good content that deserves to be read: a blessing and a curse.

We need tools to help us manage this stream of good data. First locally, with a local queue that makes ordering and organizing (tagging) easy. Then by harvesting the collective intellect to help us.

We need to combine collective filtering (reddit and Digg) with collective ranking (Google).

I use the example of books because it seems a simple one (away from too many technical terms) and also because the time cost of reading a book is orders of magnitude higher than for an article.
You may not have wasted your time with a good book, but you could have made better use of it by reading a book that was better for you at that point in time.


The problem: My Amazon wish list is exploding. I want the list to go
from helping me remember what I want to read to selecting which book
I should pick next.

In formal terms: I want to go from an unsorted set to a priority
list. The problem becomes: which ordering function should I use?

A solution: Using social networks and other web 2.0 ideas could be
very helpful.

My fundamental misfortune is that I am extremely curious and reading
books is the principal way (closely rivaled by web articles) by which I
satisfy that need. Also a large part of my pure entertainment comes
from reading fiction.

Those desires compound themselves faster than my ability to read; the net effect is an ever-increasing reading list. How
bad is it? You can go see for yourself:
384 books over 16 pages, plus a dozen books on my
bookshelves:

Reading list

The solution is not to find more time for reading; I read all day and
watch almost no TV. I could even say that the issue is about making
the best use of the short time I have on this earth. A fundamental
dilemma that could sure use some help.

I used to browse through the list from time to time and select what book I
wanted to read next. But now it is almost too overwhelming.
My solution as a geek is to look for a better tool (not to clean up
the list).

Amazon focuses on helping you add items to your list. This probably makes
perfect sense for them because they are helping their customers and the
pathologically curious are, perhaps, not a big market for them. But I would
love to see the data, to be sure. Notably, the distribution of wish list size and how size correlates to sales.

Netflix does not really have a better approach; your queue is
ordered for logistical reasons, but the ordering is done by the customer, and the end result is
not much better once your queue size explodes.

There is a very similar problem that is currently being worked on: reading web articles through a user-rated aggregator. And I think a
similar technique could apply.

First some very simple additions to Amazon's wish list:
(a) TAGS
obvious
(b) Search box
likewise

The principal difference from the web article problem is that the book
problem is orders of magnitude smaller and less time critical. So we have
more processing power at our disposal.

So here are my suggestions for the ranking function:


(1) Multiple hits by one's self
What mostly ends up determining which book I buy is how many positive references I see. I might read a review recommending a book and, since I am interested in the subject, add it. But the decision to buy mostly comes once I have seen multiple references.
Every time you see a positive reference you can click on a reddit-like up arrow (or the down arrow for a negative one). The same should happen when you try
to add the same book twice.
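
Here is a rough sketch of how this self-voting could work. The WishList class and its method names are made up for illustration; this is not an Amazon API, just one way to count net positive references, assuming a duplicate add is worth exactly one up vote.

```python
from collections import Counter


class WishList:
    """Hypothetical wish list that counts the owner's own up/down votes."""

    def __init__(self):
        self.votes = Counter()  # title -> net number of positive references

    def add(self, title):
        # Adding a book that is already listed counts as one more up vote.
        self.votes[title] += 1

    def vote(self, title, up=True):
        # Up arrow for a positive reference, down arrow for a negative one.
        self.votes[title] += 1 if up else -1

    def ranked(self):
        # Highest net vote count first; ties broken alphabetically.
        return sorted(self.votes, key=lambda t: (-self.votes[t], t))


wishlist = WishList()
wishlist.add("Book A")
wishlist.add("Book B")
wishlist.add("Book A")             # second add acts as an up vote
wishlist.vote("Book B", up=False)  # a negative review
print(wishlist.ranked())
```

Book A ends up first because it was referenced twice, while the negative review cancels out the single reference to Book B.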


The system should make it as easy as possible for you to add this
input. Not only when you are connected but, even more importantly, when
you are offline.

(c) Easy online/offline addition
Being able to add books from your cell phone, by text message, and even by recognition of bar codes
from pictures.


For the other criteria we need external information, and the most
natural source is people's ratings of books they've read. It also makes sense that
the to-read and already-read lists be integrated.

(2) The crowd at large
How popular is a book? Harder than it
seems. The book needs to be compared to other similar books (a
hard problem in itself), but its general popularity also needs to be taken into account.
There is also the difficulty of telling the difference between bad and controversial/not for everybody.
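
One possible way to separate "bad" from "controversial" is to look at the spread of the ratings and not only at their average. The sketch below assumes 1 to 5 star ratings; the describe function and both thresholds are arbitrary choices of mine for illustration, not something any existing site is known to use.

```python
from statistics import mean, pstdev


def describe(ratings, low=2.5, spread=1.5):
    """Label a book from its 1-5 star ratings (thresholds are arbitrary)."""
    avg = mean(ratings)
    dev = pstdev(ratings)  # population standard deviation of the ratings
    if dev >= spread:
        return "controversial"  # readers disagree strongly
    if avg < low:
        return "bad"            # readers agree that it is poor
    return "good"


print(describe([1, 2, 1, 2, 1]))  # consistently low ratings
print(describe([5, 1, 5, 1, 5]))  # love-it-or-hate-it ratings
```

A book with a mediocre average can still be a great pick for the right reader, which is why the spread is checked before the average here.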

(3) Friends and friends of friends
But actually not really. Your friends are usually not a perfect match for your taste and interests (not a bad thing at all), though there is sure to be some overlap.

(4) Recommendation
Your friends may not exactly share your taste but they know
you. They should be able to recommend books specifically for
you.
For example: you did not really like a book but you think that K will love it. Or a book
is a very good introduction; the material was way too easy for you and you
wasted your time, but P was looking for an introductory book.

(5) Semantic graphs
Other people that may share your interests but
are not your real-life friends. They are probably a better predictor of your
taste. Of course it is a hard problem to identify the relevant subgroups.

Bonus: Time clue
My interests go through cycles; it would be great if the list could take advantage of that.
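
To make the idea of an ordering function concrete, here is a minimal sketch that combines the criteria above into a single score with a weighted sum. Everything in it is an assumption on my part: the Signals fields, the idea that each signal has already been scaled to 0..1, and especially the weights, which would have to be tuned per reader.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Per-book signals, each assumed to be pre-scaled to the range 0..1."""
    own_votes: float        # (1) multiple hits by one's self
    crowd: float            # (2) the crowd at large
    friends: float          # (3) friends and friends of friends
    recommended: float      # (4) recommendations made specifically for you
    similar_readers: float  # (5) people who share your interests
    timely: float           # bonus: match with your current interests


# Hypothetical weights; nothing in this post says what they should be.
WEIGHTS = {
    "own_votes": 3.0,
    "crowd": 1.0,
    "friends": 1.0,
    "recommended": 2.0,
    "similar_readers": 2.0,
    "timely": 1.0,
}


def score(signals):
    # Weighted sum of all signals; higher means "read this sooner".
    return sum(w * getattr(signals, name) for name, w in WEIGHTS.items())


def prioritize(wish_list):
    # Turn an unordered mapping {title: Signals} into a priority list.
    return sorted(wish_list, key=lambda title: score(wish_list[title]), reverse=True)


wish_list = {
    "Book A": Signals(0.2, 0.9, 0.1, 0.0, 0.3, 0.1),
    "Book B": Signals(0.8, 0.4, 0.5, 1.0, 0.6, 0.7),
}
print(prioritize(wish_list))
```

A weighted sum is only the simplest choice; the hard part, as noted above, is computing each signal in the first place.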


I am sure that everybody who has tried to keep an annotated and rated
mirror version of their library online has discovered that it is tedious, even
with great tools like Delicious' iSight scanner abilities.

So the wish list needs to be useful even without the social part
and have enough value to be worth it by itself. Hopefully items a, b and
c would do the trick. It could also gently nag you, after a while, about books you
purchased (bonus: figure out how long it will take you
to read them).

This service would have to be integrated into a lot of other services. It would
have to be a Facebook app, use Amazon web services, etc.

How to monetize the service? Ads, a percentage of book sales, paid-for
suggestions?

The perfect rating function could also be an excellent complement to
the present /., reddit, Digg and co., because there is an overabundance of
good articles I want to read. It is reaching beyond my quick
browsing abilities; I need a way to prioritize them.