
Saturday, January 5, 2008

Flickr Analytics: The making of interestingness / SML Analytics

Dangeroos / 2002 / SML Graphic Design (by See-ming Lee 李思明 SML)

As part of an ongoing effort to document myself to better understand myself, I posted this design concept I created for Eric Roos' music business Dangeroos onto Flickr in August 2007.

The piece was designed back in 2002 as a quick sketch or mood board for the Dangeroos Web site. The identity Dangeroos is a play on Eric's last name, so the choice for Interstate (the typeface) was obvious. Red was chosen because it suggests danger. Wave morphs, color rectangles and circular compositions were dropped in to support the idea of sound.

Shortly afterward, I started a project called Flickr Analytics to analyze Flickr's interestingness algorithm. Because of that, I became fairly aware of each image's ranking over time. To my surprise, this design has stayed consistently within my top 20 most interesting images and, over the past few months, has reigned as the number one most interesting image on my entire Flickr stream.

What was even more interesting to me is that the image also drives a lot of traffic: more than 4,000 views within the last 5 months, which amounts to 25+ views a day. That's a lot for an image with no human element.

So when Flickr Stats launched on 2007-12-17, it was a godsend, for it enables me to analyze traffic and get a better understanding of where that traffic is coming from.

How to get to Flickr Stats

Flickr UI: Additional Information / 2008-01-04 / SML Screenshots (by See-ming Lee 李思明 SML)

In the bottom right corner of an image's page on Flickr, below the list of tags, is what I would call the utility area of the page. This is also where you can access the Photo stats of the image in question. Clicking on the link will bring you to a page similar to the one below:

Photo Stats

Flickr Stats for: Dangeroos, 2002, SML Graphic Design / 2008-01-04 / SML Data (by See-ming Lee 李思明 SML)

At this time, Flickr Stats only allows you to view detailed traffic information from the last 28 days, which is not as flexible as most site analytics tools, but is still much better than none at all.

The first time I saw this page I was stunned. Previously I had guessed that the Dangeroos design was popular because it was the first image in my 100 Most Interesting Design (set), which is why I had kept it on my Flickr homepage all this time. The data suggests otherwise. In fact, 54% of its traffic (1,041 visits) came from images.search.yahoo.com and 30% (572 visits) came from flickr.com.
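As a quick sanity check on those figures, here is a minimal Python sketch that works backward from each referrer's visit count and percentage to the total number of visits it implies. The visit counts and shares are the ones quoted above; the function name `implied_total` is my own, and because the percentages are rounded, the two estimates should agree only approximately.

```python
# Figures quoted in the post for the 28-day Flickr Stats window.
referrers = {
    "images.search.yahoo.com": {"visits": 1041, "share": 0.54},
    "flickr.com": {"visits": 572, "share": 0.30},
}


def implied_total(visits, share):
    """Return the total visit count implied by one referrer's visits and share."""
    return visits / share


# One estimate of the total per referrer; rounding in the published
# percentages means these will be close to each other but not identical.
totals = {
    domain: round(implied_total(row["visits"], row["share"]))
    for domain, row in referrers.items()
}

for domain, total in totals.items():
    print(f"{domain}: implies about {total} total visits")
```

Both estimates land a little above 1,900 visits, so the two percentages are consistent with each other.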

Really? What were people searching for on Yahoo? I clicked on the domain name and got to the referrer detail page for Yahoo Image Search:

Flickr Stats: Referrers for: Dangeroos, 2002, SML Graphic Design / 2008-01-04 / SML Data (by See-ming Lee 李思明 SML)

Apparently, I'm getting a lot of hits from Yahoo from people searching for graphic design. Flickr Stats has a nice feature which allows you to click on the keywords to go directly to the search results in question. This is how I noticed that on 2007-12-18, searching for graphic design on Yahoo put this piece among the first 5 results:

Yahoo Image Search: Graphic Design / 2007-12-18 / SML Screenshots (by See-ming Lee 李思明 SML)

Image Search Algorithm

If you think about how difficult it is to develop a useful text search algorithm, you can imagine how challenging it must be to create a good image search algorithm. Indeed, until very recently, Google Image Search relied on the image's file name alone to feed you results.

Aside from gaining a huge user base, the reason behind Yahoo's decision to buy Flickr is obvious: image tagging data.

Tagging is a voluntary act by the user: creating metadata makes organizing his collection of photos much easier. To the search engine, however, tagging is free metadata association. One strategy for deciphering whether the tags are accurate can go like this: each time someone searches for a term, say graphic design, my search engine will throw 20 images associated with that tag onto the search results page. An image that's more related to that search term will be more likely to be clicked on by someone searching for it.
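To make that strategy concrete, here is a minimal Python sketch of how a search engine might keep score. It is only an illustration of the idea, not how Yahoo or Flickr actually works: the class `TagClickTracker`, its method names, and the smoothing defaults are all my own assumptions. It counts how often an image is shown for a tag and how often it is clicked, then uses a smoothed click-through rate as the relevance estimate.

```python
from collections import defaultdict


class TagClickTracker:
    """Hypothetical tracker of impressions and clicks per (tag, image) pair."""

    def __init__(self):
        self.impressions = defaultdict(int)
        self.clicks = defaultdict(int)

    def record_results(self, tag, image_ids):
        """Record that these images were shown on a results page for `tag`."""
        for image_id in image_ids:
            self.impressions[(tag, image_id)] += 1

    def record_click(self, tag, image_id):
        """Record that a searcher for `tag` clicked on this image."""
        self.clicks[(tag, image_id)] += 1

    def relevance(self, tag, image_id, prior_clicks=1, prior_impressions=20):
        """Smoothed click-through rate for the image under this tag.

        The priors (assumed values) keep an image with very little data
        from looking perfectly relevant or perfectly irrelevant.
        """
        key = (tag, image_id)
        clicks = self.clicks.get(key, 0) + prior_clicks
        shown = self.impressions.get(key, 0) + prior_impressions
        return clicks / shown


tracker = TagClickTracker()
# Three searches for "graphic design", each showing the same two images.
for _ in range(3):
    tracker.record_results("graphic design", ["dangeroos", "unrelated-photo"])
# Two of the three searchers click the Dangeroos piece.
tracker.record_click("graphic design", "dangeroos")
tracker.record_click("graphic design", "dangeroos")

clicked = tracker.relevance("graphic design", "dangeroos")
ignored = tracker.relevance("graphic design", "unrelated-photo")
print(clicked > ignored)
```

Over many searches, images whose tag is accurate should drift upward in this score, while mis-tagged images that nobody clicks drift down.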

With time and patience, it would be possible for me to figure out which tags are valid and which tags are not. If people who search for the term then favorite or comment on that photo, it would mean that the image is more relevant (and thus "interesting") to that search term.

The same strategy can be applied to Flickr Groups. When you post an image to a particular group, the group is usually associated with certain keywords. When users click on one image among all the others, they are functioning as bots with very advanced algorithms, doing something that machines cannot yet do: telling the good images from the rest. I call humans participating in these activities BioBots.

The key in these systems is to identify the experts. Once you have collected enough data on a user and noticed, for example, that they have a degree in graphic design, work in the graphic design field, and perhaps are members of mostly graphic design groups, it may be fair to say that their opinions on graphic design matter more. You can thus have your algorithm give their opinions more weight, for the same reason that KOLs (key opinion leaders), in pharma talk, have their role on medical sites.
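Here is a minimal Python sketch of that weighting idea. Everything in it is an assumption for illustration: the profile fields (`degrees`, `professions`, `groups`), the function names, and the choice to add one point of weight per expertise signal are mine, not anything Flickr or Yahoo has published.

```python
def expertise_weight(profile, topic):
    """Return a hypothetical weight for a user's opinion on `topic`.

    Everyone starts at 1.0; each expertise signal adds 1.0 (assumed values).
    """
    weight = 1.0
    if topic in profile.get("degrees", ()):
        weight += 1.0
    if topic in profile.get("professions", ()):
        weight += 1.0
    groups = profile.get("groups", ())
    if groups:
        # "Mostly" is read here as more than half of the user's groups.
        share = sum(1 for group in groups if group == topic) / len(groups)
        if share > 0.5:
            weight += 1.0
    return weight


def weighted_score(actions, profiles, topic):
    """Sum user signals (clicks, favorites, comments) weighted by expertise.

    `actions` is a list of (user_id, signal_value) pairs.
    """
    return sum(
        value * expertise_weight(profiles.get(user_id, {}), topic)
        for user_id, value in actions
    )


profiles = {
    "designer": {
        "degrees": ["graphic design"],
        "professions": ["graphic design"],
        "groups": ["graphic design", "graphic design", "cats"],
    },
    "casual": {},
}
# Both users favorite the same image once.
actions = [("designer", 1), ("casual", 1)]
score = weighted_score(actions, profiles, "graphic design")
print(score)
```

In this sketch the designer's favorite counts four times as much as the casual user's, which is the whole point: the same action carries more information when it comes from someone who knows the field.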

©2008 See-ming Lee 李思明 SML / SML Pro Blog / SML Universe. All rights reserved.
