SML Search

Showing posts with label data-mining.

Saturday, August 22, 2009

Multiple Personas Disorder


Personas is an art installation by Aaron Zinman. It reveals the inner workings of data-mining technologies, and outputs a visualization of your Internet personas based on your first and last name.

Many sources have pointed out that although the experience was great, the piece failed to deliver consistent results. What most of them failed to note, however, is that inconsistency is exactly the message the creator is trying to deliver: although fortunes are sought through data-mining, this kind of data is far from infallible.

The site gave inconsistent results for me, too, so I decided to sample the results multiple times to see if an underlying pattern would emerge. In other words, I wanted to turn an inherent fault of the machine into useful and relatively accurate results through additional sampling. These are my results:


Multiple Personas Disorder / 2009 / SML (by See-ming Lee 李思明 SML)

As can be seen from the graph, although the results vary greatly from run to run, patterns still emerge across the multiple results.

While the proportions of the categories vary greatly, there are definitely patterns in the categories of online, books, sports, movies, social, and professional. I don't know how sports got such a big chunk of the graph, but my guess is that it has to do with my skydiving videos, which were very popular on YouTube a while back.

Things become a bit more interesting when I use a different variation of my name to perform the search. As my own experiments with Google have shown, Googling See-ming Lee vs Seeming Lee vs seeminglee can yield fairly different results. While the main hubs of my universe remain the same, the weaker nodes vary greatly. Until I manage to teach Google to treat all my identities as the same (more on this at another time), my online presence is determined mostly by how people prefer to spell my name.
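The repeated-sampling idea can be sketched in a few lines of Python. This is only an illustration of the approach, not how Personas itself works: the helper names (`summarize_runs`, `stable_categories`) are hypothetical, and the numbers are made up for the example rather than taken from my actual results. Each run is a mapping from category to its share of the chart. A category that shows up in every run is treated as part of the underlying pattern.

```python
from collections import defaultdict


def summarize_runs(runs):
    """Return {category: (mean_share, fraction_of_runs_present)}.

    A category missing from a run counts as a share of 0 for that run.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for run in runs:
        for category, share in run.items():
            totals[category] += share
            counts[category] += 1
    n = len(runs)
    return {
        category: (totals[category] / n, counts[category] / n)
        for category in totals
    }


def stable_categories(runs, min_presence=1.0):
    """List the categories that appear in at least min_presence of the runs."""
    summary = summarize_runs(runs)
    return sorted(
        category
        for category, (_, presence) in summary.items()
        if presence >= min_presence
    )


# Made-up shares for three hypothetical runs (not real Personas output).
example_runs = [
    {"online": 0.30, "sports": 0.40, "books": 0.20, "music": 0.10},
    {"online": 0.25, "sports": 0.35, "books": 0.40},
    {"online": 0.50, "sports": 0.20, "books": 0.20, "legal": 0.10},
]

print(stable_categories(example_runs))
```

Lowering `min_presence` (say, to 0.5) would also admit categories that appear in only some of the runs, which is one way to separate the strong hubs from the weaker nodes.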

Here is Personas' output for Seeming Lee (my name with no hyphen), and it reveals a more complex person:

Multiple Personas Disorder / 2009 / SML (by See-ming Lee 李思明 SML)

I think that these results reveal a more complete picture of who I am. The interesting bit is that my music, which was absent in my first attempt, becomes a much greater part of my overall persona makeup. This makes me wonder if it would be best for me to go by Seeming Lee instead of See-ming Lee, but the fact that the compact form also introduced an illegal attribute (whatever that means!) worries me a bit. It is useful to observe, however, that there are consistencies among all 10 searches: online, books, sports, social, and professional, so I'll accept that as generally a good thing.

All in all, the piece is fun to play with, and I recommend that you check it out at http://personas.media.mit.edu/personasWeb.html.

Additional information can be found at http://personas.media.mit.edu/

Saturday, October 6, 2007

Is more better?

Things should be made as simple as possible, but no simpler.
Albert Einstein

Is more better? Depends.

More data is better when you have the means to mine it into something usable. If you don’t have the means to mine the data, it will probably end up unusable.

But life, experience, and data-mining tools are all iterative. In other words, they will get better.

There will come a time when you have discovered a way to get at that data efficiently.

When that day comes, you will want data. There’s no reason not to collect data now.

SML Copyright Notice
Copyright 2007 See-ming Lee 李思明 SML / SML Ideas / SML Universe. All rights reserved.