Tuesday, March 6, 2012

Facebook's killing Feature, 'You May Know...'

Most, if not all, of the students in this class must know about one of the Facebook's killing features, 'You May Know..' This feature suggests a list of people to become friends with. I personally thought that the feature was indeed quite accurate, and wanted to know how Facebook determines which people to suggest. It was not so hard to find a research article written by a member from Facebook describing the algorithm behind the scene, and here I share it. Please, follow the attached link to find the original article. http://dl.acm.org/citation.cfm?id=1935914

Recommending a friend for a user had been studied even before Facebook's paper was published. One such example is “Make New Friends, but Keep the Old” – Recommending People on Social Networking Sites, which can be found at http://dl.acm.org/citation.cfm?id=1518735

 We find that the methods used in both of the papers share two main themes: 1) recommendation based on personal information/interactions, such as the field of interest in his/her profile, the types of postings, the frequency of chat between the two, and etc, and the 2) social network structure centered at the user to suggest friends. In Facebook's paper, these two notions are naturally implemented in a simple, beautiful way. The key idea is well explained in the top right corner in the page 3 of the original paper, which I repeat here in my own words.


The algorithm is called ''Supervised Random Walks,'' and the basic idea is to construct the measure of closeness for each person, who is not yet a friend, by randomly traveling in the friendship network beginning from the person to whom Facebook will suggest friends. This process is the same as what we have learned in calculating PageRank for a given web structure. In fact, as we were finding pages that were more important than the others, this random walking process provides us which person is more important/close to the origin user. On top of this idea, Facebook adds more sophisticated approach. That is, rather than assuming equal probabilities for all links going out from a node, the algorithm assigns different weights to each connected link, and the weights are determined by the personal information/interaction of the two users. In practice, the function that gives the weight from the personal information/interaction can be learned by previous data.


http://dl.acm.org/citation.cfm?id=1935914
http://dl.acm.org/citation.cfm?id=1518735

No comments:

Post a Comment