Thursday, January 19, 2012

Four Degrees of Facebook

“To demonstrate that people on Earth today are much closer than
ever, a member of the group suggested a test. He offered a bet that we
could name any person among earth’s one and a half billion inhabitants
and through at most five acquaintances, one of which he knew personally,
he could link to the chosen one”
-- from Frigyes Karinthy's "Chains" [1], 1929

It's quite amazing how in 1967, Stanley Milgram found evidence of the small world phenomenon---that people are connected via six degrees of separation. This was discovered in a world without Facebook, by having people passing a single postcard to someone they knew, who would then repeat the process until it got to a specific stock broker in Boston. Specifically Milgram told them,
"IF YOU DO NOT KNOW THE TARGET PERSON ON A PERSONAL BASIS, DO NOT TRY TO CONTACT HIM DIRECTLY. INSTEAD, MAIL THIS FOLDER (POSTCARDS AND ALL) TO A PERSONAL ACQUAINTANCE WHO IS MORE LIKELY THAN YOU TO KNOW THE TARGET PERSON.
You may send the folder to a friend, relative or acquaintance, but it must be someone you know on a first name basis." [1]
Most of us have probably been familiar with the small world phenomenon for a while, whether through Erdos numbers, six degrees of Kevin Bacon, or simply going to Disneyland enough times, that we've been convinced it's a small world after all. But imagine what it would have been like when Milgram first published his paper. In fact Milgram said he "asked a person of intelligence how many steps he thought it would take, and he said that it would require 100 intermediate persons, or more, to move from Nebraska to Sharon.” [1]

However, what's more surprising is that Milgram isn't the first to come up with this phenomenon. As seen from the epitaph above, Karinthy speculated it in 1929 (almost forty years before Milgram), nailing the exact number of people needed (either five or six), which is very close to Milgram's result that around 5.7 people were needed to link a randomly chosen person from Nebraska to a stock broker in Boston. But it's not even clear if Karinthy believed it was possible, but his fictional character did. Perhaps that is why Karinthy's prediction was completely ignored---it was fiction, and in fact, a short story that cannot readily be found today, at least in the English language.

Today, things are different. We don't need to make fictional characters to find out how many degrees of separation are between two people, nor do we need people mailing post cards. We just need enough people to join a social networking site, and good algorithms that extract shortest paths from such a site. Facebook recently published a paper that probably gives some of the most convincing data on six degrees of separation yet, so convincing that they've changed the name to "four degrees of separation." To be more precise, using a network with around 721 million users and 69 billion links, they've found that the average number of "intermediaries" needed to get from any Facebook user to any other is 3.74. [2] This is an important result, because up until now, very few large scale analyses of degrees of separation has been conducted, however we must analyze this result more carefully.

The first thing we must note is that Facebook is not the same as the world. This experiment is fundamentally different from Milgram's on many levels. First of all, Milgram did not use any algorithm to find the shortest path between two individuals, he had them find it themselves! Almost surely (not in the mathematical sense however) people did not actually find the shortest path, however they found pretty short paths. First of all, only 22% of the participants in Milgram's experiment actually found a path from themselves to the target (by mailing postcards). What's interesting is that the average path by people who lived in Boston had 4.4 intermediaries as opposed to 5.7 intermediaries from people living in Nebraska, who may have not known anyone in Boston. This is quite fascinating, since the difference is really only one person, so maybe people who live in Nebraska are very likely to know someone in Boston. Another interesting result that Milgram found was that stock brokers in Nebraska were about as far from the target (a stock broker) as random Nebraskans. This is quite surprising, and I don't expect the result to hold today, do to the vast amount of networking among people of similar professions, especially stock brokers. But again, the most surprising thing is that people can find a very short path (perhaps only using two more people than the shortest, given the Facebook result), simply by taking a local action (i.e. people only choose who to send their post card to without knowing anything about the structure of the network besides who they know, and who they think their friends might know). The Facebook paper remarks that recent studies have been conducted to test the route through social networks (as they did in the Milgram study) but they are still quite preliminary. Any results in how effective people are in routing would be very interesting. We had some experience with this, when trying to find the shortest path between two pages on Wikipedia in the first homework. I imagine people would be much better at finding the shortest path between two pages of Wikipedia than finding the shortest social path between themselves and some celebrity, since they know a lot more about the structure of the Wikipedia network, simply by having knowledge, than they do about the structure of a social network. Indeed, if we see the shortest path between Wikipedia pages, we are likely to recognize, most if not all of the links in many cases, whereas if we see the shortest path between us and someone else, we probably have no idea who many of the middlemen are. It would be very interesting to quantify this difference.

A second observation is that Facebook friends are not the same as real friends. In fact, one of the authors of our textbook Jon Kleinberg, remarked about the recent study that, "It’s the weak ties that make the world small," [3] But he also said, "News spreads well on weak ties. Those people I met on vacation, if they send me some cool news, I might send that to my friends. If they send me something about a protest movement, I might not." Perhaps it's not right to say that Facebook simply gives us fake connections, but rather it increases our connections. We might be on a first name basis with people today, whom if we met in the 1960s, we may have met but forgotten about soon after meeting them. I speculate Facebook makes us more social, in the sense that it increases our friendship degree, even if it makes our average friendship much weaker. One final thing to note is that there is no clear notion of a "Facebook friend." I may have less friends than someone else on Facebook, not because I am less social, but because I don't add everyone I've ever seen, let alone people on forums. I don't know how much this affects the results of the study, but I imagine it affects things like degree distribution in weird ways.

Finally, while the paper's focus is on degrees of separation, they present some other fascinating results. One interesting result is that Facebook's spid, or the ratio of the variance to the mean of its distance distribution, is 0.09, which is significantly smaller than the spid of the web graph, which is larger than 1 [2]. I am not sure what to draw from this, but it suggests there is some inherent difference between the distance distribution of the two networks. Another very interesting result is the percentage of reachable pairs in different subgraphs of Facebook. In 2007 in Italy only 0.04 percent of pairs were reachable, meaning the graph consisted of a lot of isolated components! But in 2008 25.54% were reachable. Likewise in Sweden, it went from 10.23% to 93.90% between the two years [2]. This shows how suddenly the properties of a graph can change, simply by adding more nodes, somewhat reminiscent of the thresholds we saw in class. The diameters of subgraphs of Facebook are also interesting. In the United States, the diameter is 30, and in the world, the diameter is 41 [2], which is still considerably small considering you might expect some people might only have a few friends on Facebook, who also only have a few friends, and so on until they get to the outside world.

In conclusion, it is clear that we are understanding a lot more about the distance between people in social networks due to huge networks like Facebook, and advanced algorithms that can deal with such data. However, there is a lot more to be said, and probably many more results we can get from data sets like the Facebook social network. For example, it might be interesting to find the distance of people on Facebook using only "close friends," which is presumably much larger. But regardless, I envision this study is simply a precursor to many more trying to analyze how close people really are, and how well people can find each other in a network each individual knows barely anything about.

References
[1] http://www.barabasilab.com/LinkedBook/chapters/3Ch_SixDegreesofSeparation.pdf
[2] http://arxiv.org/pdf/1111.4570v3
[3] http://www.nytimes.com/2011/11/22/technology/between-you-and-me-4-74-degrees.html?_r=1

2 comments:

  1. Very nicely researched post... Actually, we'll be talking about Milgram and lots of things related to this in Wednesday's lecture, so this is good timing!

    ReplyDelete
  2. Amazing post! I specifically like analysis of weak vs. strong ties because I wonder what types of networks have a lot of unconnected clusters (and if a strong ties only network is one of them). According to Wikipedia though (http://en.wikipedia.org/wiki/Interpersonal_ties), networks of strong ties may not even be as important as networks of weak ties. The article mentions the influential paper The Strength of Weak Ties , which talks about how people more often land jobs through acquaintances. If disease, news, and opportunities spread throughout weak ties, what is unique to strong ties? A "multiplexity" of interests? Beliefs? Goals?

    ReplyDelete