Wednesday, February 1, 2012

The Spread of a Quora Post

Edwin Chen, a data scientist at Twitter and former MIT student of math and linguistics, recently authored a blog post tracking the spread of a Quora post and analyzing the number and timing of "upvotes" that could be traced to topic followers versus followers of previous voters. The blog post is essentially a case study of one of his own Quora posts, which is quite an outlier from a "typical" Quora post: it has almost 200 upvotes. While this limits its usefulness for drawing general conclusions, it is still an interesting demonstration both of some of the "universal" properties of networks that we have recently studied in class and of the power of social networks in particular.

The first task he tackles is creating graphs of the relationships among topics and among users. The first is a graph of topics, with edges representing mutual follows. It shows a single strongly connected component with visibly high clustering (by observation; we don't have the raw data to compute this) and small diameter, although the graph is small enough that these properties are less meaningful than they would be in, say, the web graph. The graph does show something of particular interest once the author manually categorizes the topics: certain clusters are extremely densely connected internally, with fewer edges linking them to other clusters to form the giant SCC.
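
Since we don't have the raw data, the following is only a minimal sketch of how the two properties eyeballed above (clustering and diameter) could be computed if we did. The topic names and edges are made up for illustration and are not taken from the author's graph; the helper names are our own as well.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy data: an edge means the two topics are mutually followed.
edges = [
    ("Machine Learning", "Data Mining"),
    ("Machine Learning", "Statistics"),
    ("Data Mining", "Statistics"),
    ("Statistics", "Mathematics"),
    ("Mathematics", "Linguistics"),
]

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

def eccentricity(adj, source):
    """BFS from source; return (longest shortest-path length, nodes reached)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values()), len(dist)

def diameter(adj):
    """Longest shortest path between any two nodes of a connected graph."""
    best = 0
    for n in adj:
        ecc, reached = eccentricity(adj, n)
        if reached != len(adj):
            raise ValueError("graph is not connected")
        best = max(best, ecc)
    return best

adj = build_adjacency(edges)
print("average clustering:", round(average_clustering(adj), 3))
print("diameter:", diameter(adj))
```

Running a BFS from every node is fine at the scale of the topic graph in the post; a much larger graph would call for sampling or a dedicated graph library.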

Mutually Followed Topics Graph
The second graph shows users, with an edge wherever at least one of the two users follows the other. Here we see the same properties mentioned above, but there are now many more nodes of degree one or two connected to more central nodes. If we had the raw data to analyze, preferably in larger quantity, this observation would likely show up as a heavy-tailed degree distribution, consistent with the class's discussion of "universal" properties. The other important observation is the prominence of "hubs," or users with high degree, who are the likely culprits in assisting the spread of the post. It's too bad the second graph is not drawn as a directed graph, as that would allow us to draw more conclusions about in-degree versus out-degree for these hubs and about their relationships with the degree-one nodes.
Followed User Graph
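
As a rough illustration of what a directed version of this graph would let us measure, here is a small sketch that tallies in-degree (followers) and out-degree (followees), builds a degree histogram, and picks out hubs. The user names, the edge list, and the follower threshold are all hypothetical; none of it comes from the author's data.

```python
from collections import Counter

# Hypothetical directed follow edges, written as (follower, followee).
follows = [
    ("bob", "alice"), ("carol", "alice"), ("dave", "alice"),
    ("erin", "alice"), ("alice", "frank"), ("erin", "frank"),
    ("grace", "frank"),
]

def degree_counts(follows):
    """Return (in_degree, out_degree) counters keyed by user."""
    in_deg, out_deg = Counter(), Counter()
    for follower, followee in follows:
        out_deg[follower] += 1   # the follower follows one more user
        in_deg[followee] += 1    # the followee gains one follower
    return in_deg, out_deg

def degree_histogram(follows):
    """Map total degree -> number of users with that degree."""
    in_deg, out_deg = degree_counts(follows)
    total = in_deg + out_deg
    return Counter(total.values())

def hubs(in_deg, min_followers):
    """Users with at least min_followers followers, sorted by name."""
    return sorted(u for u, d in in_deg.items() if d >= min_followers)

in_deg, out_deg = degree_counts(follows)
print("degree histogram:", dict(degree_histogram(follows)))
print("hubs:", hubs(in_deg, min_followers=3))
```

In-degree is the quantity of interest for spread, since a user's upvote is shown to the people who follow them. With real data, a heavy tail would show up as a histogram in which most users sit at degree one or two while a handful reach much higher values.
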
The author concludes by providing visualization and analysis of the actual social dynamics governing the spread of the post through the above-mapped networks. It quickly becomes apparent that while the topic graph may be important in helping define the structure of the user graph (i.e., a user follows another user because of their expertise on an interesting topic), the social network is much more important in determining the spread of this particular post. Specifically, the author demonstrates, with both a very well-crafted video visualization and a time-based histogram of upvotes, that certain hubs in the user graph quickly bring in large numbers of votes, and that many of the users notified through the social network were not even following the initial topics. The number of upvotes that closely followed the posting of the answer, which would have immediately triggered the post's syndication to topic followers, pales in comparison to the number that followed a vote by any of the social hubs.
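
To make the topic-versus-social comparison concrete, here is one simple way such an attribution could be done. This is our own simplified sketch, not necessarily the author's method: an upvote is labeled "social" if the voter follows someone who voted earlier, "topic" if the voter follows one of the post's topics, "both" if both hold, and "unknown" otherwise. All names, timestamps, and follow relationships are hypothetical, and the sketch ignores other routes such as following the answer's author.

```python
# Hypothetical inputs, for illustration only.
upvotes = [  # (voter, minutes after the answer was posted)
    ("alice", 0), ("bob", 5), ("carol", 7), ("dave", 30), ("erin", 31),
]
follows = {  # user -> set of users they follow
    "alice": set(),
    "bob": {"alice"},
    "carol": {"bob"},
    "dave": set(),
    "erin": {"zed"},
}
topic_followers = {"alice", "dave"}  # users following one of the post's topics

def classify_upvotes(upvotes, follows, topic_followers):
    """Label each upvote by the route through which the voter could have seen the post."""
    earlier_voters = set()
    labels = {}
    for voter, _minutes in sorted(upvotes, key=lambda item: item[1]):
        # Social route: the voter follows at least one earlier voter.
        via_social = bool(follows.get(voter, set()) & earlier_voters)
        # Topic route: the voter follows one of the post's topics.
        via_topic = voter in topic_followers
        if via_social and via_topic:
            labels[voter] = "both"
        elif via_social:
            labels[voter] = "social"
        elif via_topic:
            labels[voter] = "topic"
        else:
            labels[voter] = "unknown"
        earlier_voters.add(voter)
    return labels

labels = classify_upvotes(upvotes, follows, topic_followers)
for voter, label in labels.items():
    print(voter, label)
```

Counting the labels per time bucket would give a rough analogue of the time-based histogram described above, with the caveat that following an earlier voter only shows that the social route was possible, not that it was the one actually taken.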

One potential conclusion from this data is that social networks are particularly powerful compared with other kinds of networks. It is, of course, also possible that this is an artifact of the way Quora is architected (perhaps certain kinds of activity are more prominent in the feed or in notifications) or of this particular post and its voters. The author raises some of these open questions at the end of the blog post, and given the general content of his blog over time, we may very well see answers to these and other questions about social network structure in the near future.

