Wednesday, January 18, 2012

A Coincidental Follow-up: How Memes Propagate Through the Internet

Just two decades ago, prior to the internet, to find out the results of a presidential debate one had to either watch the debate live, hear about it from a friend, learn the results through a TV or radio report, or read about it in the newspaper the following day. Because there were fewer news sources, it was easier to dispel rumors and trace any misquotes back to their original sources. Now, however, the social media landscape allows for instant, broad dissemination of information from a debate. Particularly controversial statements will wind their way through the internet even faster, as people quickly express their amazement, awe and criticism through blogging. Throughout this process of posts and re-posts, an original quotation is bound to morph as people occasionally misquote it. Tracing the evolution of a quotation throughout the internet sheds light on the structure and inter-connectivity of news websites, blogs, and social media sites. In their 2009 paper “Meme-tracking and the Dynamics of the News Cycle”, a team at Cornell University headed by Jure Leskovec developed a method for examining the path of these “memes”.

The researchers employed a novel method to track the spread of memes throughout a set of 90 million news articles and blog posts from August to November 2008. They first identified very short phrases that remained intact as they spread. This set of distinct phrases varied heavily from day to day, and was a good indicator of memes that could be traced. They then developed algorithms that were able to extract all variants of these phrases. The algorithms involved building phrase graphs, which are directed graphs that have phrases as nodes and a directed edge from one node to another if the first phrase was a direct predecessor of the second. The primary criterion for placing a directed edge between two nodes is whether one phrase is an excerpt of the other. The phrase graph was then partitioned into connected components, each with a single root node that represented the origin of the meme.
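To make the phrase-graph idea concrete, here is a minimal sketch in Python. It is not the researchers' actual algorithm. It assumes that "excerpt" means an exact, contiguous run of words, that edges point from the shorter phrase to the longer phrase that contains it, and that each phrase is grouped under the longest phrase it can reach. The function names and the sample phrases are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def is_excerpt(short, long):
    """True if the words of `short` appear contiguously inside `long`."""
    s, l = short.split(), long.split()
    if len(s) >= len(l):
        return False
    return any(l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1))


def build_phrase_graph(phrases):
    """Map each phrase to the longer phrases that contain it.

    Assumption: an edge runs from the excerpt to the containing phrase.
    """
    edges = {p: [] for p in phrases}
    for p in phrases:
        for q in phrases:
            if is_excerpt(p, q):
                edges[p].append(q)
    return edges


def assign_to_roots(edges):
    """Group phrases under a single root (a phrase with no outgoing edge).

    Simplifying assumption: each phrase follows its longest containing
    phrase until it reaches a phrase that nothing else contains.
    """
    def root_of(p):
        while edges[p]:
            p = max(edges[p], key=lambda q: len(q.split()))
        return p

    clusters = defaultdict(list)
    for p in edges:
        clusters[root_of(p)].append(p)
    return dict(clusters)


# Hypothetical sample phrases, not taken from the study's data.
phrases = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox",
    "quick brown fox",
    "lazy dog",
    "hello world",
]
clusters = assign_to_roots(build_phrase_graph(phrases))
for root, members in clusters.items():
    print(root, "<-", members)
```

In this toy run, the three shorter fox and dog phrases end up in the same group as the full sentence, while "hello world" forms its own group. Real quotations mutate rather than simply shrink, so a practical version would need approximate matching instead of exact word-for-word containment.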

When analyzing their results, Leskovec et al. were able to draw some interesting conclusions about the dynamics of how information spreads. There were distinctive peaks in the number of mentions of a meme on both blog and news sites. However, the peak attention from news sources occurred around two and a half hours before the peak attention of that phrase from bloggers. On either side of these peaks, the volume of mentions declined exponentially. Smaller peaks also showed oscillation between blogs and news sites, as the two genres of sources played off each other in mentioning a phrase.
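The lag between the two peaks can be illustrated with a small sketch. It assumes mentions have already been counted into equal-width time bins for each type of source. The counts below are made-up toy numbers, not data from the paper, and the helper names are hypothetical.

```python
def peak_index(counts):
    """Index of the time bin with the most mentions (first one on ties)."""
    return max(range(len(counts)), key=counts.__getitem__)


def peak_lag_hours(news_counts, blog_counts, bin_hours=1.0):
    """Hours by which the blog peak trails the news peak.

    A positive value means the blogs peaked after the news sites.
    """
    return (peak_index(blog_counts) - peak_index(news_counts)) * bin_hours


# Toy hourly mention counts, invented purely for illustration.
news = [2, 10, 40, 90, 45, 20, 10, 5, 2]
blogs = [1, 3, 10, 25, 50, 70, 85, 40, 15]

print(peak_lag_hours(news, blogs))
```

Comparing only the single highest bin is the crudest possible measure. With noisy real data, one would likely smooth the series or aggregate over many phrases before comparing peaks.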

Now, with another election season upon us, political candidates are again in the spotlight. Their statements will spread across the internet through news articles, blog posts, Facebook statuses, and Twitter updates. In the past four years, the internet has become more connected, as these websites have grown in popularity and more of the campaign trail is captured online. What would a similar meme analysis of a political phrase look like today?

