Virality Prediction and Community Structure in Social Networks
SUBJECT AREAS: COMPUTATIONAL SCIENCE; STATISTICAL PHYSICS, THERMODYNAMICS AND NONLINEAR DYNAMICS
Lilian Weng, Filippo Menczer & Yong-Yeol Ahn
Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, IN 47408, USA.
STATISTICS; INFORMATION THEORY AND COMPUTATION
Received 24 April 2013; Accepted 5 August 2013; Published 28 August 2013
Correspondence and requests for materials should be addressed to Y.-Y.A. (yyahn@indiana.edu)
How does network structure affect diffusion? Recent studies suggest that the answer depends on the type of
contagion. Complex contagions, unlike infectious diseases (simple contagions), are affected by social
reinforcement and homophily. Hence, the spread within highly clustered communities is enhanced, while
diffusion across communities is hampered. A common hypothesis is that memes and behaviors are complex
contagions. We show that, while most memes indeed spread like complex contagions, a few viral memes
spread across many communities, like diseases. We demonstrate that the future popularity of a meme can be
predicted by quantifying its early spreading pattern in terms of community concentration. The more
communities a meme permeates, the more viral it is. We present a practical method to translate data about
community structure into predictive knowledge about what information will spread widely. This connection
contributes to our understanding in computational social science, social media analytics, and marketing
applications.
Diseases, ideas, innovations, and behaviors spread through social networks1–12. With the availability of large-scale, digitized data on social communication13,14, the study of diffusion of memes (units of transmissible
information) has become feasible recently15–18. The questions of how memes spread and which will go viral
have recently attracted much attention across disciplines, including marketing6,19, network science20,21, communication22, and social media analytics23–25. Network structure can greatly affect the spreading process15,26,27; for
example, infections with small spreading rate persist in scale-free networks8. Existing research has attempted to
characterize viral memes in terms of message content22, temporal variation16,24, influential users19,28, finite user
attention18,21, and local neighborhood structure10. Yet what determines the success of a meme, and how a meme
interacts with the underlying network structure, remain elusive. A simple, popular approach to studying meme
diffusion is to consider memes as diseases and apply epidemic models3,4. However, recent studies demonstrate
that diseases and behaviors spread differently; they have therefore been referred to as simple versus complex
contagions, respectively9,29.
Here we propose that network communities30–32, strongly clustered groups of people, provide a unique
vantage point on the challenge of predicting viral memes. We show that (i) communities allow us to estimate
how much the spreading pattern of a meme deviates from that of infectious diseases; (ii) viral memes tend to
spread like epidemics; and finally (iii) we can predict the virality of memes based on early spreading patterns in
terms of community structure. We employ the popularity of a meme as an indicator of its virality; viral memes
appear in a large number of messages and are adopted by many people.
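As a rough illustration of what describing an early spreading pattern in terms of community structure might involve, the Python sketch below summarizes how concentrated a meme's first adoptions are across communities. The function name, the input format (one community label per early adoption), and the three summary statistics are assumptions made for illustration; they are not presented as the measures defined in this paper.

```python
import math
from collections import Counter


def community_concentration(adopter_communities):
    """Summarize how concentrated early adoptions are across communities.

    adopter_communities: non-empty list with one (hypothetical) community
    label per early adoption of a meme.
    Returns (number of communities reached,
             share of adoptions in the largest community,
             Shannon entropy of the community shares, in bits).
    """
    if not adopter_communities:
        raise ValueError("need at least one adoption")
    counts = Counter(adopter_communities)
    total = sum(counts.values())
    shares = [c / total for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in shares)
    return len(counts), max(shares), entropy


# Two toy memes with ten early adoptions each (made-up labels).
concentrated = ["a"] * 9 + ["b"]            # mostly trapped in one community
spread = ["a", "b", "c", "d", "e"] * 2      # evenly spread over five

print(community_concentration(concentrated))
print(community_concentration(spread))
```

Under this toy summary, a meme that permeates more communities shows more communities reached, a smaller dominant share, and a higher entropy.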
Community structure has been shown to affect information diffusion, including global cascades33,34, the speed
of propagation35, and the activity of individuals36,37. One straightforward effect is that communities are thought
to hinder global spread because they act as traps for random flows35,36 (Fig. 1(A)). Yet, the causes and
consequences of the trapping effect have not been fully understood, particularly when structural trapping is
combined with two important phenomena: social reinforcement and homophily. Complex contagions are sensitive to social reinforcement: each additional exposure significantly increases the chance of adoption. Although the
notion is not new38, it was only recently confirmed in a controlled experiment9. A few concentrated adoptions
inside highly clustered communities can induce many multiple exposures (Fig. 1(B)). The adoption of memes
within communities may also be affected by homophily, according to which social relationships are more likely to
form between similar people39,40. Communities capture homophily as people sharing similar characteristics
naturally establish more edges among them. Thus we expect similar tastes among community members, making
people more susceptible to memes from peers in the same community (Fig. 1(C)). Straightforward examples of
homophilous communities are those formed around language or culture (Fig. 1(D,E)); people are much more
likely to propagate messages written in their mother tongue. Separating social contagion and homophily is
difficult41,42, and we interpret complex contagion broadly to include
homophily; we focus on how both social reinforcement and homophily effects collectively boost the trapping of memes within dense
communities, not on the distinctions between them.
SCIENTIFIC REPORTS | 3 : 2522 | DOI: 10.1038/srep02522
www.nature.com/scientificreports
Figure 1 | The importance of community structure in the spreading of social contagions. (A) Structural trapping: dense communities with few outgoing
links naturally trap information flow. (B) Social reinforcement: people who have adopted a meme (black nodes) trigger multiple exposures to others (red
nodes). In the presence of high clustering, any additional adoption is likely to produce more multiple exposures than in the case of low clustering,
inducing cascades of additional adoptions. (C) Homophily: people in the same community (same color nodes) are more likely to be similar and to adopt
the same ideas. (D) Diffusion structure based on retweets among Twitter users sharing the hashtag #USA. Blue nodes represent English users and red
nodes are Arabic users. Node size and link weight are proportional to retweet activity. (E) Community structure among Twitter users sharing the hashtags
#BBC and #FoxNews. Blue nodes represent #BBC users, red nodes are #FoxNews users, and users who have used both hashtags are green. Node size is
proportional to usage (tweet) activity, links represent mutual following relations.
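The social reinforcement argument (Fig. 1(B)) can be made concrete with a small counting exercise: given the same number of adopters, a highly clustered neighborhood exposes more non-adopters several times over than a sparse one does. The Python sketch below is a toy illustration under stated assumptions; the two example graphs, the function names, and the threshold of two exposures are hypothetical choices and are not taken from the paper.

```python
from collections import Counter


def exposure_counts(adjacency, adopters):
    """Count, for every non-adopter, how many of its neighbors have adopted.

    adjacency: dict mapping each node to the set of its neighbors.
    adopters:  set of nodes that have already adopted the meme.
    """
    counts = Counter()
    for u in adopters:
        for v in adjacency[u]:
            if v not in adopters:
                counts[v] += 1
    return dict(counts)


def multiply_exposed(adjacency, adopters, minimum=2):
    """Return the sorted non-adopters with at least `minimum` adopting neighbors."""
    counts = exposure_counts(adjacency, adopters)
    return sorted(node for node, k in counts.items() if k >= minimum)


# High clustering: five people who all know each other (a clique).
clique = {i: {j for j in range(5) if j != i} for i in range(5)}

# Low clustering: two connected adopters whose other contacts do not overlap.
sparse = {0: {1, 2, 3}, 1: {0, 4, 5}, 2: {0}, 3: {0}, 4: {1}, 5: {1}}

adopters = {0, 1}
print(multiply_exposed(clique, adopters))  # nodes reached by both adopters
print(multiply_exposed(sparse, adopters))  # nobody is exposed twice here
```

In the clique every remaining member is exposed by both adopters, whereas in the sparse graph each non-adopter receives a single exposure; under a complex contagion, only the former setting benefits from reinforcement.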
To examine and quantify the spreading patterns of memes, we
analyze a dataset collected from Twitter, a micro-blogging platform
that allows millions of people to broadcast short messages (‘tweets’).
People can ‘follow’ others to receive their messages, forward
(‘retweet’, or ‘RT’ for short) tweets to their own followers, or mention
(‘@’ for short) others in tweets. People often label tweets with topical
keywords (‘hashtags’). We consider each hashtag as a meme.
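To make the Twitter conventions above concrete, the following Python sketch pulls hashtags, mentions, and a retweet marker out of a tweet's text with regular expressions. It is a minimal illustration, not the authors' data pipeline: the patterns, the lowercasing of hashtags, the `parse_tweet` name, and the sample tweet are all assumptions.

```python
import re

# Simplified patterns (assumptions): real tweets can contain characters
# and edge cases these expressions do not cover.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
RETWEET = re.compile(r"^RT\s+@(\w+)")


def parse_tweet(text):
    """Extract the hashtags (memes), mentions, and retweeted user from a tweet."""
    retweet = RETWEET.match(text)
    return {
        # Lowercasing treats #USA and #usa as one meme; this is an assumption.
        "hashtags": [tag.lower() for tag in HASHTAG.findall(text)],
        # The retweeted user also appears here, because 'RT' is followed by '@'.
        "mentions": MENTION.findall(text),
        "retweet_of": retweet.group(1) if retweet else None,
    }


sample = "RT @alice: Election night coverage #USA #vote"  # made-up tweet
print(parse_tweet(sample))
```

Each extracted hashtag would then count as one use of the corresponding meme by the tweeting user.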
Results
Communities and communication volume. Do memes sp (...truncated)