Dynamics of Conflicts in Wikipedia
Citation: Yasseri T, Sumi R, Rung A, Kornai A, Kertesz J (
Taha Yasseri
Robert Sumi
András Rung
András Kornai
János Kertész
Attila Szolnoki, Hungarian Academy of Sciences, Hungary
1 Department of Theoretical Physics, Budapest University of Technology and Economics, Budapest, Hungary, 2 Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary
In this work we study the dynamical features of editorial wars in Wikipedia (WP). Based on our previously established algorithm, we build up samples of controversial and peaceful articles and analyze the temporal characteristics of the activity in these samples. On short time scales, we show that there is a clear correspondence between conflict and burstiness of activity patterns, and that memory effects play an important role in controversies. On long time scales, we identify three distinct developmental patterns for the overall behavior of the articles. We are able to distinguish cases eventually leading to consensus from those cases where a compromise is far from achievable. Finally, we analyze discussion networks and conclude that edit wars are mainly fought by few editors only.
New media such as the internet and the web enable entirely new
ways of collaboration, opening unprecedented opportunities for
handling tasks of extraordinary size and complexity. Such
collaborative schemes have already been used to solve challenges
in software engineering [1] and mathematics [2]. Understanding
the laws of internet-based collaborative value production is of
great importance.
Perhaps the most prominent example of such value production
is Wikipedia (WP), a free, collaborative, multilingual internet
encyclopedia [3]. WP evolves without the supervision of a
preselected expert team; its volunteer editors define the rules and
maintain the quality. WP has grown beyond other encyclopedias
both in size and in use, having unquestionably become the number
one reference in practice. Although criticism has been
continuously expressed concerning its reliability and accuracy, partly
because the editorial policy is in favor of consensus over credentials
[4], independent studies have shown that, as early as in 2005,
science articles in WP and Encyclopedia Britannica were of
comparable quality [5]. As every edit and discussion post is saved
and available, WP is particularly well suited to study
internet-based collaborative processes. Indeed, WP has been studied
extensively from different aspects including the growth of content
and community [6,7], coverage [8,9] and evolution of the
hyperlink networks [10–14], the extraction of semantic networks
[15–17], linguistic studies [18–20], user reputation [21] and
collaboration quality [22,23], vandalism detection [24–26], and
the social aspects of the editor community [27–32].
Usually, different editors constructively extend each other's text and
correct minor errors and mistakes until a consensual article
emerges; this is the most natural, and by far the most common,
way for a WP entry to be developed [33]. Good examples include
(WP articles will be cited in typewriter font throughout the text)
Benjamin Franklin, Pumpkin or Helium. As we shall see, in the
English WP close to 99% of the articles result from this rather
smooth, constructive process. However, the development of WP
articles is not always peaceful and collaborative; there are
sometimes heavy fights, called "edit wars", between groups
representing opposing opinions. Schneider et al. [34] estimated that in the
English WP, among the highly edited or highly viewed articles
(these notions are strongly correlated, see [35]), about 12% of
discussions are devoted to reverts and vandalism, suggesting that
the WP development process for articles of major interest is highly
contentious. The WP community has created a full system of
measures to resolve conflict situations, including the so-called
"three revert rule" (see Wikipedia:Edit warring), locking articles
for non-registered editors, tagging controversial articles, and
temporary or permanent banning of malevolent editors. It is against this
rich backdrop of explicit rules, explicit or implicit regulations, and
unwritten conventions that the present paper undertakes to
investigate a fundamental part of collaborative value
production: how conflicts emerge and get resolved.
The first order of business is to construct an automated
procedure to identify controversial articles. For a human reader
the simplest way to do so is to go to the discussion (talk) pages of
the articles, which often show the typical signatures of conflicts as
known from social psychology [36]. The length of the discussion
page could already be considered a good indicator of conflict: the
more severe the conflict, the longer the talk page is expected to be
(this will be shown in detail later). However, this feature is very
language dependent: while conflicts are indeed fought out in detail
on discussion pages in the English WP, German editors do not use
this vehicle for the same purpose. Moreover, there are WPs, e.g.
the Hungarian one, where discussion pages are always rather
sparse, rarely mentioning the actual arguments. Clearly the
discussion page alone is not an appropriate source to identify
conflicts if we aim at a general, multi-lingual, culture-independent
indicator.
Figure 1. Revert and mutual revert maps of Benjamin Franklin (left) and Israel and the apartheid analogy (right). Diagrams in the upper
row show the map of all reverts, whereas only mutual reverts are depicted in the diagrams in the lower row. Nr and Nd are the numbers of edits
made by the reverting and reverted editors, respectively. The size of each dot is proportional to the number of reverts by the same pair of reverting
and reverted editors.
doi:10.1371/journal.pone.0038869.g001
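The data behind a revert map of the kind described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the paper: it assumes that reverts have already been detected and are available as (reverting editor, reverted editor) pairs, and that each editor's total edit count is known. The function name `revert_maps`, the input layout, and the sample editors are hypothetical, and the rule that a pair counts as mutual when each editor has reverted the other at least once is an assumption made here.

```python
from collections import Counter


def revert_maps(reverts, edit_counts):
    """Build the points of a revert map and a mutual revert map.

    reverts: iterable of (reverting_editor, reverted_editor) pairs
             (assumed to be detected beforehand).
    edit_counts: dict mapping editor -> total number of edits.
    Returns two lists of (Nr, Nd, weight) tuples, where weight is the
    number of reverts by the same ordered pair of editors.
    """
    pair_counts = Counter(reverts)
    all_points = []
    mutual_points = []
    for (reverter, reverted), n in sorted(pair_counts.items()):
        point = (edit_counts[reverter], edit_counts[reverted], n)
        all_points.append(point)
        # Assumption: a revert is "mutual" if the reverted editor has
        # also reverted the reverting editor; self-reverts are excluded.
        if reverter != reverted and (reverted, reverter) in pair_counts:
            mutual_points.append(point)
    return all_points, mutual_points


# Hypothetical toy data, not taken from any real article history.
sample_reverts = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "bob"),
    ("carol", "dave"),
]
sample_edit_counts = {"alice": 120, "bob": 45, "carol": 3, "dave": 10}

all_points, mutual_points = revert_maps(sample_reverts, sample_edit_counts)
print(all_points)
print(mutual_points)
```

Each returned tuple corresponds to one dot: its coordinates are Nr and Nd, and its weight would set the dot size. In this toy input, only the alice/bob pair survives the mutual filter.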
Conflicts in WP were studied previously both on the article and
on the user level. Kittur et al. [37,38] and Vuong et al. [39]
measured controversiality by counting the "controversial" tags in
the history of an article, and compared other possible metrics to
that. It should be noted, however, that this is at best a one-sided
measure as highly disputed pages such as Gdansk or Euthanasia in
the English WP lack such tags, and the situation is even worse in
other WPs. In [38], different page metrics, such as the number of
reverts and the number of revisions, were compared to the tag
counts, and in [39] the number of deleted words between users
was counted and a Mutual Reinforcement Principle [40] was
used to measure how controversial a given article is. Clearly, there
are several features of an article which correlate with its
controversiality, making it highly non-trivial to choose an
appropriate indicator. Some papers try to detect the negative
"conflict" links between WP editors in a given article and, based
on this, attempt to classify editors into groups. The main idea of
the method used by Kittur et al. [38] is to relate (...truncated)