Agreeable Data

How Wikipedian consensus is conceptualized by computer science researchers

Wikipedia has been a ready-made source of accessible, downloadable, and analyzable data for computer scientists.

Often, it is precisely because of Wikipedia's reliance on consensus as a practice/mechanism for producing knowledge that this data is perceived as reliable.

  • How do computer scientists approach Wikipedian consensus in their work?
  • Which phenomena do computer scientists assume consensus-made data can trace?
  • What are the political and epistemological consequences of these perceptions?

Jimmy Wales

the Wikipedia governance model, the governance of the community, is a very confusing, but workable mix of consensus -- meaning we try not to vote on the content of articles, because the majority view is not necessarily neutral -- some amount of democracy -- all of the administrators -- these are the people who have the ability to delete pages.
The Birth of Wikipedia, TED Talk, July 2005

  • 2004: The first version of the consensus policy states "Wikipedia is a consensus."
  • 2005: English Wikipedia policy template states that all changes to policies "reflect" consensus.
  • 2007: Consensus was enshrined in Wikipedia's Five Pillars.

English Wikipedians

Contradictory definitions

  • The result of the asynchronous aggregation of rounds of individual edits.
  • A deliberative process of talking rationally and with civility.
  • The active participation of all members to make a cohesive community.
  • The judgement of admins and third parties to determine which decision to make.
Jankowski, S. (2022). Making consensus sensible: The transition of a democratic ideal into Wikipedia's interface. Journal of Peer Production.

Jankowski, S. (2015). No consensus on consensus: A paradox within Wikipedian governance and collective action. In Multitude to Crowds in Social Movements: publics, gatherings, networks and media in the 21th century, Lisbon, Portugal, 2015.

Content Analysis

Computer science-related journal articles

Data collection

  1. Selective sampling: Queried Scopus for "Wikipedia" and "consensus"
    N=108
  2. Selection: Manually removed non-computer science-related journals
    Corpus = 24 articles from 2007–2020
  3. Primary unit of analysis: Sentences containing the word "consensus."
    Instances = 396
  4. Secondary unit of analysis: Statements about the purpose of the research and Wikipedia.
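
The primary unit of analysis in step 3 can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the helper name `consensus_sentences`, the naive regex sentence splitter, and the sample text are all invented for the example.

```python
import re

# Assumption: sentences end with ., ! or ? followed by whitespace.
# Real article text (abbreviations, citations) would need a proper tokenizer.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")
# Whole-word, case-insensitive match, so "consensual" is not counted.
_CONSENSUS = re.compile(r"\bconsensus\b", re.IGNORECASE)


def consensus_sentences(text):
    """Return the sentences in `text` that contain the word 'consensus'."""
    sentences = _SENTENCE_BOUNDARY.split(text.strip())
    return [s for s in sentences if _CONSENSUS.search(s)]


# Invented sample text, not taken from the corpus.
sample = (
    "Editors disagreed for months. Consensus emerged after discussion. "
    "The bot was then approved by consensus."
)
instances = consensus_sentences(sample)
print(len(instances))
```

Run over the full text of each article in the corpus, a filter of this kind would yield the sentence-level instances that are then coded by hand.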

Sub-questions

  • How does each article describe consensus?
  • What is the goal of the research?
  • What role does Wikipedia play in the study?
  • Which parts of Wikipedia were used for analysis?

Consensus is about shared agreement, community norms and negotiating power.

This research is interested in the specifics of how Wikipedians govern themselves.


    Example
  • Interpreting how Wikipedian consensus gradually converts social mechanisms into algorithmic mechanisms, using 100 policy, discussion, and bot-management pages in the German and English Wikipedias, along with bot data.
Policy and composite definition: Community consensus converts existing social norms into software features, such as bots. Consensus is not vote-counting or majority rule, yet surveys are used to determine it.

Consensus is the eventual agreement and desired result after periods of editorial conflicts or argumentative discussions.

This research is interested in generalizing and predicting how consensus emerges on articles and within talk pages.


    Example
  • Modeling Wikipedia's short-term and long-term editorial conflicts using the January 2010 dump of the English Wikipedia, filtered for controversial articles (223k articles).
Composite definition: Consensus is a permanent or temporary compromise that is reached at the end of a warring period within controversial articles.

Consensus is either self-evident or it is the combination of decision-making and discussion.

This research is interested in how to make Wikipedia-related knowledge-management and decision-making processes more efficient.


    Example
  • Automating Wikipedia’s Featured Article Candidate nomination process to predict outcomes using all FAC sessions from January 2004 to August 2008.
Composite and self definition: Consensus is both a decision-making principle and a practice within discussions to determine an outcome that "implies majority agreement" through voting.

Consensus produces facts or relationships between data.

This research is interested in exploiting Wikipedia's structured data as either a testing ground or training data for a consensus algorithm.


    Example
  • Harvesting Wikipedia's infoboxes as ground-truth data to validate quantity consensus queries, with infobox values serving as numeric attributes of Wikipedia entities.
Composite algorithmic definition: Consensus is a useful aggregation that can be collected and quantified for the purposes of providing correct and high-quality answers to queries.
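
To make the "aggregation" reading concrete, here is a hedged Python sketch of what treating consensus as a quantifiable aggregate can look like. The median is an assumption chosen for simplicity and is not the algorithm of the cited study; the function names, the tolerance, and the candidate values are hypothetical.

```python
from statistics import median


def quantity_consensus(candidates):
    """Aggregate candidate numeric answers into one value (assumed here: the median)."""
    if not candidates:
        raise ValueError("no candidate values to aggregate")
    return median(candidates)


def agrees_with_ground_truth(consensus_value, infobox_value, rel_tol=0.05):
    """Check the aggregate against an infobox value within a relative tolerance."""
    return abs(consensus_value - infobox_value) <= rel_tol * abs(infobox_value)


# Invented candidate values for a hypothetical query; not real data.
candidates = [98.0, 100.0, 101.0, 250.0]
value = quantity_consensus(candidates)
print(value, agrees_with_ground_truth(value, infobox_value=100.0))
```

In this framing the infobox value is simply taken as correct, which is the assumption the definition above rests on: the discussion, dissent, and uneven participation that produced the value do not appear anywhere in the computation.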

Alternative readings of consensus

Understanding

  • Exhaustive
  • Slow deliberations (talk)
  • Produces community
  • Consensus / dissensus

Decisions made after understanding are more democratically legitimate.

But understanding is not necessary for all democratic decisions, especially in a crisis.

Decision-making

  • Responsive
  • Quick operations (vote)
  • Produces actions
  • Majorities / minorities

Jezierska, K. (2019). With Habermas against Habermas: Deliberation without consensus. Journal of Public Deliberation, 15(1), 13.
Jankowski, S. (2024). Consensus techniques. Internet Policy Review, 13(2), 1-9.

Political

  • Digital utopianism: Decentralized and distributed technologies are presumed to be inherently "good" because they are designed to end with agreement.
  • Political inattention: It was rare to see an acknowledgment that dissensus is a necessary complement to consensus, especially for a self-governing community.

Epistemological

  • Knowledge extractivism: Facts are assumed to represent the result of consensus, but the variety of techniques, unevenness of participation, and power plays are cut during the "harvest."
  • A view from nowhere: Despite studying/using consensus, it is positioned as a self-evident concept and is rarely defined using secondary literature.

Thanks!