July 1, 2024

The Biggest Problem in Mathematics Is Finally a Step Closer to Being Solved

Number theorists have been trying to prove a conjecture about the distribution of prime numbers for more than 160 years

By Manon Bischoff

The Riemann hypothesis is the most important open question in number theory—if not all of mathematics. It has occupied experts for more than 160 years. And the problem appeared both in mathematician David Hilbert’s groundbreaking speech from 1900 and among the “Millennium Problems” formulated a century later. The person who solves it will win a million-dollar prize.

But the Riemann hypothesis is a tough nut to crack. Despite decades of effort, the interest of many experts and the cash reward, there has been little progress. Now mathematicians Larry Guth of the Massachusetts Institute of Technology and James Maynard of the University of Oxford have posted a sensational new finding on the preprint server arXiv.org. In the paper, “the authors improve a result that seemed insurmountable for more than 50 years,” says number theorist Valentin Blomer of the University of Bonn in Germany.

Other experts agree. The work is “a remarkable breakthrough,” mathematician and Fields Medalist Terence Tao wrote on Mastodon, “though still very far from fully resolving this conjecture.”

The Riemann hypothesis concerns the basic building blocks of natural numbers: prime numbers, values only divisible by 1 and themselves. Examples include 2, 3, 5, 7, 11, 13, and so on.

Every other number, such as 15, can be clearly broken down into a product of prime numbers: 15 = 3 x 5. The problem is that the prime numbers do not seem to follow a simple pattern and instead appear randomly among the natural numbers. Nineteenth-century German mathematician Bernhard Riemann proposed a way to deal with this peculiarity that explains how prime numbers are distributed on the number line—at least from a statistical point of view.

A Periodic Table for Numbers

Proving this conjecture would provide mathematicians with nothing less than a kind of “periodic table of numbers.” Just as the basic building blocks of matter (such as quarks, electrons and photons) help us to understand the universe and our world, prime numbers also play an important role, not just in number theory but in almost all areas of mathematics.

There are now numerous theorems based on the Riemann conjecture. Proof of this conjecture would prove many other theorems as well—yet another incentive to tackle this stubborn problem.

Interest in prime numbers goes back thousands of years. Euclid proved as early as 300 B.C.E. that there are an infinite number of prime numbers. And although interest in prime numbers persisted, it was not until the 18th century that any further significant findings were made about these basic building blocks.

As a 15-year-old, mathematician Carl Friedrich Gauss realized that prime numbers become sparser along the number line. His so-called prime number theorem (not proven until 100 years later) states that approximately n / ln(n) prime numbers appear in the interval from 0 to n. In other words, the prime number theorem offers mathematicians a way of estimating the typical distribution of primes along a chunk of the number line.

The exact number of prime numbers may differ from the estimate given by the theorem, however. For example: According to the prime number theorem, there are approximately 100 / ln(100) ≈ 22 prime numbers in the interval between 1 and 100. But in reality there are 25. There is therefore a deviation of 3. This is where the Riemann hypothesis comes in. This hypothesis gives mathematicians a way to estimate the deviation. More specifically, it states that this deviation cannot become arbitrarily large but instead must scale at most with the square root of n, the length of the interval under consideration.
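The arithmetic in this example can be checked with a few lines of Python. The following is only a minimal sketch: the helper names `primes_up_to` and `compare` are made up for illustration, and the sieve is a standard textbook method rather than anything taken from the work discussed here.

```python
import math


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes p with 2 <= p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i*i.
            for multiple in range(i * i, n + 1, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def compare(n: int) -> tuple[int, float, float]:
    """Return (actual prime count, n/ln(n) estimate, deviation)."""
    actual = len(primes_up_to(n))
    estimate = n / math.log(n)
    return actual, estimate, actual - estimate


for n in (100, 1_000, 10_000, 100_000):
    actual, estimate, deviation = compare(n)
    print(f"n={n:>7,}  primes={actual:>5,}  "
          f"n/ln(n)={estimate:>8.1f}  deviation={deviation:>6.1f}")
```

For n = 100 the sieve finds 25 primes against an estimate of about 21.7, which rounds to the 22 quoted above. Changing the tuple of test values shows how the gap behaves for larger n.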

The Riemann hypothesis therefore does not predict exactly where prime numbers are located but posits that their appearance on the number line follows certain rules. According to the Riemann hypothesis, the density of primes decreases according to the prime number theorem, and the primes are evenly distributed according to this density. This means that there are no large areas in which there are no prime numbers at all, while others are full of them.

You can also imagine this idea by thinking about the distribution of molecules in the air of a room: the overall density on the floor is somewhat higher than on the ceiling, but the particles—following this density distribution—are nonetheless evenly scattered, and there is no vacuum anywhere.

A Strange Connection

Riemann formulated the conjecture named after him in 1859, in a slim, six-page publication (his only contribution to the field of number theory). At first glance, however, his work has little to do with prime numbers.

He dealt with a specific function, the so-called zeta function ζ(s), an infinitely long sum that adds the reciprocal values of natural numbers that are raised to the power of s:

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \cdots$$

Even before Riemann’s work, experts knew that such zeta functions are related to prime numbers. Specifically, the zeta function can also be expressed as a product over all prime numbers p as follows:

$$\zeta(s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}} = \frac{1}{1 - 2^{-s}} \cdot \frac{1}{1 - 3^{-s}} \cdot \frac{1}{1 - 5^{-s}} \cdots$$
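The link between the two ways of writing the zeta function can be tested numerically. The sketch below assumes the standard definitions: the sum of 1/n^s over all natural numbers n, and the product of 1/(1 − p^(−s)) over all primes p. It truncates both at a finite cutoff for s = 2, where the exact value is known to be π²/6. The function names and the cutoff of 100,000 are arbitrary choices made for illustration.

```python
import math


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes p with 2 <= p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for multiple in range(i * i, n + 1, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def zeta_partial_sum(s: float, terms: int) -> float:
    """Truncated sum: 1/1**s + 1/2**s + ... + 1/terms**s."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


def zeta_partial_product(s: float, prime_limit: int) -> float:
    """Truncated product of 1 / (1 - p**(-s)) over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


s = 2.0
cutoff = 100_000  # arbitrary truncation point, chosen for illustration
print("sum over natural numbers:", zeta_partial_sum(s, cutoff))
print("product over primes:     ", zeta_partial_product(s, cutoff))
print("exact value pi^2 / 6:    ", math.pi**2 / 6)
```

Both truncations approach the same number as the cutoff grows, which is the content of the identity. The check only works for real s greater than 1, where the sum and the product both converge.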

Riemann recognized the full significance of this connection with prime numbers when he used not only real values for s but also complex numbers. These numbers consist of a real part and a so-called imaginary part, which is a multiple of the square root of –1.

You can imagine complex numbers as a two-dimensional construct. Rather than mark a point on the number line, they instead lie on the plane. The x coordinate corresponds to the real part and the y coordinate to the imaginary part:

The coordinates of z = x + iy illustrate a complex number

Никита Воробьев/Wikimedia

The complex zeta function that Riemann investigated can be visualized as a landscape above the plane. As it turns out, there are certain points amid the mountains and valleys that play an important role in relation to prime numbers. These are the points at which the zeta function becomes zero (so-called zeros), where the landscape sinks to sea level, so to speak.

A visual mapping of the zeta function looks like a mountainscape with peaks and troughs

The colors represent the values of the complex zeta function, with the white dots indicating its zeros.

Jan Homann/Wikimedia

Riemann quickly found that the zeta function has no zeros if the real part is greater than 1. This means that the area of the landscape to the right of the straight line x = 1 never sinks to sea level. The zeros of the zeta function are also known for negative values of the real part. They lie on the real axis at x = –2, –4, –6, and so on. But what really interested Riemann, and all mathematicians since, were the zeros of the zeta function in the “critical strip” 0 ≤ x ≤ 1.

The dark blue area demarcates a stretch along the x axis where the Riemann zeta function contains nontrivial zeros

In the critical strip (dark blue), the Riemann zeta function can have “nontrivial” zeros. The Riemann conjecture states that these are located exclusively on the line x = 1/2 (dashed line).

LoStrangolatore/Wikimedia ( CC BY-SA 3.0 )

Riemann knew that the zeta function has an infinite number of zeros within the critical strip. But interestingly, all appear to lie on the straight line x = 1/2. Thus Riemann hypothesized that all zeros of the zeta function within the critical strip have a real part of x = 1/2. That statement is actually at the crux of understanding the distribution of prime numbers. If correct, then the placement of prime numbers along the number line never deviates too much from the estimate given by the prime number theorem.

On the Hunt for Zeros

To date, billions and billions of zeta function zeros (more than 10^13 of them) have been examined, and all lie on the straight line x = 1/2.

But that alone is not a valid proof. You would only have to find a single zero that deviates from this scheme to disprove the Riemann hypothesis. Therefore we are looking for a proof that clearly demonstrates that there are no zeros outside x = 1 / 2 in the critical strip.

Thus far, such a proof has been out of reach, so researchers took a different approach. They tried to show that there is, at most, a certain number N of zeros outside this straight line x = 1/2. The hope is to reduce N until N = 0 at some point, thereby proving the Riemann conjecture. Unfortunately, this path also turns out to be extremely difficult. In 1940 mathematician Albert Ingham was able to show that in the range 0.75 ≤ x ≤ 1 there are at most y^(3/5 + c) zeros with an imaginary part of at most y, where c is a constant between 0 and 9.

In the following 80 years, this estimate barely improved. The last notable progress came from mathematician Martin Huxley in 1972. “This has limited us from doing many things in analytic number theory,” Tao wrote in his social media post. For example, if you wanted to apply the prime number theorem to short intervals of the type [x, x + x^θ], you were limited by Ingham’s estimate to θ > 1/6.

Yet if Riemann’s conjecture is true, then the prime number theorem applies to any interval (or θ = 0), no matter how small (because [x, x + x^θ] = [x, x + 1] when θ = 0).

Now Maynard, who was awarded the prestigious Fields Medal in 2022, and Guth have succeeded in significantly improving Ingham’s estimate for the first time. According to their work, the zeta function in the range 0.75 ≤ x ≤ 1 has at most y^(13/25 + c) zeros with an imaginary part of at most y. What does that mean exactly? Blomer explains: “The authors show in a quantitative sense that zeros of the Riemann zeta function become rarer the further away they are from the critical straight line. In other words, the worse the possible violations of the Riemann conjecture are, the more rarely they would occur.”

“This propagates to many corresponding improvements in analytic number theory,” Tao wrote. It makes it possible to reduce the size of the intervals for which the prime number theorem applies. The theorem is valid for [x, x + x^(2/15)], so θ > 1/6 = 0.166... becomes θ > 2/15 = 0.133...
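To get a feel for what the change from 1/6 to 2/15 means, it helps to plug in a concrete x. The sketch below does this with exact fractions for x written as a power of 10. The function name `length_exponent` is a hypothetical helper, and because both thresholds are strict inequalities, the values shown are limiting cases rather than intervals the theorem is guaranteed to cover.

```python
from fractions import Fraction

OLD_THETA = Fraction(1, 6)   # threshold tied to Ingham's estimate
NEW_THETA = Fraction(2, 15)  # threshold after the new result


def length_exponent(x_exponent: int, theta: Fraction) -> Fraction:
    """For x = 10**x_exponent, return e such that x**theta = 10**e."""
    return x_exponent * theta


for k in (30, 60, 90):
    old_e = length_exponent(k, OLD_THETA)
    new_e = length_exponent(k, NEW_THETA)
    print(f"x = 10^{k}: x^(1/6) = 10^{old_e}, x^(2/15) = 10^{new_e}")

# The gap between the two thresholds, as an exact fraction.
print("improvement in theta:", OLD_THETA - NEW_THETA)
```

For x = 10^30, the interval length near the old threshold is about 10^5, while near the new one it is about 10^4, so the intervals that can be handled are roughly ten times shorter at that scale.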

For this advance, Maynard and Guth initially used well-known methods from Fourier analysis. These techniques are similar to those used to break down a sound into its overtones. “The first few steps are standard, and many analytic number theorists, including myself, who have attempted to break the Ingham bound, will recognize them,” Tao explained. From there, however, Maynard and Guth “do a number of clever and unexpected maneuvers,” Tao wrote.

Blomer agrees. “The work provides a whole new set of ideas that—as the authors rightly say—can probably be applied to other problems. From a research point of view, that’s the most decisive contribution of the work,” he says.

So even if Maynard and Guth have not solved Riemann’s conjecture, they have at least provided new food for thought to tackle the 160-year-old puzzle. And who knows—perhaps their efforts hold the key to finally cracking the conjecture.

This article originally appeared in Spektrum der Wissenschaft and was reproduced with permission.

Digital journalism and epistemologies of news production.

  • Rodrigo Zamith, Journalism Department, University of Massachusetts Amherst
  • Oscar Westlund, Department of Journalism and Media Studies, Oslo Metropolitan University
  • https://doi.org/10.1093/acrefore/9780190228613.013.84
  • Published online: 18 July 2022

News is the result of news production, a set of epistemic processes for developing knowledge about current events or issues that draw upon a range of newsgathering techniques and formatting choices with the objective of yielding a publishable and distributable product designed to inform others. That process, however, has changed considerably over time and in parallel to broader economic, political, professional, social, and technological changes. For example, during the past two decades alone, there has been greater audience fragmentation and an emphasis on audience measurement, new forms of strategic exploitation of information channels and digital surveillance of journalists, greater aggregation of news and more avenues for professional convergence, a media environment awash in user-generated content and challenges to traditional outlets’ epistemic authority, and more opportunities for interactivity and miniaturized mobilities. In concert, these and other forces have transformed news production processes that have become increasingly digital—from who the actors are to the actants that are available to them, the activities they may engage in, and the audiences they can interact with.

Such impacts have required scholars to revisit different theories that help explain how news is produced and with what consequences. Whereas the field of journalism studies draws on a rich history of multidisciplinary theorizing, epistemologies of journalism have received increased attention in recent years. There is a close link between news production and epistemology because the production of news inherently involves developing news information into one form of knowledge. As such, an epistemological lens allows scholars to examine the production, articulation, justification, and use of knowledge within the social context of digital journalism. An analytic matrix of 10 dimensions—the epistemologies of journalism matrix—helps scholars examine different forms of journalism through an epistemological lens. The matrix focuses on identifying the key (a) social actors, (b) technological actants, and (c) audiences within a space of journalism; examining their articulation or justification of (d) knowledge claims and their distinct (e) practices, norms, routines, and roles; differentiating between the (f) forms of knowledge they typically convey; and evaluating the similarities and dissimilarities in their typical (g) narrative structure, (h) temporality, (i) authorial stance, and (j) status of text.

By applying that matrix to four emerging forms of journalism (participatory journalism, live blogging, data journalism, and automated journalism), it can be seen that digital journalism and news production are becoming even more heterogeneous in terms of their implicated entities, cultures and methods, and positionality in relation to matters of knowledge and authority. First, contemporary news production is deeply influenced by myriad technological actants, which are reshaping how knowledge about current events is being created, evaluated, and disseminated. Second, professional journalists are losing epistemic authority over the news as key activities are delegated to algorithms created by non-journalists and to citizens who have become more present in news production. Third, the outputs of news production are becoming more diverse both in form and in content, further challenging long-standing norms about what is and is not “journalism.” In short, history has shown that news production will continue to evolve, and an epistemological lens affords scholars a useful and adaptable approach for understanding the implications of those changes to the production of knowledge about news.

  • digital journalism
  • news production
  • epistemology
  • data journalism
  • participatory journalism
  • live blogging
  • automated journalism
  • social media

Introduction

News , or “public knowledge claiming to report on current events in the world” ( Westlund & Ekström, 2018 , p. 3), is more pervasive in citizens’ lives today than ever before. It may be accessed around the clock and in a multitude of ways, including through typical reading, watching, viewing, and listening activities as well as newer “snacking” and monitorial activities such as scrolling through headlines while waiting in the elevator ( Costera Meijer & Groot Kormelink, 2015 ). Those activities may be performed actively via acts such as searching or passively via exposure coordinated by algorithmic recommendation systems. News itself may be accessed through a wider range of media and digital platforms and from a larger multitude of sources. These include legacy news media and digital news start-ups working in and for local, regional, national, or international contexts ( Ali et al., 2019 ; Chua & Duffy, 2019 ; Heft & Dogruel, 2019 ), as well as citizen journalists ( Kim & Lowrey, 2015 ) and alternative news media ( Figenschou & Ihlebæk, 2019 ; Holt et al., 2019 ).

News is neither a “given” nor a necessarily stable object, however. It is the result of news production , defined here as the epistemic processes for developing knowledge about current events or issues that draw upon a range of newsgathering techniques and formatting choices with the objective of yielding a publishable and distributable product designed to inform others. This definition highlights the intrinsic link between news and epistemology, as news can be distilled into different forms of knowledge about the world ( Ekström & Westlund, 2019a ; Ekström et al., 2021 ; Nielsen, 2017 ; Zelizer, 1993 ). It also underscores that news is necessarily shaped by activities such as sourcing and filtering information ( Domingo et al., 2008 ), which may be produced by human actors or technological actants ( Lewis & Westlund, 2015a ) and further formatted with particular platforms ( Hågvar, 2019 ; Westlund, 2013 ) and audiences in mind ( Weischenberg & Matuschek, 2008 ). Finally, it recognizes the close link between news production and distribution―which, indeed, may sometimes occur simultaneously as in the case of broadcasting, live tweeting, and producing newsletters―while acknowledging that the latter is most often examined as a separate, subsequent step (for a detailed examination of news distribution, see Braun, 2019 ; see also Hermida, 2020 ; Wallace, 2018 ).

Adopting an epistemological lens allows scholars to recognize that news is often contested and that much of the contestation occurs implicitly―and sometimes explicitly―along epistemic lines, as with critiques about the veracity of a given news account and allegations of bias ( Carlson, 2017 ; Compton & Benedetti, 2010 ). This lens also allows scholars to be mindful of the fact that news varies in substance and form between genres, across platforms, and depending on epistemic processes and formats for publishing ( Ekström & Westlund, 2019b ). In short, news is the result of a dynamic and heterogeneous process.

This article aims to capture that dynamism in order to illustrate the evolution of digital news production, particularly since the turn of the 21st century and mostly in relation to Western journalistic practices. The article therefore does not review the emergence and growth of some important research into news production from the mid-20th century, including the influential work produced by the likes of Herbert Gans, Gaye Tuchman, and Pamela Shoemaker and Stephen Reese. Such work is aptly reviewed by Hanusch and Maares (2021), who describe it as part of a wave of scholarship that illustrates the importance of news routines; the role of intra-, inter-, and extra-organizational relationships; and strategic rituals in shaping news production processes and, consequently, news products (see also Westlund & Ekström, 2019). However, for expediency, this article instead focuses on epistemologies of digital news production, recognizing that present ideas about journalistic knowledge production are shaped by past work.

The article begins with a synthesis of the significant economic, political, professional, social, and technological developments that have played a structuring role in the developments of news production in the field, such as the proliferation of mobile devices and organized disinformation campaigns. Then, it describes some of the key theories that have been used to study news production, centering on an epistemological lens that emphasizes its rhetorical, practical, and evaluative elements. Next, the article systematically examines four emerging forms of journalistic news production―which is characterized by “ambitions toward the publishing of truthful accounts of current events in the world” ( Westlund & Ekström, 2018 , p. 3)―through an epistemological lens. That examination focuses on participatory journalism, live blogging, data journalism, and automated journalism because they are not only marked by some novelty and represent rapidly evolving forms of journalism but also associated with a significant amount of recent scholarship that merits synthesis. The article concludes with a discussion in which it is argued that scholars can only go so far in understanding news and journalism by focusing on who does journalism or what the news materials produced are, and that the examination of epistemic practices proves a worthwhile addition to that endeavor.

Key Shifts in the 21st Century

The news production process has changed considerably over time and in parallel to broader economic, political, professional, social, and technological changes ( Barnhurst, 2011 ; Braun, 2015 ; Bruns, 2008 ; Fenton, 2011 ; Hanusch & Maares, 2021 ; Napoli, 2011 ; Westlund & Quinn, 2018 ; Zelizer, 2019 ). Indeed, as scholars have observed, entities typically regarded as being outside the space of journalism can play a major role at particular points of its development. For example, the U.S. Postal Service played a crucial role in creating the distribution infrastructure for newspapers during the early U.S. republic and, in turn, not only helped shape U.S. news processes but also created a sense of national identity and belonging among the citizens of the emerging nation ( John, 1995 ). Although chronicling all the changes that have impacted news production is not possible within a single article, 20 particularly consequential shifts since the turn of the century are highlighted here to illustrate how news production has been transformed alongside changing forces. These forces are grouped for illustrative purposes, recognizing that some transcend simple categories—that is, they may be simultaneously economic and technological, and so on.

Economically, today’s news environment is characterized by greater audience fragmentation , which refers to the process through which (or the phenomenon in which) mass audiences are split into more diffuse and specialized groups in their media consumption ( Neuman, 1991 ). That has promoted specialization in the production of news and the creation of niche outlets to satisfy new and narrower segments ( Napoli, 2011 ). Similarly, there is now greater emphasis on audience measurement , or the process of quantifying, analyzing, and synthesizing information about individuals’ content preferences and how they interact with that content ( Napoli, 2011 ; Tandoc, 2019 ). The current emphasis on measurement is enabled largely by audience analytics, which provides more (and more detailed) data about audience behaviors and facilitates keying journalistic products to audience demands ( Zamith, 2018 ). The nature of commodification , or the transformation of a good or service into a product that can be sold for profit within a market ( Hamilton, 2004 ), has also changed since the turn of the 21st century due to the unbundling of news products, rise of non-journalistic platforms, and increase in competition from platform companies ( Steensen & Westlund, 2021 ) as well as alternative news media ( Holt et al., 2019 ). This has resulted in pressures for journalists to do more with less and a renewed emphasis on subscription-based and nonprofit economic models ( Pickard, 2020 ). Economic conditions have also resulted in greater occupational precarity , or deteriorating professional conditions that lead to insecure labor conditions ( Örnebring, 2018 ). This situation is characterized by a growing dependence on unpaid labor and outsourced workers, less full-time work, and a more general fear of indiscriminate layoffs, constraining journalists’ ability to adhere to journalistic ideals and remain autonomous ( Örnebring, 2018 ).

Politically, journalists must contend with greater amounts of disinformation, or information that is deliberately false or misleading (Jack, 2017). A range of actors, including state-sponsored groups, have sought to sow disinformation by strategically exploiting trustworthy information channels and outright impersonating trusted news brands, forcing journalists to rethink how they verify information within a speed-oriented craft while also compounding the erosion of trust in journalism (Marwick & Lewis, 2017). News media have been subjected to changes in regulations, or rules, laws, and codes prescribed by some authority, typically a government (Flew & Swift, 2013). Scholars have observed that Western countries have generally moved toward overall deregulation, resulting in greater corporate ownership and emphasis on consumer-oriented journalism (Fenton, 2011). However, they have also observed substantially different approaches taken within European countries, where journalists and news organizations sometimes have access to direct, government-sponsored grants and indirect subsidies, and where news audiences often have access to robust public service broadcasters (Murschetz, 2020). Such approaches are markedly different from those in the United States, where the primary government support mechanism is frequently just a general tax break for nonprofit entities and donors (Pickard, 2020). In addition, many countries still operate under strict information control regimes that limit what journalists can publish (Xu, 2015). There is also now greater digital surveillance of journalists, in which an actor, such as a government, uses digital tools to continuously monitor the activities of another actor (Ataman & Çoban, 2018). Indeed, journalists, and investigative journalists in particular, increasingly report serious concerns about being tracked, leading to some self-censorship and increased difficulty getting confidential sources to share information (Lashmar, 2017). There have been changes in the amounts and types of state subsidies for news media, or the direct aid provided by governments to support the activities of independent organizations (Kreiss & Ananny, 2013). Although public support for such media remains high, their subsidies have been repeatedly reduced or threatened in recent years (Fenton, 2011), and the lack of subsidies in some countries has resulted in the development of “news deserts” as commercial models have faltered (Pickard, 2020).

Professionally, there is now greater aggregation of news, or the practice of manually or algorithmically bringing together information from different products into a single one, typically based on some curation criteria ( Bakker, 2012 ). This has resulted in the proliferation of news aggregator sites such as Google News and apps such as Apple News that do not originate news but serve as competitors and key audience brokers by virtue of their strategic position within the contemporary news landscape ( Coddington, 2019 ). Such aggregators tend to promote freely accessible content, making shifts away from ad-supported news more challenging. Similarly, there are new forms of convergence , or the integration of previously distinct media components and technologies to create new organizational forms and processes ( Pavlik, 2004 ). This has promoted internal collaboration across a news organization’s departments ( Nielsen, 2012 ) as well as external collaboration with non-media partners (e.g., Hacks/Hackers; see Usher, 2014 ). It has also promoted a digital-first ethos, where newsworkers are expected to quickly produce content for online environments and engage through social media platforms in ways that challenge traditional journalistic ethics ( Singer, 2012 ). This shift exacerbated the continuous deadline pressures introduced by live broadcasts, accelerated by 24-hour news, and taken to a new level with the “death of the deadline” in online news, resulting in time-obsessed and stepped-up news cycles that emphasize temporal competition and improvisatory practice ( Barnhurst, 2011 ). Recent research on breaking news has also shown how journalists take timing into consideration, and are mindful of when to release their stream of online news ( Ekström et al., 2021 ).

Socially, journalists now operate in a media environment filled with user-generated content, or non-journalistic content created by active audience members that is typically published online and accessible at negligible cost (Jönsson & Örnebring, 2010). This has enabled outsiders to enter journalism, provided new content subsidies for news organizations, created new competitors within a competitive attention economy, and challenged news professionals’ gatekeeping powers (Bruns, 2008). Similarly, there are now more opportunities for dark participation, or antisocial forms of online participation that include harassment, trolling, and “doxxing,” which refers to the practice of publicly revealing private, and often sensitive, information about an individual or organization (Quandt, 2018). Such participation induces some journalists to self-censor, withdraw from public spaces, or quit the profession altogether, and it disproportionately affects journalists from historically marginalized communities (Lewis et al., 2020; Stahel & Schoen, 2020). These developments have paralleled (and driven) challenges to traditional epistemic authority, or an entity’s socially accepted “power to define, describe, and explain bounded domains of reality” (Gieryn, 1999, p. 1). Journalists in many areas of the world must now cope with low and/or declining levels of trust in media, as well as eroding control over information (Fletcher & Park, 2017). News production is also conditioned by placeification, or the shaping of an artifact by the places in which it is produced, practiced, and consumed (Gutsche & Hess, 2020). In many countries, news production now occurs primarily in large, urban centers as a result of broader societal place-based realignments, with consequences for trust in non-urban centers and for journalists’ ability to witness certain events firsthand (Radcliffe & Ali, 2017). Indeed, as Schmitz Weiss (2015, p. 127) contends, “location plays a significant role in how communities function and how they see themselves,” and scholars have argued that structural inequalities and political polarization in places such as the United States have taken on a place-based dimension as a result of broader social, economic, technological, and professional shifts (e.g., Usher, 2019).

Technologically, journalists work within an environment characterized by greater interactivity, which refers to the technological attributes of mediated environments that allow users to connect with and through technology (Bucy, 2004). News consumers now expect to be able to interact with news content, whether through responsive websites or dynamic products such as interactive data visualizations (Zamith, 2019b). In addition, news organizations now routinely use external hyperlinks as reference tools, which in turn can promote transparency (Sjøvaag et al., 2019) and contribute to heterogeneous news flows and inter-media connectivity (Steensen & Eide, 2019). The current environment is also marked by miniaturized mobilities, or information and communication technologies designed to fit a mobile lifestyle, such as smartphones and smartwatches (Elliott & Urry, 2010). These mobilities have enabled journalists to work outside the newsroom in more diverse and effective ways (see also Duffy et al., 2020; Westlund & Quinn, 2018). Social media, or platforms that allow users to traverse a network of contacts via contributions such as posts and tweets (boyd & Ellison, 2007), have enabled journalists to adopt new practices such as ambient journalism to find novel stories and potentially draw upon a larger range of sources (Hermida et al., 2014). They have also substantially altered how information spreads (Swart et al., 2019). More broadly, however, the space of journalism is now characterized by an immense number of transparent intermediaries, or actors and actants that exert a structuring role in media production and distribution yet are unseen by most media consumers (Braun, 2015). These include algorithmic recommendation tools that shape individuals’ exposure to content, both in terms of what journalists see and which of their work is seen by news consumers, and promote practices such as search engine optimization of headlines (Gillespie, 2014).
Notably, throughout the 2010s, many publishers aimed to build a presence on social media platforms ( Steensen & Westlund, 2021 ). However, amid growing concerns about their loss of power and revenue in the long term ( Nielsen & Ganter, 2018 ), some publishers have shifted toward platform counterbalancing ( Chua & Westlund, 2019 ).

In concert, these economic, political, professional, social, and technological forces have transformed multiple aspects of journalism and in particular have had material impact on news production—from who the actors are to the actants that are available to them, the activities they may engage in, and the audiences they can interact with (see Lewis & Westlund, 2015a ). Such impacts have required scholars to revisit different theories that help explain how news is produced and with what consequences.

Theorizing News Production

There is a long tradition of theorizing news production, much of which draws heavily on psychology, sociology, political economy, and cultural studies (see Ahva & Steensen, 2019 ; Hanusch & Maares, 2021 ; Steensen & Westlund, 2021 ). An early and enduring example is gatekeeping theory , which refers to the process through which actors or actants (gatekeepers) can include or exclude information before it reaches an audience—as with a newspaper editor who chooses which newswire stories to include and exclude ( White, 1950 ). Theorizing from this stream has since argued that such decisions are the product of professional socialization and structural constraints, including the inculcation of news values, practices, and norms ( Vos & Heinderyckx, 2015 ).

Similarly, scholars of journalism have drawn on institutional theory throughout the years to contend that institutions—typically defined as meso-level variables such as beliefs, norms, and formal rules—mediate the relationship between macrostructures such as journalism and the micro-actions of individuals or organizations ( Cook, 1998 ). This line of thinking has proved fruitful in explaining the uniformity in certain aspects of news production and the often cautious responses to disruption and uncertainty ( Lowrey, 2011 ). This theoretical perspective broadly shares key tenets with field theory , as proposed by Bourdieu (1993) , which has proven particularly influential in recent scholarship (e.g., Wu et al., 2019 ). That perspective imagines society as being composed of multiple “fields” (with journalism being one of them) that have field-specific norms, traditions, and practices that shape behavior but are themselves shaped through their intersection with other fields as well as broader cultural, economic, and political forces ( Bourdieu, 1993 ). Such theorizing has opened avenues for examining cultural resources that, for example, lend greater social legitimacy to certain news production actors and activities over others ( Benson, 2006 ).

These examples illustrate but one, primarily sociological, stream of theories that have been applied to the study of news production (for a broader collection, see Ahva & Steensen, 2019 ; Hanusch & Maares, 2021 ; Steensen & Westlund, 2021 ). However, they are also illustrative in that they have all been developed and occasionally recast in some manner in response to the aforementioned economic, political, professional, social, and technological developments. Indeed, as Wallace (2018) wrote while aiming to remodel gatekeeping theory, sociotechnical developments have “changed gatekeeping selection processes and news flow patterns. Accordingly, gatekeeping theory must also change” (p. 275).

This article is centered on a lens that has garnered increased attention in recent years: epistemologies of journalism ( Ekström & Westlund, 2019a ). There is a close link between news production and epistemology because the production of news inherently involves developing news information into one form of knowledge. Indeed, the very existence of journalistic authority is largely dependent on a public’s perception that journalism―or some entities within it―offers valuable and unique public knowledge ( Carlson, 2017 ). Moreover, scholars have long contended that journalists are members of interpretive communities that are united by shared meanings about news production and the practice of collectively interpreting key events ( Zelizer, 1993 ).

An epistemological lens focuses on understanding the production, articulation, justification, and use of knowledge within the social context of journalism ( Ekström, 2002 ). In other words, it helps scholars examine what newsworkers know, how they know it, and how they justify their accounts―“the news”―as a form of knowledge ( Ekström & Westlund, 2019a ). This has required scholars to revisit who produces journalism, what epistemic values and activities are accepted as being journalistic, and how those constellations produce distinct forms of journalism―each with sufficiently different epistemological processes and claims.

As Lewis and Westlund (2015a) argue, digital journalism involves a larger and more heterogeneous set of social actors, technological actants, and audiences than ever before. The boundaries that help establish who is a journalist have blurred considerably, with individuals previously at journalism’s periphery now considered central to its enactment (Belair-Gagnon & Holton, 2018). Some news production is already automated, raising questions about the nonhuman and nonjournalistic epistemic logics and processes imbued in the associated algorithms (Diakopoulos, 2019). Audiences have also changed in terms of how they are imagined, constituted, distributed, and measured, complicating how journalists come to understand (and aim to service) what they perceive to be the needs of diverse audiences (Napoli, 2011; Tandoc, 2019; Zamith, 2018).

Ekström and Westlund (2019a) observe that research on epistemic values and activities within journalism has centered on three interrelated aspects. The first focuses on how journalistic knowledge claims and epistemic authority are articulated in discourse and through texts. The second draws on journalists’ narrated reflections of their practices, norms, and routines to examine how they think about and enact different epistemic notions. The third evaluates how journalistic knowledge claims are justified in news products and the extent to which they are accepted, rejected, or remixed by those who consume them.

Although news is sometimes treated as homogeneous―especially in statistical modeling that reduces it to a single endogenous or exogenous variable―the scholarship clearly observes that it is instead quite heterogeneous as a result of distinct news production practices, objectives, and constellations. Nielsen (2017) helps illustrate one degree of epistemological divergence in outlining three different forms of knowledge that can be conveyed through digital journalism: news-as-impression, or decontextualized snippets of information as with brief news alerts; news-as-item, or typical-length news articles and video news reports about news episodes; and news-about-relations, or in-depth, explanatory, and durable news products that aim to show the bigger picture. Matheson and Wahl-Jorgensen (2020) also point to five key aspects for distinguishing between types of journalism: narrative structure, or the way in which information is organized; temporality, or how time is accounted for; journalistic role, or the responsibilities, values, and objectives of the news product; authorial stance, or the journalist’s perspective on conventions such as objectivity and balance; and status of text, or whether the product is treated as a finished or evolving product.

Drawing on this literature, this article attempts to explicate the epistemologies of news production through a matrix of 10 dimensions referred to as the epistemologies of journalism matrix. This matrix focuses on identifying the key (a) social actors, (b) technological actants, and (c) audiences within a space of journalism; examining their articulation or justification of (d) knowledge claims and their distinct (e) practices, norms, routines, and roles; differentiating between the (f) forms of knowledge they typically convey; and evaluating the similarities and dissimilarities in their typical (g) narrative structure, (h) temporality, (i) authorial stance, and (j) status of text.

Epistemologies of News Production

To illustrate the heterogeneity of news production and the value of evaluating its epistemologies through the 10 aforementioned dimensions, the dimensions are applied to four emerging forms of journalism: participatory journalism, live blogging, data journalism, and automated journalism (Table 1 ; see also Ekström & Westlund, 2019a ). Scholars are encouraged to build upon the epistemologies of journalism matrix by incorporating additional forms of journalism.

Table 1. Epistemologies and Different Forms of Journalism

| Dimension | Traditional Journalism | Participatory Journalism | Live Blogging | Data Journalism | Automated Journalism |
| --- | --- | --- | --- | --- | --- |
| Social actors | Journalists | Journalists, social media editors, citizens | Journalists, citizens | Journalists with cross-field backgrounds | Highly technical journalists and technologists |
| Technological actants | Customized content management systems | Social media platforms, commenting affordances | Blogging and microblogging platforms, smartphones | Open-source statistical analysis and data visualization software | Proprietary algorithms for natural language processing and generation |
| Audience approach | Passive audiences | Active participants | Mostly passive audiences | Mostly passive audiences but with interactive affordances | Passive audiences that may receive personalized content |
| Practices, norms, roles, and routines | Journalists in control and strive to adhere to values embedded in occupational ideology | Journalists in control but motivated to curate and invite collaboration at multiple stages of news production | Journalists in control and motivated by immediacy, but also engage in curation and invite some co-presence | Journalists in control but emphasis is on central tendencies, and the ideals of transparency and sharing | Humans delegate control to actants, with emphasis on increased production that appears human-made |
| Knowledge claims | Claims based on established authority as arbiters of truth in news | Claims reinforced by references to collaborative knowledge production | Claims diminished due to immediacy and challenges of real-time verification | Claims reinforced by references to authority of science and quantification | Claims reinforced by references to mechanical objectivity and impartiality |
| Forms of knowledge | News-as-items and news-about-relations | News-as-items with contributions from active participants | News-as-impressions that may eventually become news-as-items | News-as-items and news-about-relations | News-as-items and news-as-impressions |
| Narrative structure | Coherent and traditional structures, such as the inverted pyramid | Coherent and traditional structures, such as the inverted pyramid | Fragmented and usually following a reverse chronological order as its main organizational structure | Coherent and traditional structures, but with more interactive and modular elements | Coherent but highly structured and usually based on limited range of templates |
| Temporality | Ordered, interpretive framework shaped by eventization and elite voices | Ordered, interpretive framework featuring more diverse set of sources | Overlapping moments in time with an interpretive framework interspersing multiple voices | Ordered, interpretive framework relying on structured data sources | Ordered, systematically interpreted framework relying on semistructured documents and structured data sets |
| Authorial stance | Objective as a result of following a journalistic process | More subjective and informed by networked balance and co-presence | More subjective and informed by networked balance and co-presence | Objective, but implicitly conveyed as incomplete by virtue of exploratory visualizations | Objective as a result of its mechanical production |
| Status of text | Finished product | Finished product | Incomplete, temporary product that is being frequently updated | Finished product, or semi-finished as a result of automated updates | Finished product that may be dynamic as a result of personalization |

Note: The epistemologies of journalism matrix outlines the most dominant news production patterns for each of 10 dimensions. In this table, it is applied to distinct forms of predominantly digital journalism, as per the authors’ knowledge of the sectors and existing scholarly work. Exceptions to the dominant patterns can exist in different geographical contexts and among different sorts of news publishers.

Participatory Journalism

Participatory journalism is known as a form of digital journalism that promotes active and intentional engagement between newsworkers and individuals previously thought of as mostly passive audiences ( Singer et al., 2011 ). Although journalism has long offered audiences an opportunity to have a voice, whether through purposive sourcing or dedicated sections for letters to the editor, this more recent form aims to center citizens’ contributions in multiple stages of the news production process ( Lawrence et al., 2018 ). It may manifest itself both in perception (beliefs about the role of audiences) and in practice (affordances and efforts to involve audiences) and entail direct, indirect, and sustained exchanges designed to empower audiences ( Coddington et al., 2018 ). As Westlund and Ekström (2018) argue, scholarship on participatory journalism must now consider both proprietary and nonproprietary platforms. Importantly, proprietary platforms are those that belong to and are controlled by a specific company (with the inner workings often black-boxed) and which may be used by others through the purchase of a license or their participation in a monetization scheme. Some news organizations are proprietors of platforms and algorithms of their own. However, news companies also rely on platforms (e.g., Facebook and Chartbeat) that are not proprietary to them. Such third-party platforms, which include the likes of Twitter ( Hermida et al., 2014 ) and WhatsApp ( Kligler-Vilenchik & Tenenboim, 2020 ), are now deeply embedded in journalistic practice. That, in turn, has structured the affordances, possibilities, and expectations for acts of participatory journalism. Publishers are increasingly focusing on reducing their dependency on third parties and developing their own proprietary solutions, both for economic purposes and to introduce new affordances for participation. 
Moreover, as scholars have observed, not all participation is prosocial; a considerable amount involves harassment, bullying, and hate speech ( Lewis et al., 2020 ; Quandt, 2018 ).

News can be produced via participatory journalism by an extensive range of individual social actors that is typically led by journalists, social media editors, and audience engagement editors but may involve a range of previously passive actors such as citizen journalists ( Wall, 2017 ). Its production processes are still human-centric, although they draw upon proprietary and third-party technological actants such as social media platforms to facilitate participation at different stages of news production ( Westlund & Ekström, 2018 ). The audiences are not only diverse but also active, as nearly any member is theoretically able to engage in participatory journalism due to the low barrier to entry ( Coddington et al., 2018 ).

Participatory journalism involves practices, norms, routines, and roles oriented toward curation and requiring an openness to collaboration that has historically been a source of professional tension ( Lewis, 2012 ). It is driven by a logic that may be normatively characterized as democratically oriented and critically characterized as communicative capitalism ( Vujnovic et al., 2010 ; see also Zamith, 2018 ). The knowledge claims made within participatory journalism differ from traditional claims in that they assert themselves to be enhanced by public engagement―they are presumed to be actively vetted and informed by others’ observed and lived experiences―and thus purport to represent a collaborative form of knowledge production ( Anderson & Revers, 2018 ). As a form of knowledge, participatory journalism may take different shapes but is most commonly seen as typical news-as-items, wherein participants inform but do not revolutionize traditional journalistic products ( Borger et al., 2019 ).

The narrative structure of the products of participatory journalism is typically coherent and adheres to traditional structures, like the inverted pyramid for texts (Engelke, 2020). Regarding temporality, products tend to adhere to an interpretive framework that draws upon more diverse sets of sources in an ordered manner (Borger et al., 2019). The authorial stance differs in that it is more subjective and involves weaker professional control resulting from efforts to promote networked balance and co-presence (Lawrence et al., 2018). The status of the text is typically implicitly conveyed as static and presumed to be finished, unless it involves a live or rapidly evolving news event (Ekström & Westlund, 2019b).

More generally, the tension between professional control and open participation ( Lewis, 2012 ) is associated with the epistemological authority of journalists in producing and defining news. Studies find that journalists remain in control of those processes or cede only a portion of their control (e.g., Engelke, 2019 ). Although scholars continue to see potential for greater participation in the news, the degree of “dark participation” has proven to be a significant barrier ( Quandt, 2018 )―evidenced, for instance, by the removal of user-commenting affordances on many leading news websites. Nevertheless, participatory journalism has in some cases substantively reshaped sourcing practices, yielding less elite and more diverse source networks ( Hermida et al., 2014 ) and ultimately producing more cautious knowledge claims. Moreover, whereas citizens’ direct participation in news production may be more limited than some scholars envisioned at the turn of the century, their indirect participation―by privately sharing news materials on tracked social media platforms, posting firsthand videos through semi-public accounts, and publicly discussing news and news coverage―has further reshaped journalism beyond this specific form ( Engelke, 2019 ).

Live Blogging

Live blogging is a form of digital journalism that focuses on ongoing, near real-time reporting of both planned and unexpected news events through brief and sequential posts on digital websites and platforms ( Matheson & Wahl-Jorgensen, 2020 ). This approach to journalism, which major news organizations have employed as far back as the early 2000s ( Thurman & Walters, 2013 ), is routinely used to cover sporting events, political speeches, and breaking news such as terror attacks ( Thorsen & Jackson, 2018 ). Although it is sometimes considered to be a text-based parallel to live broadcast news, it differs in the extent to which it typically engages with audiences and how it conveys its narrative. It is closely associated, and thus frequently interchanged, with the notion of live tweeting.

News can be produced through live blogging by an extensive range of individual social actors that include both staff journalists and citizens acting as journalists ( Thurman & Rodgers, 2014 ). This is made possible through the use of technological actants that are often not proprietary to news organizations, such as content management systems and blogging platforms, as well as through social media platforms ( Thorsen, 2013 ). Live blogging may be performed as one-way communication with general (and specialized) audiences, but it sometimes includes affordances for audience engagement―as in directly soliciting and answering questions during unfolding events or incorporating contextual information provided by members of the audience ( Bennett, 2016 ).

Its practices, norms, routines, and roles are characterized primarily by speed and curation, as actors must not only observe real-time events and break them as news but also quickly make sense of those events in order to distinguish their products from competitors’ while incorporating content created by other members of a social network ( Thurman & Walters, 2013 ). As such, its practitioners seemingly are more cautious with their knowledge claims because they explicitly recognize that emerging events can be confusing, even when observed firsthand, and the immediacy of their posts makes fact-checking difficult, if not unfeasible. They may, however, draw on the public to verify information for knowledge claims, such as by asking other users to confirm the physical address where a news event is taking place. As a form of knowledge, it is typically composed of a series of individual products (i.e., bullets or tweets) best characterized as news-as-impression but that add up to (and can be consumed as) news-as-item once the event is over.

The narrative structure of live blogging is fragmented and not organized by textual coherence but, rather, by reverse chronology, with the latest observation usually on top ( Matheson & Wahl-Jorgensen, 2020 ). Its temporality is characterized by overlapping moments in time, as previously reported developments are contextualized while new developments are reported, and new voices are occasionally interspersed via affordances such as retweets ( Matheson & Wahl-Jorgensen, 2020 ). The authorial stance is marked by networked balance and co-presence, rather than objectivity, because authors typically adopt a mix of their observations and opinions while inviting and including discrete moments shared by fellow journalists, sources, and audience members ( Matheson & Wahl-Jorgensen, 2020 ). The status of the text is explicitly conveyed as dynamic and temporary, with an understanding that updates are often open, incomplete, and unfinished ( Matheson & Wahl-Jorgensen, 2020 ).

The consequence of these attributes is that live blogging is more willing to cede some of its epistemological authority in defining news in large part because of how it produces news. Although journalism is often described as “a first draft of history,” live blogging is more accommodating of partial accounts, forgiving of corrections, and willing to include unverified claims. In other words, it recognizes itself as being particularly temporary within the ecosystem of journalism—a moment in time that will be replaced by fuller accounts. Moreover, live blogging is more distanced from objectivity norms and open to audiences, making knowledge production about “news” a more distributed endeavor. As Matheson and Wahl-Jorgensen (2020 , p. 313) state, “the live blog can be understood as a journalistic response to the logics of social media”—although, it is contended in this article, to a lesser degree than participatory journalism.

Data Journalism

Data journalism may be conceptualized as a hybrid form that is grounded in “data analysis and the presentation of such analysis” (Coddington, 2015, p. 334). It may also be delineated by its content, which “has a central thesis (or purpose) that is primarily attributed to (or fleshed out by) quantified information (e.g., statistics or raw sensor data); involves at least some original data analysis by the item’s author(s); and includes a visual representation of data” (Zamith, 2019b, p. 478).

The form is not itself new—it is an outgrowth of a longer tradition of precision journalism and computer-assisted reporting ( Houston, 1996 ; Splendore, 2016 )—but data journalism distinguishes itself by decoupling from investigative journalism and calling greater attention to best practices in data sharing and visualization ( Cairo, 2019 ; Coddington, 2015 ). It has also developed during a period when journalists have greater access to digital data and accessible tools, is now produced by major news organizations, and now has its own award bodies ( Zamith, 2019b ). However, journalists do struggle to get access to worthwhile and reliable data in many geographical contexts ( Lewis & Nashmi, 2019 ; Porlezza & Splendore, 2019 ).

The social actors involved in the production of data journalism typically have backgrounds in statistics, computer science, design, and/or journalism, and a new professional class has emerged that reflects a “cross-field hybridity” (Coddington, 2015, p. 337) by combining several of those backgrounds (Hermida & Young, 2017). Data journalism is defined in part by the technological actants that enable it, including statistical analysis software and data visualization tools, as well as the premium that is placed on open-source solutions (Splendore, 2016). Its audiences are typically passive and are usually given opportunities to interact with content. They may also take an active role in shaping news production: high-profile data journalism projects have involved audience participation, although participatory affordances are typically limited (Zamith, 2019b).

The practices, norms, routines, and roles of this form focus on central tendencies rather than outliers ( Young & Hermida, 2015 ) and emphasize the ideals of transparency and sharing ( Coddington, 2015 ), yet they still legitimate themselves through the lens of some key traditional journalism principles ( Borges-Rey, 2020 ). Its knowledge claims are rooted in science and quantification, and further benefit from a mythology around the objectivity of quantified claims ( Lewis & Westlund, 2015b ). Its forms of knowledge involve news-as-item for many of its “everyday” variants ( Zamith, 2019b ) as well as the deeper analyses better characterized as news-about-relations ( Young & Hermida, 2015 ).

The narrative structure of data journalism is ordered and, in many ways, adheres to traditional structures ( Borges-Rey, 2020 ), but it is more interactive and modular to accommodate visualizations, which are inherent to the storytelling ( Cairo, 2019 ; Young & Hermida, 2015 ). Regarding temporality, it follows an ordered, interpretive framework that incorporates human sources but is most dependent on data sources ( Porlezza & Splendore, 2019 ; Zamith, 2019b ). Its authorial stance is objective, with some recognition that the author’s account is incomplete and thus open to further interpretation via the interactive features of data visualizations. The status of text is typically presumed to be finished or semi-finished, although texts may include visuals, models, and modules that automatically update as new data are entered.

Data journalism ultimately redefines epistemological authority as the result of data and scientific analyses that are further illustrated through anecdotal lived experience ( Young & Hermida, 2015 ). Indeed, as Cairo (2019) contends, “Numbers and charts look and feel objective, precise, and, as a consequence, seductive and convincing” (p. xi). Data journalism has mainstreamed hypothesis-testing and data-driven logics within journalism, although epistemological tensions still emerge when traditional journalists work alongside their more data-oriented counterparts ( Borges-Rey, 2020 ). However, although the production of data journalism marks an epistemological shift from traditional journalism, it is not a break. As Borges-Rey (2020) notes, data journalists routinely oscillate between “newshound” and “techie” approaches to news production. Furthermore, data journalists often legitimize their work as news production by referencing journalistic ideals and adopting its language ( Coddington, 2015 ).

Automated Journalism

Automated journalism refers to “algorithmic processes that convert data into narrative news texts with limited to no human intervention beyond the initial programming” ( Carlson, 2015 , p. 417). It is a more advanced form of computational journalism ( Coddington, 2015 ) that uses algorithms to largely automate the collection, writing, publication, and/or the distribution of news ( Diakopoulos, 2019 ). Although machine-driven forms of journalism also trace many of their roots to precision journalism and computer-assisted reporting—and similarly require some form of data to be executed—they rapidly gained social capital within journalistic spaces starting in the mid-2000s ( Zamith, 2019a ). There are now companies such as Automated Insights that are advancing the technical capabilities and professional use of algorithms for automating news production, and they count major news organizations such as the Associated Press as their clients ( Carlson, 2018 ). Perhaps most important, automated journalism has changed the scale at which journalism can be produced ( Diakopoulos, 2019 ). It has also introduced new ways of communicating journalism, as with chatbots ( Jones & Jones, 2019 ).

The social actors involved in automated journalism are mostly highly technical and include technologically oriented journalists, computational linguists, and vendors of proprietary algorithms ( Carlson, 2018 ; Diakopoulos, 2019 ). Its technological actants include mostly proprietary algorithms for natural language processing and natural language generation that give humans some degree of structured control (e.g., creating templates) but aim to require minimal human involvement ( Dörr, 2016 ). The audiences in automated journalism generally remain passive, although content may be personalized based on predictions from historical data and their active choices ( Zamith, 2019a ). Those recommendation systems can be designed to fit commercial purposes as well as distinct democratic models ( Helberger, 2019 ).

The practices, norms, routines, and roles of automated journalism are oriented toward abstraction, structuration, quantification, and personalization, with the objective of simultaneously breaking news down to granular, discrete elements while using those elements to create news products that are indistinguishable from their human-generated counterparts ( Coddington, 2015 ; Graefe et al., 2018 ). Its knowledge claims are derived from mechanical analyses of data that give them “algorithmic authority” by virtue of their presumed impartiality―even as those algorithms are themselves biased by the humans who create them ( Carlson, 2015 ). Its forms of knowledge include both news-as-item (e.g., automated news stories) and bite-size “structured information” ( Splendore, 2016 , p. 349) that can be used to power news-as-impression (e.g., chatbots and automated notifications).

The narrative structure of automated journalism is highly structured, with its most common products based on templates, and may be coherent (as in the case of news stories; see Diakopoulos, 2019) or fragmented (as in the case of chatbots; see Jones & Jones, 2019). Regarding temporality, it typically follows an ordered, systematically interpreted framework that draws chiefly upon semistructured documents and structured data sets (Dörr, 2016). Its authorial stance is objective, again drawing upon the purported impartiality of the algorithms that produced the news (Broussard, 2018; Carlson, 2018). The status of text is typically presumed to be finished, although there is greater presumption of dynamism in response to the automated personalization of those texts (Zamith, 2019a).

As Carlson (2018) argues, automated journalism “represents a core departure from how journalism has been understood and cannot be contained as an extension of journalism’s professional logic” (p. 1765). Under this form, human judgment should play a limited (or unchanging) role in the production of knowledge about the news; instead, production should be guided by abstracted principles and enacted by algorithms (Coddington, 2015). Furthermore, it shifts the idea of news as public, shared knowledge toward individual, personalized knowledge (Splendore, 2016). It thus challenges traditional notions of journalistic epistemology even as it arguably serves as the apotheosis of one of its key production values: objectivity (Carlson, 2018). However, although this stream of journalism emphasizes the technical by its very nature, scholars have argued that the technological actants and activities involved in this space remain deeply influenced by human actors (Broussard, 2018; Diakopoulos, 2019). Consequently, and in large part due to the current state of technology, the epistemological break in contemporary practice is more limited than theory would suggest, and this is unlikely to change in the near future.

Discussion and Research Directions

Until recently, scholars studied and described news production as a set of human-oriented activities that largely share a universal set of characteristics (see review in Westlund & Ekström, 2019). The authors of this article have deliberately sought to do otherwise, instead calling attention to recent arguments underscoring the growing role of technological actants in journalism and the heterogeneous nature of news production, which, in turn, have implications for how people come to understand “news.” This position primarily draws on three streams of research: the epistemologies of journalism (e.g., Ekström & Westlund, 2019a, 2019b), sociotechnical approaches to understanding news work (e.g., Lewis & Westlund, 2015a; Zamith, 2019a), and systematic comparisons of diverging news production processes (e.g., Matheson & Wahl-Jorgensen, 2020). The authors contribute to those streams by proposing the epistemologies of journalism matrix, which provides scholars with an analytic framework for examining the heterogeneity of news production in terms of its implicated entities, its cultures and methods, and its positionality in relation to matters of knowledge and authority.

The utility of the matrix is illustrated through an examination of four forms of journalism: participatory journalism, live blogging, data journalism, and automated journalism. The analysis highlights three points. First, contemporary news production is deeply influenced by myriad technological actants, which are reshaping how knowledge about current events is being created, evaluated, and disseminated. Second, professional journalists are losing epistemic authority over the news as key activities are delegated to algorithms created by non-journalists and to citizens who have become more present in news production. Third, the outputs of news production are becoming more diverse both in form and in content, further challenging long-standing norms about what is and is not “journalism.” However, those are but four forms and hardly capture all of what journalism encompasses. The authors thus invite scholars to expand on the matrix by applying it to other forms of journalism—and, in the process, refine the matrix itself and advance its theoretical implications.

In addition, the authors believe it is important for any scholar studying news production to be mindful of three key developments in their future work. First, it is apparent that what counts as “news” to different people is quite different today from times past. The history of journalism has been marked by many significant changes as to what is considered news, how it is shaped, and who distributes it. However, digital devices and platforms have made news available 24/7, and the ease of producing and disseminating content these days has contributed to an explosion of news produced by a large and diverse array of actors. Moreover, that news is increasingly sought on just a few platforms (e.g., Google and Facebook) that often flatten traditional media hierarchies by placing news produced by professional journalistic outlets alongside content created by nonprofessionals. The consequence is that there are now more interlopers seeking to pass their content off as “news,” from individual trolls seeking to get a rise out of people (Quandt, 2018) to actors hoping to monetize their content (Braun & Eklund, 2019) and states seeking to gain political advantage (Marwick & Lewis, 2017), which has further complicated a historically contested term. Furthermore, the past decade has been marked by low or declining levels of trust in news media in many areas of the world (Fletcher & Park, 2017), as well as sustained attacks on news media (Carlson et al., 2021; Waisbord, 2020).

Second, the heterogeneity of “news” and “news production” requires scholars to think carefully about how they operationalize those variables in their work (Mast et al., 2017; Waisbord, 2018). For example, there is a substantive and growing body of literature on news consumption and news avoidance that builds on quantitative data and analyses of media effects (Skovsgaard & Andersen, 2020). Such studies often conceptualize and operationalize news and news production processes in ways that make them appear more homogeneous than they are in practice (Mast et al., 2017). As such, differences in research findings may be due, in part, to distinct understandings of those concepts, in light of their heterogeneity. It is imperative, therefore, for scholars to both examine the evolution of these understandings and account for them in research by either offering more granular options or detailing their operationalizations.

Third, the power dependencies in news production have changed markedly in recent years (Ekström & Westlund, 2019b). It is now much more difficult for practitioners to adhere to the values typically associated with their occupational ideology or to resist changes instituted by superiors and consolidating ownership (Coddington, 2019; Vos & Heinderyckx, 2015). News producers, once seen as gatekeepers, are now themselves gatekept by algorithms employed by platform companies (Gillespie, 2014; Wallace, 2018), algorithms that producers often believe they must adjust to even as they recognize such actions only make them more dependent (Nielsen & Ganter, 2018; Pickard, 2020). Their future is sometimes tied to technologies developed far from newsrooms (Braun & Eklund, 2019; Diakopoulos, 2019; Tandoc, 2019). Thus, contemporary analyses of news production should account for power differences among institutional actors, recognizing that journalistic actors are now less likely to exert dominance.

At the same time, although this article has focused on change and on digital journalism, it is important to recognize that a non-negligible amount of what is commonly referred to as “journalism” has remained reasonably stable, and that much of the change is rooted in pre-digital expectations, practices, and capabilities (Zelizer, 2019). Moreover, this article has focused on the mainstream applications of journalism in Western contexts, and it is important to recognize that the histories and legacies of other places impact the developmental trajectories (and epistemological notions) of digital journalism differently in those contexts (Mellado, 2021).

Nevertheless, history has shown that news production will continue to evolve alongside broader economic, political, professional, social, and technological shifts, and in doing so give rise to new forms and assemblages. An epistemological lens affords scholars a useful and adaptable approach for understanding the implications of those changes for the production of knowledge about news. Even so, it is apparent that future scholarship will demand further theoretical and methodological development in order to keep up with a rapidly changing ecosystem and information regime.

Acknowledgments

The work of Oscar Westlund was supported by Riksbankens Jubileumsfond [grant number RJ P16-0715].

Further Reading

  • Ekström, M. , & Westlund, O. (2019). Epistemology and journalism . In Oxford research encyclopedia of communication . Oxford University Press.
  • Pressman, M. (2018). On press: The liberal values that shaped the news . Harvard University Press.
  • Ryfe, D. M. (2019). Journalism and the public . Polity.
  • Steensen, S. , & Westlund, O. (2021). What is digital journalism studies? Routledge.
  • Tandoc, E. C. (2019). Analyzing analytics: Disrupting journalism one click at a time . Routledge.
  • Usher, N. (2021). News for the rich, white, and blue: How place and power distort American journalism . Columbia University Press.

References

  • Ahva, L. , & Steensen, S. (2019). Journalism theory. In K. Wahl-Jorgensen & T. Hanitzsch (Eds.), The handbook of journalism studies (pp. 38–54). Routledge.
  • Ali, C. , Schmidt, T. R. , Radcliffe, D. , & Donald, R. (2019). The digital life of small market newspapers . Digital Journalism , 7 (7), 886–909.
  • Anderson, C. W. , & Revers, M. (2018). From counter-power to counter-pepe: The vagaries of participatory epistemology in a digital age . Media and Communication , 6 (4), 24.
  • Ataman, B. , & Çoban, B. (2018). Counter-surveillance and alternative new media in Turkey . Information, Communication and Society , 21 (7), 1014–1029.
  • Bakker, P. (2012). Aggregation, content farms, and Huffinization . Journalism Practice , 6 (5–6), 627–637.
  • Barnhurst, K. G. (2011). The problem of modern time in American journalism . KronoScope , 11 (1/2), 98–123.
  • Belair-Gagnon, V. , & Holton, A. E. (2018). Boundary work, interloper media, and analytics in newsrooms . Digital Journalism , 6 (4), 492–508.
  • Bennett, D. (2016). Sourcing the BBC’s live online coverage of terror attacks . Digital Journalism , 4 (7), 861–874.
  • Benson, R. (2006). News media as a “journalistic field”: What Bourdieu adds to new institutionalism, and vice versa . Political Communication , 23 (2), 187–202.
  • Borger, M. , van Hoof, A. , & Sanders, J. (2019). Exploring participatory journalistic content: Objectivity and diversity in five examples of participatory journalism . Journalism , 20 (3), 444–466.
  • Borges-Rey, E. (2020). Towards an epistemology of data journalism in the devolved nations of the United Kingdom: Changes and continuities in materiality, performativity and reflexivity . Journalism , 21 (7), 915–932.
  • Bourdieu, P. (1993). The field of cultural production: Essays on art and literature . Columbia University Press.
  • boyd, d. , & Ellison, N. B. (2007). Social network sites: Definition, history, and scholarship . Journal of Computer-Mediated Communication , 13 (1), 210–230.
  • Braun, J. A. (2015). This program is brought to you by . . . . Yale University Press.
  • Braun, J. A. (2019). News distribution . In Oxford research encyclopedia of communication . Oxford University Press.
  • Braun, J. A. , & Eklund, J. L. (2019). Fake news, real money: Ad tech platforms, profit-driven hoaxes, and the business of journalism . Digital Journalism , 7 , 1–21.
  • Broussard, M. (2018). Artificial unintelligence: How computers misunderstand the world . MIT Press.
  • Bruns, A. (2008). The active audience: Transforming journalism from gatekeeping to gatewatching. In C. A. Paterson & D. Domingo (Eds.), Making online news (pp. 171–184). Lang.
  • Bucy, E. P. (2004). Interactivity in society: Locating an elusive concept . The Information Society , 20 (5), 373–383.
  • Cairo, A. (2019). How charts lie: Getting smarter about visual information . Norton.
  • Carlson, M. (2015). The robotic reporter . Digital Journalism , 3 (3), 416–431.
  • Carlson, M. (2017). Journalistic authority: Legitimating news in the digital era . Columbia University Press.
  • Carlson, M. (2018). Automating judgment? Algorithmic judgment, news knowledge, and journalistic professionalism . New Media & Society , 20 (5), 1755–1772. https://doi.org/10.1177/1461444817706684
  • Carlson, M. , Robinson, S. , & Lewis, S. C. (2021). Digital press criticism: The symbolic dimensions of Donald Trump’s assault on U.S. journalists as the “enemy of the people” . Digital Journalism , 9 (6), 737–754. https://doi.org/10.1080/21670811.2020.1836981
  • Chua, S. , & Duffy, A. (2019). Friend, foe or frenemy? Traditional journalism actors’ changing attitudes towards peripheral players and their innovations . Media and Communication , 7 (4), 112–122.
  • Chua, S. , & Westlund, O. (2019). Audience-centric engagement, collaboration culture and platform counterbalancing: A longitudinal study of ongoing sensemaking of emerging technologies . Media and Communication , 7 (1), 153–165.
  • Coddington, M. (2015). Clarifying journalism’s quantitative turn . Digital Journalism , 3 (3), 331–348.
  • Coddington, M. (2019). Aggregating the news: Secondhand knowledge and the erosion of journalistic authority . Columbia University Press.
  • Coddington, M. , Lewis, S. C. , & Holton, A. E. (2018). Measuring and evaluating reciprocal journalism as a concept . Journalism Practice , 12 (8), 1039–1050.
  • Compton, J. R. , & Benedetti, P. (2010). Labour, new media and the institutional restructuring of journalism . Journalism Studies , 11 (4), 487–499.
  • Cook, T. E. (1998). Governing with the news: The news media as a political institution . University of Chicago Press.
  • Costera Meijer, I. , & Groot Kormelink, T. (2015). Checking, sharing, clicking and linking . Digital Journalism , 3 (5), 664–679.
  • Diakopoulos, N. (2019). Automating the news: How algorithms are rewriting the media . Harvard University Press.
  • Domingo, D. , Quandt, T. , Heinonen, A. , Paulussen, S. , Singer, J. B. , & Vujnovic, M. (2008). Participatory journalism practices in the media and beyond . Journalism Practice , 2 (3), 326–342.
  • Dörr, K. N. (2016). Mapping the field of algorithmic journalism . Digital Journalism , 4 (6), 700–722.
  • Duffy, A. , Ling, R. , Kim, N. , Tandoc, E. , & Westlund, O. (2020). News: Mobiles, mobilities and their meeting points . Digital Journalism , 8 (1), 1–14.
  • Ekström, M. (2002). Epistemologies of TV journalism: A theoretical framework . Journalism , 3 (3), 259–282.
  • Ekström, M. , Ramsälv, A. , & Westlund, O. (2021). The epistemologies of breaking news . Journalism Studies , 22 (2), 174–192.
  • Ekström, M. , & Westlund, O. (2019a). Epistemology and journalism . In Oxford research encyclopedia of communication . Oxford University Press.
  • Ekström, M. , & Westlund, O. (2019b). The dislocation of news journalism: A conceptual framework for the study of epistemologies of digital journalism . Media and Communication , 7 (1), 259–270.
  • Elliott, A. , & Urry, J. (2010). Mobile lives . Routledge.
  • Engelke, K. M. (2019). Online participatory journalism: A systematic literature review . Media and Communication , 7 (4), 31–44.
  • Engelke, K. M. (2020). Enriching the conversation: Audience perspectives on the deliberative nature and potential of user comments for news media . Digital Journalism , 8 (4), 447–466.
  • Fenton, N. (2011). Deregulation or democracy? New media, news, neoliberalism and the public interest . Continuum , 25 (1), 63–72.
  • Figenschou, T. U. , & Ihlebæk, K. A. (2019). Challenging journalistic authority . Journalism Studies , 20 (9), 1221–1237.
  • Fletcher, R. , & Park, S. (2017). The impact of trust in the news media on online news consumption and participation . Digital Journalism , 5 (10), 1281–1299.
  • Flew, T. , & Swift, A. (2013). Regulating journalists? The Finkelstein review, the convergence review and news media regulation in Australia . Journal of Applied Journalism & Media Studies , 2 (1), 181–199.
  • Gieryn, T. F. (1999). Cultural boundaries of science: Credibility on the line . University of Chicago Press.
  • Gillespie, T. (2014). The relevance of algorithms. In T. Gillespie , P. Boczkowski , & K. Foot (Eds.), Media technologies: Essays on communication, materiality, and society (pp. 167–194). MIT Press.
  • Graefe, A. , Haim, M. , Haarmann, B. , & Brosius, H.‑B. (2018). Readers’ perception of computer-generated news: Credibility, expertise, and readability . Journalism , 19 (5), 595–610.
  • Gutsche, R. E., Jr. , & Hess, K. (2020). Placeification: The transformation of digital news spaces into “places” of meaning . Digital Journalism , 8 (5), 586–595.
  • Hågvar, Y. B. (2019). News media’s rhetoric on Facebook . Journalism Practice , 13 (7), 853–872.
  • Hamilton, J. (2004). All the news that’s fit to sell: How the market transforms information into news . Princeton University Press.
  • Hanusch, F. , & Maares, P. (2021). News production. In K. B. Jensen (Ed.), A handbook of media and communication research (3rd ed., pp. 93–111). Routledge.
  • Heft, A. , & Dogruel, L. (2019). Searching for autonomy in digital news entrepreneurism projects . Digital Journalism , 7 (5), 678–697.
  • Helberger, N. (2019). On the democratic role of news recommenders . Digital Journalism , 7 (8), 993–1012.
  • Hermida, A. (2020). Post-publication gatekeeping: The interplay of publics, platforms, paraphernalia, and practices in the circulation of news . Journalism & Mass Communication Quarterly , 97 (2), 469–491.
  • Hermida, A. , Lewis, S. C. , & Zamith, R. (2014). Sourcing the Arab spring: A case study of Andy Carvin’s sources on Twitter during the Tunisian and Egyptian revolutions . Journal of Computer-Mediated Communication , 19 (3), 479–499.
  • Hermida, A. , & Young, M. L. (2017). Finding the data unicorn . Digital Journalism , 5 (2), 159–176.
  • Holt, K. , Ustad Figenschou, T. , & Frischlich, L. (2019). Key dimensions of alternative news media . Digital Journalism , 7 (7), 860–869. https://doi.org/10.1080/21670811.2019.1625715
  • Houston, B. (1996). Computer-assisted reporting: A practical guide . St. Martin’s.
  • Jack, C. (2017). Lexicon of lies . Data & Society.
  • John, R. R. (1995). Spreading the news: The American postal system from Franklin to Morse . Harvard University Press.
  • Jones, B. , & Jones, R. (2019). Public service chatbots: Automating conversation with BBC news . Digital Journalism , 7 (8), 1032–1053.
  • Jönsson, A. M. , & Örnebring, H. (2010). User-generated content and the news: Empowerment of citizens or interactive illusion? Journalism Practice , 5 (2), 127–144.
  • Kim, Y. , & Lowrey, W. (2015). Who are citizen journalists in the social media environment? Digital Journalism , 3 (2), 298–314.
  • Kligler-Vilenchik, N. , & Tenenboim, O. (2020). Sustained journalist–audience reciprocity in a meso news-space: The case of a journalistic WhatsApp group . New Media & Society , 22 (2), 264–282.
  • Kreiss, D. , & Ananny, M. (2013). Responsibilities of the state: Rethinking the case and possibilities for public support of journalism . First Monday , 18 (4).
  • Lashmar, P. (2017). No more sources? Journalism Practice , 11 (6), 665–688.
  • Lawrence, R. G. , Radcliffe, D. , & Schmidt, T. R. (2018). Practicing engagement . Journalism Practice , 12 (10), 1220–1240.
  • Lewis, N. P. , & Nashmi, E. A. (2019). Data journalism in the Arab region: Role conflict exposed . Digital Journalism , 7 (9), 1200–1214.
  • Lewis, S. C. (2012). The tension between professional control and open participation . Information, Communication and Society , 15 (6), 836–866.
  • Lewis, S. C. , & Westlund, O. (2015a). Actors, actants, audiences, and activities in cross-media news work . Digital Journalism , 3 (1), 19–37.
  • Lewis, S. C. , & Westlund, O. (2015b). Big data and journalism . Digital Journalism , 3 (3), 447–466.
  • Lewis, S. C. , Zamith, R. , & Coddington, M. (2020). Online harassment and its implications for the journalist–audience relationship . Digital Journalism , 8 (8), 1047–1067.
  • Lowrey, W. (2011). Institutionalism, news organizations and innovation . Journalism Studies , 12 (1), 64–79.
  • Marwick, A. , & Lewis, R. (2017). Manipulation and disinformation online . Data & Society.
  • Mast, J. , Coesemans, R. , & Temmerman, M. (2017). Hybridity and the news: Blending genres and interaction patterns in new forms of journalism . Journalism , 18 (1), 3–10.
  • Matheson, D. , & Wahl-Jorgensen, K. (2020). The epistemology of live blogging . New Media & Society , 22 (2), 300–316.
  • Mellado, C. (2021). Beyond journalistic norms: Role performance and news in comparative perspective . Routledge.
  • Murschetz, P. C. (2020). State aid for independent news journalism in the public interest? A critical debate of government funding models and principles, the market failure paradigm, and policy efficacy . Digital Journalism , 8 (6), 720–739.
  • Napoli, P. M. (2011). Audience evolution: New technologies and the transformation of media audiences . Columbia University Press.
  • Neuman, W. R. (1991). The future of the mass audience . Cambridge University Press.
  • Nielsen, R. K. (2012). How newspapers began to blog . Information, Communication and Society , 15 (6), 959–978.
  • Nielsen, R. K. (2017). Digital news as forms of knowledge: A new chapter in the sociology of knowledge. In P. Boczkowski & C. W. Anderson (Eds.), Remaking the news: Essays on the future of journalism scholarship in the digital age (pp. 1–27). MIT Press.
  • Nielsen, R. K. , & Ganter, S. A. (2018). Dealing with digital intermediaries: A case study of the relations between publishers and platforms . New Media & Society , 20 (4), 1600–1617.
  • Örnebring, H. (2018). Journalists thinking about precarity: Making sense of the “new normal.” International Symposium on Online Journalism , 8 (1), 109–126.
  • Pavlik, J. V. (2004). A sea-change in journalism: Convergence, journalists, their audiences and sources . Convergence , 10 (4), 21–29.
  • Pickard, V. (2020). Restructuring democratic infrastructures: A policy approach to the journalism crisis . Digital Journalism , 8 (6), 704–719. https://doi.org/10.1080/21670811.2020.1733433
  • Porlezza, C. , & Splendore, S. (2019). From open journalism to closed data: Data journalism in Italy . Digital Journalism , 7 (9), 1230–1252.
  • Quandt, T. (2018). Dark participation . Media and Communication , 6 (4), 36.
  • Radcliffe, D. , & Ali, C. (2017). Small-market newspapers in the digital age . Tow Center for Digital Journalism.
  • Schmitz Weiss, A. (2015). Place-based knowledge in the twenty-first century . Digital Journalism , 3 (1), 116–131.
  • Singer, J. B. (2012). The ethics of social journalism. Australian Journalism Review , 34 (1), 3–16.
  • Singer, J. B. , Domingo, D. , Heinonen, A. , Hermida, A. , Paulussen, S. , Quandt, T. , Reich, Z. , & Vujnovic, M. (2011). Participatory journalism: Guarding open gates at online newspapers . Wiley-Blackwell.
  • Sjøvaag, H. , Stavelin, E. , Karlsson, M. , & Kammer, A. (2019). The hyperlinked Scandinavian news ecology . Digital Journalism , 7 (4), 507–531.
  • Skovsgaard, M. , & Andersen, K. (2020). Conceptualizing news avoidance: Towards a shared understanding of different causes and potential solutions . Journalism Studies , 21 (4), 459–476.
  • Splendore, S. (2016). Quantitatively oriented forms of journalism and their epistemology . Sociology Compass , 10 (5), 343–352.
  • Stahel, L. , & Schoen, C. (2020). Female journalists under attack? Explaining gender differences in reactions to audiences’ attacks . New Media & Society , 22 (10), 1849–1867.
  • Steensen, S. , & Eide, T. (2019). News flows, inter-media connectivity and societal resilience in times of crisis . Digital Journalism , 7 (7), 932–951.
  • Swart, J. , Peters, C. , & Broersma, M. (2019). Sharing and discussing news in private social media groups . Digital Journalism , 7 (2), 187–205.
  • Thorsen, E. (2013). Live blogging and social media curation: Challenges and opportunities for journalism. In K. Fowler-Watt & S. Allan (Eds.), Journalism: New challenges (pp. 123–145). Centre for Journalism & Communication Research.
  • Thorsen, E. , & Jackson, D. (2018). Seven characteristics defining online news formats . Digital Journalism , 6 (7), 847–868.
  • Thurman, N. , & Rodgers, J. (2014). Citizen journalism in real time: Live blogging and crisis events. In E. Thorsen & S. Allan (Eds.), Citizen journalism: Global perspectives (pp. 81–95). Lang.
  • Thurman, N. , & Walters, A. (2013). Live blogging: Digital journalism’s pivotal platform? Digital Journalism , 1 (1), 82–101.
  • Usher, N. (2014). Making news at The New York Times . University of Michigan Press.
  • Usher, N. (2019). Putting “place” in the center of journalism research: A way forward to understand challenges to trust and knowledge in news . Journalism & Communication Monographs , 21 (2), 84–146.
  • Vos, T. , & Heinderyckx, F. (2015). Gatekeeping in transition . Routledge.
  • Vujnovic, M. , Singer, J. B. , Paulussen, S. , Heinonen, A. , Reich, Z. , Quandt, T. , Hermida, A. , & Domingo, D. (2010). Exploring the political–economic factors of participatory journalism . Journalism Practice , 4 (3), 285–296.
  • Waisbord, S. (2018). Truth is what happens to news . Journalism Studies , 19 (13), 1866–1878.
  • Waisbord, S. (2020). Mob censorship: Online harassment of US journalists in times of digital hate and populism . Digital Journalism , 8 (8), 1030–1046.
  • Wall, M. (2017). Mapping citizen and participatory journalism . Journalism Practice , 11 (2–3), 134–141.
  • Wallace, J. (2018). Modelling contemporary gatekeeping . Digital Journalism , 6 (3), 274–293.
  • Weischenberg, S. , & Matuschek, C. (2008). News production and technology . In W. Donsbach (Ed.), The international encyclopedia of communication . Wiley.
  • Westlund, O. (2013). Mobile news: A review and model of journalism in an age of mobile media . Digital Journalism , 1 (1), 6–26.
  • Westlund, O. , & Ekström, M. (2018). News and participation through and beyond proprietary platforms in an age of social media . Media and Communication , 6 (4), 1.
  • Westlund, O. , & Ekström, M. (2019). News organizations and routines. In K. Wahl-Jorgensen & T. Hanitzsch (Eds.), The handbook of journalism studies (2nd ed., pp. 73–88). Routledge.
  • Westlund, O. , & Quinn, S. (2018). Mobile journalism and mojos . In Oxford research encyclopedia of communication . Oxford University Press.
  • White, D. M. (1950). The “gate keeper”: A case study in the selection of news. The Journalism Quarterly , 27 (4), 383–391.
  • Wu, S. , Tandoc, E. C. , & Salmon, C. T. (2019). A field analysis of journalism in the automation age: Understanding journalistic transformations and struggles through structure and agency . Digital Journalism , 7 (4), 428–446.
  • Xu, D. (2015). Online censorship and journalists’ tactics . Journalism Practice , 9 (5), 704–720.
  • Young, M. L. , & Hermida, A. (2015). From Mr. and Mrs. Outlier to central tendencies . Digital Journalism , 3 (3), 381–397.
  • Zamith, R. (2018). Quantified audiences in news production . Digital Journalism , 6 (4), 418–435.
  • Zamith, R. (2019a). Algorithms and journalism . In Oxford research encyclopedia of communication . Oxford University Press.
  • Zamith, R. (2019b). Transparency, interactivity, diversity, and information provenance in everyday data journalism . Digital Journalism , 7 (4), 470–489.
  • Zelizer, B. (1993). Journalists as interpretive communities . Critical Studies in Mass Communication , 10 (3), 219–237.
  • Zelizer, B. (2019). Why journalism is about more than digital technology . Digital Journalism , 7 (3), 343–350.

Printed from Oxford Research Encyclopedias, Communication. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 01 July 2024

A data journalist's guide to building a hypothesis

How an intentional exploration of inequity can help data journalists better serve communities

04 June 2021

By Eva Constantaras, Anastasia Valeeva

Our next Conversations with Data podcast will take place on Tuesday 6 July at 3 pm CEST / 9 am ET with Eva Constantaras from Internews and Anastasia Valeeva from the American University of Central Asia, Kyrgyzstan. During our live Q&A, they'll discuss the power of building a hypothesis for data journalism and what can be done to address inequity with data. The conversation will be our second live event on our Discord Server. Share your questions with us live and be part of our Conversations with Data podcast.

Introduction

2020 pulled data journalism in two drastically different directions. On the one hand, the Black Lives Matter movement forced the data journalism community to question equity in the field: who is data journalism produced by, for and about? On the other hand, the pandemic offered a plethora of opportunities to channel the firehose of coronavirus data into shiny, often impersonal, dashboards of despair and death that quantified the scale of the pandemic.

The best data-led pieces of the year married these two trends into powerful investigations into the pervasive inequities laid bare by the pandemic, transforming statistics into concrete examples of specific harm to people that could be mitigated if addressed. One word describes these outstanding investigations: intentional.

The stakes for data journalism in the face of media polarisation, misinformation and disinformation are high as it struggles to find a role in the efforts to rebuild a healthy information ecosystem for citizens. As Lisa Charlotte Rost of Datawrapper asks in her blog post Less News, More Context, “With which information can my audience navigate this world better?”

Almost 10 years of teaching data journalism has taught us that the journalists who produce the most powerful investigations are the ones who started with a powerful idea formulated as a hypothesis. This method, Story Based Inquiry, pioneered by Mark Lee Hunter, has been adopted by many data journalists and refined further for data projects, for example in The Markup Method. For us, it enables journalists around the world to harness data to explore and explain the drivers of inequality undergirding the news of the day.

One hypothesis -- many stories

After reviewing dozens of nearly identical coronavirus dashboards, we ran across a submission for the 2020 Sigma Awards that suggested the journalists had dug into the data knowing what they were looking for. The entry on the disproportionate number of deaths among Black Brazilians, by Publica, a non-profit investigative outlet, led us to more stories published by Publica on racial disparities in vaccine distribution and access to ICU beds among indigenous communities.


Though the data behind the stories was available to readers, the focus was the story, not the data. They have built a data journalism beat around disparities in healthcare access and a hypothesis-based approach allows them to drill deeper and deeper. They began with something like “Black Brazilians, who already scored low on an overall development index, are dying at faster rates than the general population” and then set out to see whether the hypothesis was true or not. Related stories refined this hypothesis to probe related disparities in healthcare equity during the pandemic. The rest of this story explores how to apply this approach yourself.


Formulating a hypothesis

Let’s read a couple of stories and formulate their hypothesis as a statement.


Perhaps you’ve come up with something like ‘Vaccine distribution is unequal’ or something more specific like ‘Vaccines are more available for high income countries in general, and, on an individual level, for wealthy people not the poor’.

They are both right. However, to be able to use a hypothesis as a tool for your own story, the second one works better. It formulates not only the idea, but also the means of proving it. This method is borrowed from social science, like a lot of data journalism techniques.

You don’t have to show this hypothesis-as-a-tool to your reader, but you do show it to your editor: it’s basically the pitch of the article. And since we want it to be convincing, it needs to be even more specific. What are the exact indicators that you will use to answer your questions? What is the unit of measurement? What time or geographic span are we looking at, and at what level of granularity? This is called the operationalisation process.

Let’s look at another story and formulate its hypothesis as a statement that is quite specific about the indicators.


You may have spotted that the text itself has both a general idea (“the drop in employment is not gender-neutral”) and more specific statements that prove this idea, like this one: “The sectors most affected in the pandemic crisis -- restaurants, retail, beauty, tourism, education, domestic work, and care work for the young and elderly -- have high female employment”.

Let’s write out the basic requirements for a viable hypothesis using a sample hypothesis: “Socio-economically marginalised groups are more likely to die of the coronavirus”.

  • Can either be proven or disproven with data. For example, ‘Poor people are more likely to die of coronavirus than rich people.’
  • Is specific about what is being measured. ‘Citizens living in areas of the city with a lower annual income according to the latest census are dying at a higher rate than those living in richer neighbourhoods.’
  • The data is available. ‘Coronavirus death records and income data are available by neighbourhood.’
  • The topic is important to the public. ‘Inequity in healthcare access resonates universally.’
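As a rough illustration of the second and third requirements, the sketch below tests the neighbourhood-level version of the hypothesis. It assumes you already hold one record per neighbourhood with income, population and death counts. The records, the field order and the choice to split areas at the median income are all assumptions made for this example, not figures or methods from any of the stories discussed here.

```python
from statistics import median

# Hypothetical records: (neighbourhood, median annual income, population, COVID-19 deaths).
# Every figure here is invented purely for illustration.
NEIGHBOURHOODS = [
    ("A", 12_000, 50_000, 90),
    ("B", 15_000, 40_000, 60),
    ("C", 38_000, 30_000, 21),
    ("D", 52_000, 20_000, 10),
]


def deaths_per_100k(rows):
    """Pooled death rate per 100,000 residents for a group of neighbourhoods."""
    population = sum(row[2] for row in rows)
    deaths = sum(row[3] for row in rows)
    return deaths / population * 100_000


def compare_income_groups(rows):
    """Split neighbourhoods at the median income and compare pooled death rates."""
    cutoff = median(row[1] for row in rows)
    lower = [row for row in rows if row[1] < cutoff]
    higher = [row for row in rows if row[1] >= cutoff]
    return deaths_per_100k(lower), deaths_per_100k(higher)


low_rate, high_rate = compare_income_groups(NEIGHBOURHOODS)
print(f"Lower-income areas: {low_rate:.1f} deaths per 100,000")
print(f"Higher-income areas: {high_rate:.1f} deaths per 100,000")
print("Hypothesis supported" if low_rate > high_rate else "Hypothesis not supported")
```

Comparing rates per 100,000 residents rather than raw death counts keeps large and small neighbourhoods comparable. A real analysis would also need to account for factors such as age structure before drawing conclusions.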

How to avoid common pitfalls

Now, let’s look at the common mistakes for hypotheses and how we can avoid them.

  • One half or both halves of the hypothesis cannot be proven with data. In many countries, neither geo-located death data nor geo-located income data is available. For example, in the case of Brazil, only race data was available, so the hypothesis had to focus on race by geographical area, not income.


  • The hypothesis is too fuzzy. The idea for a data story can often start from a broad, general idea like: ‘As the pandemic deepens, most EU countries become more pessimistic’. To make it work (and for anybody to care), you need to explain to yourself and to your audience what you mean exactly, how you will measure it and why it matters. In this Reuters story, the hypothesis may have been something like “Swedes and their pandemic policies were optimistic and open and they escaped the economic downturn that has spread across Europe”. Note that the story walks a fine line, presenting various correlations between attitude and economic indicators without making a causal claim.
  • There is no data. Too many ideas for data stories die young because it turns out there is no data to prove them. A lot of great data stories have emerged from journalists being resourceful with the data they do have, making the data gap the story or creating their own data. For example, this is how journalists around the world, in India and in Kyrgyzstan tackled the global undercount in COVID-19 deaths by building hypotheses around data quality issues.
  • The hypothesis is too broad. The topic is better for a book than a single story. Often, journalists try to tackle far too much in one story. It would take enormous time to explore all the variables that might influence the general problem. So why not focus on a specific aspect of your problem and explain it from A to Z? Instead of having a huge COVID-19 data dashboard with lots of demographic data but no stories, drill down and identify specific, compelling stories that justify having a database. For example, in our India job loss example, the journalist has a hypothesis focussed on job loss related to the sectors where women are employed. This story pursues a related but distinct hypothesis: care work during the pandemic is forcing women out of the workforce.


Both of these reveal specific insights into barriers to economic recovery faced by women without getting lost in obvious generalisations about gender inequality.

  • The hypothesis is too narrow: it only measures how one factor influences a trend and discounts other data sources that might also contribute to it. Here is an example of how Rappler in the Philippines has dealt with the difficulty of identifying a pattern in the surge of coronavirus cases. While they start with a hypothesis about spikes in busy commercial areas, they also address the possible influence of other factors, such as the concentration of health and safety protocol violations.


  • The hypothesis has already been proven true and is common knowledge.

A lot of data journalists around the world have shied away from “the procurement process is corrupt” stories because of course it is! Instead, they use very narrow examples to pursue accountability on a local level. Pajhwok Afghan News’ data team pursued a hypothesis related to price inflation in the procurement of specific medical supplies. Dataphyte in Nigeria so aggressively pursued individual contracting irregularities that they forced the government to divulge more contract details.

The good news is that you can almost always make a weak hypothesis stronger by doing the research needed to make it more verifiable, specific, interesting and concise. Another piece of good news is that even if you prove your hypothesis false, what you did find is probably still a compelling, and maybe even a more surprising, story.

From hypothesis to questions

And now let’s dive a little deeper. The hypothesis-driven approach also lends itself well to developing research questions to prove your hypothesis true or false. Sticking with research questions that probe your hypothesis serves the same purpose as writing out interview questions for a difficult source ahead of time: it allows you to organise your thoughts and ensure you get the answers you need.

Let’s read this data story and pull out the major findings. Then we will reverse engineer the hypothesis and questions:


If we list the data arguments in this piece, we can get something like this:

  • The majority of Indigenous Lands (TIs) in the Amazon have been identified as in critical condition due to the coronavirus pandemic in Brazil.
  • Of 1,228 Brazilian municipalities where there is at least a stretch of TIs, only 108 have an ICU bed, so less than 10% of Brazilian municipalities with indigenous lands have ICU beds.
  • More than 80% of all TI lands in the country are concentrated in the North, precisely the region that, along with the Northeast, has the largest ICU deserts in the country.
  • The maternal mortality rate for indigenous people is the highest among all races, even when controlling for socioeconomic level. The deaths among those in the indigenous community are undercounted.
  • Among the 10 regionals that have been identified as most vulnerable to the coronavirus, seven haven’t been officially recognised for protected indigenous status.
  • About four out of five households in indigenous territories did not have a water supply and a third of households on indigenous lands did not have a bathroom for exclusive use.
  • In 17 TIs, at least one-fifth of the population was over 50 years of age, which is considered a risk factor for coronavirus.
  • Researchers have called for the establishment of specific strategies for the care of indigenous peoples.
  • Another recommended solution is the construction of field hospitals exclusively for indigenous people.
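The ICU bed finding in the list above rests on a simple share calculation, and it is worth checking such arithmetic before publishing. A minimal sketch, using only the two figures quoted in the list (the function and variable names are ours, chosen for illustration):

```python
def share(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole * 100


municipalities_with_ti = 1228   # municipalities with at least a stretch of TIs
municipalities_with_icu = 108   # of those, municipalities with an ICU bed

pct = share(municipalities_with_icu, municipalities_with_ti)
print(f"{pct:.1f}% of municipalities with indigenous lands have an ICU bed")
# prints "8.8% of municipalities with indigenous lands have an ICU bed"
```

The result, about 8.8%, is consistent with the story's statement that the share is below 10%.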


From this list of answers, we can reverse engineer a hypothesis and a list of questions:

Hypothesis:

  • Indigenous communities are facing an acute health crisis during the pandemic due to under-resourced health facilities and underlying health conditions.

Questions:

  • Are indigenous communities dying at a disproportionately high rate?
  • Do indigenous communities have worse access to ICU beds than the rest of the country?
  • What proportion of indigenous lands are considered in critical condition now?
  • Are indigenous communities considered to be in a more critical condition during the pandemic than the rest of the country?
  • What proportion of the population of indigenous communities is considered high risk?
  • How did maternal mortality rates of indigenous people compare to the general population before the pandemic?
  • How did access to clean water in indigenous communities compare to the rest of the population before the pandemic?
  • How complete are death records considered in indigenous territories compared to the rest of the country?
  • How complete are death records among indigenous communities?
  • How complete is the registration of Indigenous Territories?
  • What strategy can be employed to close the gap in access to healthcare and mitigate the vulnerability of indigenous people?

We can see these questions touch on different parts of the problem. While some describe the scale of the problem, others focus on the impact of the problem on a particular group of people, and others dive into the causes and factors behind that. Finally, there are questions about the possible solutions or ways to mitigate these consequences.

You can apply this general list of questions to nearly every data story that dives into the roots of a problem and aims to build a concise narrative around it:

  • How big is the problem?
  • Is it getting worse or better?
  • Which category of people is more likely to experience the consequences of the problem/benefit from the situation?
  • How does the problem affect this group of people?
  • What are the main causes explaining why the problem is disproportionately affecting these people?
  • Which factors have contributed to this?
  • What needs to be fixed for the impacted group of people to mitigate the consequences or solve the problem for them?
  • How much would it cost and is there a source of money for this?
  • Has anybody already tried to solve this problem, here or elsewhere?
  • How can we measure the effectiveness?

These questions help the story remain focused on the specific hypothesis that the journalists have set out to prove or disprove. The questions ensure they drill deep into the issue and explain the problem from various angles using data. A great data hypothesis consists of questions that can be answered with data to prove or disprove it.

In conclusion, a good hypothesis can be proven or disproven with the data that exists and generates new insights into an issue. It also measures the problem, causes, impact and solutions.

A hypothesis is a great way to build up beat reporting around an issue your audience cares about. For example, check out these variations of the previous hypothesis:

  • Indigenous communities are facing an acute economic crisis during the pandemic due to under-resourced economic recovery programmes and chronic lack of local investment.
  • Indigenous communities are facing an acute education crisis during the pandemic due to an under-resourced education system and chronic lack of access to the internet.

Many favourite issues covered by data journalists, such as politics, healthcare, education and the economy, are universal. Reading how other data journalists explore and explain these issues is a way to find inspiration to generate meaningful stories about and for your community and help communities make sense of pressing issues like inequity. Adopting a hypothesis-driven methodology establishes a workflow to build data-driven beat reporting around complex, often misunderstood problems that are not going away anytime soon and require meaningful and informed citizen engagement to change the status quo.


Eva Constantaras is a data journalist specialised in building data journalism teams in the Global South. These teams have reported from across Asia, the Middle East, Latin America and Africa on topics ranging from broken foreign aid and food insecurity to extractive industries and public health. As a Google Data Journalism Scholar and a Fulbright Fellow, she developed a pedagogical approach and manual for teaching investigative and data journalism in high-risk environments. Follow her on Twitter: @evaconstantaras


Anastasia Valeeva is a data journalism trainer and open data researcher. She has taught data journalism in Europe, the Balkans, Central Asia and Russia and is currently a data journalism lecturer at the American University of Central Asia, Kyrgyzstan. She is also a co-founder of School of Data Kyrgyzstan. She has researched the use of open data in investigative journalism as part of her fellowship at the Reuters Institute for the Study of Journalism, Oxford. Follow her on Twitter: @anastasiajourno



A data journalist's guide to building a hypothesis - How an intentional exploration of inequity can help data journalists better serve communities


Al Jazeera Media Institute

2024 Al Jazeera Media Network. All rights reserved.

Al Jazeera Journalism Review


Investigative journalism: How to develop and manage your sources


Your sources are the backbone of any investigation. In Part 3 of our series on investigative journalism, we look at how to find, foster and manage them

Once an investigative journalist has decided on the questions they wish to find answers to in the course of an investigation, they next need to find the sources who will help them to find those answers. 

A journalist’s ability to build a network of digital, material and human sources for each question will depend on their experience and how deeply they have researched the subject of the investigation. 

Reliable sources are the backbone of any investigation. They are the gateway to facts, to background and context. They feed into one another, lending one another credence. A good journalist makes their sources work for one another. The more sources an investigation cites, the stronger it is. Diverse and numerous sources make an investigation more balanced and objective.

Open-source material is a key resource for investigative journalists. Today there is much information that is no longer secret or exclusive to government bodies. It appears on government websites, in annual reports, in official gazettes, news bulletins, commercial registries, stock markets, tendering procedures, and daily and weekly newspapers. The same applies to public libraries and reports from civil society and international organisations.

A good journalist begins their research with open source materials. They gather, analyse and assess information, perhaps creating a special database to facilitate new discoveries and conclusions that they can then try and corroborate using their own network or any of the other sources available to them.


Paper sources 

Documents can provide an initial source that is entirely reliable so long as they are not forged. They also cannot be modified or refuted in court if the journalist is sued, unlike human sources, whose narrative may change. Paper sources help to expand the scope of the investigation and may lead to other primary and secondary sources. But they may also be protected by privacy laws. Journalists should refer to the law of the country in which they are active and seek expert legal advice in this regard. 

One difficulty that journalists face here is accessing paper sources. This requires open source research skills (including the use of subscription databases) and may also involve convincing human sources to help you get your hands on them. It may also involve the use of freedom of information laws. Paper sources sometimes require expert analysis and explanation (budgetary or accounting documents, for example, or court judgements). Some may need translation if the journalist is not proficient in the language they are written in. For this you need a human source whom you can trust to help you out without breaching confidentiality.


Human sources

These can be both primary and secondary. 

Primary sources are people directly connected with the event that a journalist is investigating - victims, eyewitnesses, people responsible, people who intervened or participated, etc. In an attempted murder, for example, the primary sources would include the injured party, eyewitnesses who saw the shooting and the perpetrator opening fire, the driver who brought the would-be killer to the scene, his partner in crime who planned the shooting, the seller who provided the weapon, the person who sheltered the perpetrator after the crime, the doctor who produced the medical report about the incident, the detectives investigating and the prosecutor who inspected the crime scene. These all have a direct relationship with the event and are thus primary sources in the investigation. 

It would be inappropriate for the journalist to listen to testimony from sources who were not present - who had heard about it on the radio, for example, or from someone else, who heard it from someone else, who heard it from elsewhere. 

All of the primary sources listed in the shooting case above will have different stories, despite being closest to the event. Eyewitnesses will have seen the incident from different angles - and of course, every source has their own biases, their own personal, psychological and social makeup that affects their view of events. Some might lie, give unsubstantiated suggestions, exaggerate, omit things that they do not like, conflate facts, or make conjectures and assumptions. Journalists should never be afraid to ask: How do you know? Examining information carefully is our first responsibility.

The essence of a journalist’s work is to relate an event for which they were not present - which they did not see themselves. Their role requires collating precise and verified information from primary sources in order to get a complete and clear picture of what happened. This is why it is so important to have a wide range of varied sources. 

A journalist can interview sources for many different and interconnected reasons - that is, there are all sorts of reasons why they might be newsworthy. They might meet them because they occupy an important post, or because they are investigating something important, like a scientific discovery. They might have won a prize, or be experts in a particular issue. They might know something or someone of relevance to an investigation, have seen something of importance (eyewitnesses to a crime, for example), or have had something happen to them (victims or survivors). We have to discriminate between sources. There are public figures who have earned the public’s trust and interest and who have worked hard to further the public interest in a particular area (be it big or small). Then there are temporary public figures who become briefly famous because of a particular incident.


And then there are private or normal persons. Public figures are used to dealing with the media, to being pursued by cameras or journalists looking for statements. They know that their personal space is delimited by public space and by the right to criticise, unlike a private person who has no experience working with the media and has a right to privacy. 

The ideal human source in investigative journalism is the closest person to the story who can be relied upon and trusted, who is willing to stand by their statements publicly and who is easily and safely accessible. 

Journalists should look for human sources that meet all these criteria at the same time. How feasible it is to get access to sources varies from person to person and according to a journalist’s ability to convince them of the importance of being interviewed. 

Sources that are the object of suspicion or accusation will talk to a journalist when they believe that everyone else involved in the event either has talked or might talk to the journalist. And, of course, you need to make sure that access to the source is safe - a meeting with a mafioso or a violent organisation may put the journalist’s life in danger. 

An investigation may result in a battle of wills between a journalist looking to expose mistakes or wrongdoing and sources close to the event that may want to keep some information under wraps. Experts suggest interviewing victims, those hurt by an event or enemies of the parties under investigation first, because they are more likely to want to talk about the thing you are investigating. 

By communicating with and meeting sources, journalists aim to collect facts and important information and to gain access to documents and supporting evidence. They should ask for this directly and clearly: do you have any documents that support your story? 

The aim of an investigation is to produce a well-substantiated narrative. Sensational claims have no value so long as they are not backed up by convincing evidence. A meeting should also aim to acquire new sources for the investigation. Journalists should ask their sources to point them towards others who can corroborate their story: who knows about this other than you? Who was with you when it happened? What role did they play? Please give me their names and tell me how to contact them. This is a central part of the process of building a network of sources, and helps make sources feel important, as well as adding many new voices to the investigation. 

Journalists use human sources to corroborate the information they have gathered and the conclusions they have reached and to test the credibility of other sources. A good journalist knows that sources conflate facts and assumptions, and seeks the help of experts in analysing documents and interpreting events. Experts typically provide unbiased, in-depth and detailed information. 

A good journalist can also get sources to talk honestly. This requires effort, practice, politeness, and determination. It also requires sticking to your guns when necessary. 

The approach taken when interviewing victims is totally different from the approach used when interviewing those responsible for an incident, and different again from the approach to interviewing experts. What all of them have in common, however, is good preparation, and the greatest possible knowledge of the source and their personality as well as what they are looking to get out of the interview. 


Some sources will have worries or fears that will mean they say no to an interview. Some may fear for their lives or for their families, particularly in oppressive or authoritarian countries. Others will be afraid of difficult questions, or fear for their professional or social future after publication. Some will be scared of scandal, of being held accountable, or of feeling guilty. Some will lack confidence in your professional ethics or probity and will worry that you will twist their words. And some will simply think they do not know enough about the subject under investigation, and may be embarrassed to talk to the press. 

Dealing with these anxieties and setting fears to rest requires journalists to engage in dialogue with their sources, to make multiple attempts, to be gentle and to negotiate. They should emphasise whatever might get them to speak – openness, justice, ambition, anger, exhibitionism or a desire for attention, a desire for authority, serving the public interest, or the opportunity to give their side of the story. 

Sources sometimes request anonymity for fear of the consequences of publication, especially when they are providing secret or ultra-sensitive information concerning corruption, organised crime or maladministration. The right of a journalist to keep their sources secret is enshrined in law in many countries, but it is also, more importantly, an ethical and professional duty that has become a key part of international human rights law. 


Journalists should make every effort to establish why a source is requesting anonymity. They should establish whether they simply mean that they do not want their name or job title to be published, whether they do not want the information itself to be published, or both. 

When sources request protection of this kind, they usually mean not publishing the source of the information - not the information itself. But journalists should be careful to make sure of this. It is the information itself that is important. You should thus ask your sources who else is aware of the information they are providing. If there are several people, you should ask for their names and how to get in contact with them, without divulging your original source. You can then publish the information in your report and note that it has been confirmed by various sources - thereby protecting the original source. But if the source still refuses to allow publication of their name, then you have to respect their wishes.

The right to protect sources is not an absolute right. There are various understandable and justifiable exceptions subject to a three-part test: interest, legitimacy and legal necessity. This is the same test used for freedom of expression. Concealing your sources is a practice governed by rules. The use of anonymous sources has discredited many a journalist. Likewise, professional and ethical rules emphasise the necessity of providing your sources when publishing: the public has a right to examine the credibility of the information provided. You should only conceal sources’ identity under rare, exceptional and carefully defined circumstances. 

The following points should be borne in mind:

  • You should only promise not to mention a source’s name with the agreement of your editor or editor-in-chief. 
  • Anonymous sources should only be used for a clear and justifiable reason. 
  • Anonymous sources should only be used when there is no other option. 
  • Sources should be named in the report whenever possible. If you have to use an anonymous source, you should explain why within the report. 
  • An editor must weigh up the advantages and disadvantages of using an anonymous source. 
  • Anonymous sources should only be used with the agreement of all parties: the journalist, the source, and the news outlet. 
  • If you use an anonymous source, you must confirm the information provided using another source.


Many experts in the ethics of journalism say that information should never be attributed to an anonymous source unless it is confirmed by at least four other sources.

In the Watergate investigation that brought down Nixon, Woodward and Bernstein did not depend on a single source as some have claimed. Deep Throat, the anonymous tipster, served only to confirm information and help them avoid major errors.

When using child sources, you should always secure the consent of their parent or guardian to conduct an interview. It is better not to push them into agreeing. You should also bear in mind the necessity of hiding their identity if the interview will affect their future prospects. Journalists should take care to protect children’s lives and their mental and physical health from any risk created by the interview.

Care should be taken when dealing with children’s testimony. Children are particularly likely to conflate fact and opinion, what they have seen and what they have heard. They are also particularly likely to exaggerate or misrepresent the truth. If the child is an eyewitness, you should let them talk spontaneously without pushing them or asking them yes-no questions. You should always rely primarily on sources who are of the legal age of responsibility and fully understand what they are doing in order to obtain precise, verified evidence. 

Digital sources

Government and private databases, websites, video and audio libraries and social media constitute a vast archive that, with analysis, can help you get at the facts. Digital sources provide strong and coherent evidence with historical background and context as well as authenticity. But this all depends on whether you are able to obtain the necessary data and information.

Digital sources provide rapid, cross-border information. You can easily obtain data from the UN Security Council, the UNHCR or the Library of Congress from the comfort of your own home. Digital sources produce soulless investigative reporting if they are not supplemented by human sources close to the issue. They also require careful verification, because they are easy to modify and decontextualise. 

What constitutes perfect evidence? What constitutes substantiated evidence? What makes evidence unusable? An investigative journalist proves their hypothesis using one or more of the following techniques, depending on the subject matter, the nature of the investigation, and the environment in which they are working.

Intersecting sources and testimony

Witness testimony is one of the main forms of evidence used by investigative journalists to prove their hypotheses. Testimony must be provided by primary sources directly linked to the event, and should be provided by several different and independent sources who cannot have decided on their story beforehand. They should provide precise and coherent narratives that attest to the fact that the event took place and link the characters to it. For any piece of information to be acceptable, it requires corroboration by at least two known, independent sources. Several sources may depend on a single original source, and in this case they should all be taken as one and not treated as separate sources for the purposes of corroboration. 
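The corroboration rule above can be sketched as a small bookkeeping exercise: sources that trace back to the same original source are collapsed into one before counting. The example below is a simplified illustration only. The source names, the dictionary layout and the default threshold of two are assumptions for the sketch, and it does not replace a journalist's judgement about each source's credibility and motives.

```python
def independent_source_count(sources):
    """Count sources after collapsing those that trace back to the same origin.

    `sources` maps each source to the source it learned the information from,
    or to None if it has first-hand knowledge of the event.
    """
    def origin(name):
        seen = set()
        # Follow the chain of "heard it from" links back to a first-hand source.
        # The `seen` set guards against circular attributions so the loop ends.
        while sources.get(name) is not None and name not in seen:
            seen.add(name)
            name = sources[name]
        return name

    return len({origin(name) for name in sources})


def is_corroborated(sources, minimum=2):
    """True if at least `minimum` independent sources support the information."""
    return independent_source_count(sources) >= minimum


# Hypothetical example: three accounts that all originate with one eyewitness,
# plus a doctor with first-hand knowledge.
accounts = {
    "eyewitness": None,
    "neighbour": "eyewitness",
    "cousin": "neighbour",
    "doctor": None,
}
print(independent_source_count(accounts))  # prints 2
print(is_corroborated(accounts))           # prints True
```

Although four people are quoted, only two independent lines of testimony exist, which is the minimum the text requires.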

An investigative journalist compares and inspects different accounts in order to work out where they intersect. They should test the credibility of each source and their narrative throughout the interview, and ask for evidence that supports their testimony. They should try to establish sources’ motives and weigh up the information acquired accordingly, making sure that it makes logical sense. 

A sceptical attitude is one of the best defences against being manipulated, and even the most precise source should be treated with caution. A source should be able to take responsibility for their words and their actions - they should be of sound mind, for example, and ideally an adult. When dealing with traumatised victims, it is preferable to avoid interviewing them in order to extract statements: in such cases the victims may not be fully in control of their actions. 

An investigative journalist should work hard to get access to all the primary sources on both sides of the investigation - that is, for and against. They should have recourse to independent, neutral third parties familiar with the facts who can help them assess how accurate and trustworthy a narrative is. Where anonymous sources are used, you should follow the same steps given above - after making sure that the information is correct and reliable and that there is no other way of sourcing it from elsewhere. Anonymous sources whose names cannot be published are the least useful of all when it comes to proving anything. 

Majdolin Alan and Imad Rawashdeh, for example, used several intersecting sources to substantiate their findings when they investigated torture and sexual abuse in Jordanian government orphanages in 2009. 

They compared the accounts of 30 former and current orphanage residents, as well as testimony given by former and current orphanage employees. All of the testimony was corroborated and signed, in some cases using fingerprints where the witnesses were not literate. 

Alan and Rawashdeh’s investigation also drew on leaked documents from the orphanages themselves which supported their sources’ claims. They also made use of governmental and semi-governmental reports and studies that confirmed the existence of the problem. Some orphanage residents were convinced to undergo medical examinations at a civil society organisation, which confirmed that some of them were suffering from psychological, physical or sexual illnesses.

In another example, the Al Jazeera film Execution By Transport showed how thirty-seven political prisoners choked or were burned to death in a transport van at Egypt’s Abu Zaabal Prison in August 2013. 

The film was able to prove the events by drawing on overlapping eyewitness testimony from bystanders, survivors and Inspector Abdelaziz, a member of the prisoner protection detail. These accounts were bolstered with technical expertise from the coroner and from an engineering report on the transport van, which confirmed that the vehicle was carrying more than three times its capacity and had no ventilation or drinking water under extreme weather conditions. 

They also drew on meteorological reports, data provided by civil society organisations, and court documents from the official investigation. The overlapping human and material sources used in this investigation allowed it to develop information into facts: the van did not have sufficient space for 45 prisoners, the prisoners were kept inside for more than eight hours straight, the temperature was above 40 degrees, the van lacked ventilation and no drinking water was provided, and the coroner’s report listed asphyxiation as the cause of death. All of these facts confirmed that there was foul play at work and neglect of political prisoners opposed to the new regime. 

This article has been adapted from the AJMI Investigative Journalism Handbook

Hypothesis for Publishers

Until now, researchers have had no uniform way of taking and organizing personal notes, collaborating with others, or engaging in public discussions on documents across the web. Existing solutions are limited in scope, proprietary and poor quality.

Publishers struggle to differentiate themselves when their content ends up as PDFs uploaded to large scholarly networks, where the true engagement happens.

We envision a world in which a standards-based annotation capability functions everywhere on the web without restriction.

Your content can stay on your site, and can be enhanced by layers of annotation curated by authors, community members and others.

Author’s notes, invited discussions, published reviews, enhanced footnotes and better tools for researchers can give your journal the edge that increases submissions and keeps your readers at your site.

Your content shouldn’t have to be uploaded somewhere else in order to come alive.

Choose a community, not just a product.

Join the community that’s taking control of its future by adopting open-source, standards-based annotation technology instead of proprietary, siloed offerings that are mutually incompatible.

Hypothesis works at the center of the open annotation community.

  • We created the space, built the first framework, and drove the standards that now define it.
  • In 2015, we launched the Annotating All Knowledge coalition that now counts over 70 of the largest scholarly publishers and platforms.
  • We also host I Annotate, the industry conference on annotation now in its fifth year, which brings together developers, educators, journalists, publishers, scholars and many others involved in bringing annotation to the world.

It works everywhere.

We offer a unique proposition: the ability for researchers, scholars, students and citizens to take notes on or collaborate around any content anywhere.

Services which only work where they’re embedded fundamentally fail the promise that standards-based web annotation offers, namely a single platform that works everywhere.

Why will your users adopt annotation if it only works on your content?

Open source.

Open-source technology brings many benefits — most fundamentally that others can contribute to, extend, or fork the code. It also helps protect you from being locked into technology that you can’t control. Hypothesis’ permissive BSD-2 license means that if you ever decide you want to run it yourself, you’re free to do so — no questions asked, no royalties due. We believe passionately that in order for this new paradigm to take root at web scale, it must be completely free and open in every sense.

Become a partner.

Hypothesis is a mission-focused organization supported by like-minded partners like you. We are here for the long haul and our open-source codebase cannot be bought by your competition, or anyone else who wants to shut it down.

A shared roadmap.

Our partners decide our roadmap and in many cases also fund new development, contributed back as open-source improvements to our shared codebase. As an example, NYU Press approached us to suggest that we work with the ReadiumJS core team to bring annotation to ebooks. That work will finish Summer 2017, after which it will be available for anyone to reuse for their own purposes.

Best-in-class features.

We believe open-source software should be substantially better than its closed counterparts. With industry firsts like private groups, publisher groups, moderation, UX customization, HTML <> PDF cross-format anchoring, DOI support, direct linking, a powerful faceted search function, an open API, support for MathML and rich media, Hypothesis has led essentially every innovation in this space, period. Our roadmap for the years ahead is packed with key improvements that will dramatically expand the capabilities and potential of web annotation from here.

Easy to implement.

Adding Hypothesis to your platform often just takes minutes. In its simplest form, it’s just a single line of JavaScript added to your templates. We also offer open-source libraries to facilitate enabling annotation on PDFs, and full documentation for those who want to integrate with our API or do even more.

Annotation in action

Post-publication discussion, annotation as content, publisher groups, teaching & learning.

Screenshot of annotations added to a published scientific journal article.

Our partners

Logo for American Association for the Advancement of Science.

Our memberships

Logo for Crossref.

Expert Commentary

5 things journalists need to know about statistical significance

Statistical significance is a highly technical, nuanced mathematical concept. Journalists who cover academic research should have a basic understanding of what it represents and the controversy surrounding it.

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

by Denise-Marie Ordway, The Journalist's Resource, June 23, 2022

This article (https://journalistsresource.org/home/statistical-significance-research-5-things/) first appeared on The Journalist's Resource (https://journalistsresource.org) and is republished here under a Creative Commons license.

It’s easy to misunderstand and misuse one of the most common — and important — terms in academic research: statistical significance. We created this tip sheet to help journalists avoid some of the most common errors, which even trained researchers make sometimes.

When scholars analyze data, they look for patterns and relationships between and among the variables they’re studying. For example, they might look at data on playground accidents to figure out whether children with certain characteristics are more likely than others to suffer serious injuries. A high-quality statistical analysis will include separate calculations that researchers use to determine statistical significance, a form of evidence that indicates how consistent the data are with a research hypothesis.

Statistical significance is a highly technical, nuanced concept, but journalists covering research should have a basic understanding of what it represents. Health researchers Steven Tenny and Ibrahim Abdelgawad frame statistical significance like this: “In science, researchers can never prove any statement as there are infinite alternatives as to why the outcome may have occurred. They can only try to disprove a specific hypothesis.”

Researchers try to disprove what’s called the null hypothesis, which is “typically the inverse statement of the hypothesis,” Tenny and Abdelgawad write. Statistical significance indicates how inconsistent the data being examined are with the null hypothesis.

If researchers studying playground accidents hypothesize that children under 5 years old suffer more serious injuries than older kids, the null hypothesis could be there is no relationship between a child’s age and playground injuries. If a statistical analysis uncovers a relationship between the two variables and researchers determine that relationship to be statistically significant, the data are not consistent with the null hypothesis.

To be clear, statistical significance is evidence used to decide whether to reject or fail to reject the null hypothesis. Getting a statistically significant result doesn’t prove anything.

Here are some other things journalists should know about statistical significance before reporting on academic research:

1. In academic research, significant ≠ important.

Sometimes, journalists mistakenly assume that research findings described as “significant” are important, noteworthy or newsworthy. That’s typically not correct. To reiterate, when researchers call a result “statistically significant,” or simply “significant,” they’re indicating how inconsistent the data are with the null hypothesis, not how much the finding matters.

It’s worth noting that a finding can be statistically significant but have little or no clinical or practical significance. Let’s say researchers conclude that a new drug drastically reduces tooth pain, but only for a few minutes. Or that students who complete an expensive tutoring program earn higher scores on the SAT college-entrance exam — but only two more points, on average. Although these findings might be significant in a mathematical sense, they’re not very meaningful in the real world.
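The tutoring example can be made concrete with a rough calculation. This is only a sketch under stated assumptions: the two-point average gain comes from the example above, while the score spread of 200 points and the 200,000 students per group are invented for illustration, and the function names are hypothetical. It uses a simple two-sample z-test, which assumes the spread is known.

```python
import math


def two_sample_z_test(mean_diff, sd, n_per_group):
    """Return (z, two-sided p-value) for a difference in means between
    two equal-sized groups that share a known standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def cohens_d(mean_diff, sd):
    """Standardized effect size: the difference measured in standard deviations."""
    return mean_diff / sd


if __name__ == "__main__":
    # Assumed inputs: 2-point gain, spread of 200 points, 200,000 students per group.
    z, p_value = two_sample_z_test(mean_diff=2.0, sd=200.0, n_per_group=200_000)
    print(f"z = {z:.2f}, p = {p_value:.4f}, effect size d = {cohens_d(2.0, 200.0):.3f}")
```

With a sample this large, the p-value falls well under 0.05 even though the effect size is about one-hundredth of a standard deviation. The result is statistically significant and practically negligible at the same time.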

2. Researchers can manipulate the process for gauging statistical significance.

Researchers use sophisticated software to analyze data. For each pattern or relationship detected in the data — for instance, one variable increases as another decreases — the software calculates what’s known as a probability value, or p-value.

P-values range from 0 to 1. If a p-value falls under a certain threshold, researchers deem the pattern or relationship statistically significant. If the p-value is greater than the cutoff, that pattern or relationship is not statistically significant. That’s why researchers hope for low p-values.

Generally speaking, p-values smaller than 0.05 are considered statistically significant.
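To see how a p-value is compared against the 0.05 cutoff, consider a deliberately simple case that is not from the tip sheet: a coin is flipped 100 times and the null hypothesis is that the coin is fair. The sketch below computes an exact two-sided p-value, meaning the probability, if the coin really is fair, of a result at least as far from 50 heads as the one observed. The function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, trials):
    """Exact two-sided p-value under the null hypothesis of a fair coin (p = 0.5).

    Sums the probabilities of every outcome at least as far from the
    expected count as the observed one.
    """
    expected = trials / 2
    distance = abs(successes - expected)
    extreme_outcomes = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - expected) >= distance
    )
    return extreme_outcomes / 2 ** trials


if __name__ == "__main__":
    for heads in (60, 61):
        p_value = binomial_two_sided_p(heads, 100)
        verdict = "significant" if p_value < 0.05 else "not significant"
        print(f"{heads} heads out of 100: p = {p_value:.4f} ({verdict} at 0.05)")
```

Under this test, 60 heads lands just above the cutoff and 61 heads lands just below it. A single extra head flips the label, which shows how arbitrary a fixed threshold can be.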

“P-values are the gatekeepers of statistical significance,” science writer Regina Nuzzo, who’s also a statistics professor at Gallaudet University in Washington, D.C., writes in her tip sheet, “Tips for Communicating Statistical Significance.”

She adds, “What’s most important to keep in mind? That we use p-values to alert us to surprising data results, not to give a final answer on anything.”

Journalists should understand that p-values are not the probability that the hypothesis is true. P-values also do not reflect the probability that the relationships in the data being studied are the result of chance. The American Statistical Association warns against repeating these and other errors in its “Statement on Statistical Significance and P-Values.”

And p-values can be manipulated. One form of manipulation is p-hacking, when a researcher “persistently analyzes the data, in different ways, until a statistically significant outcome is obtained,” explains psychiatrist Chittaranjan Andrade, a senior professor at the National Institute of Mental Health and Neurosciences in India, in a 2021 paper in The Journal of Clinical Psychiatry.

He adds that “the analysis stops either when a significant result is obtained or when the researcher runs out of options.”

P-hacking includes:

  • Halting a study or experiment to examine the data and then deciding whether to gather more.
  • Collecting data after a study or experiment is finished, with the goal of changing the result.
  • Putting off decisions that could influence calculations, such as whether to include outliers, until after the data has been analyzed.
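The first item on this list, checking the data midway and then deciding whether to gather more, can be illustrated with a short simulation. The sketch below is not taken from Andrade’s paper. It assumes there is truly no effect (every data point is random noise with mean 0 and standard deviation 1), and compares a researcher who tests once after 100 observations with one who tests after every 10 and stops at the first p-value under 0.05. All names and settings are illustrative.

```python
import math
import random


def p_value_two_sided(sample):
    """Two-sided z-test p-value for 'the true mean is 0', assuming a known SD of 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2.0))


def false_positive_rate(peek, trials=2000, max_n=100, step=10, alpha=0.05, seed=1):
    """Share of no-effect experiments that end up declared 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        data = []
        significant = False
        for _ in range(max_n // step):
            data.extend(rng.gauss(0.0, 1.0) for _ in range(step))
            # A peeking researcher tests after every batch and stops on success.
            if peek and p_value_two_sided(data) < alpha:
                significant = True
                break
        if not peek:
            # A disciplined researcher tests once, after all data are in.
            significant = p_value_two_sided(data) < alpha
        hits += significant
    return hits / trials


if __name__ == "__main__":
    print("test once at the end:", false_positive_rate(peek=False))
    print("peek every 10 points:", false_positive_rate(peek=True))
```

Testing once keeps the share of false alarms near the nominal 5%. Peeking repeatedly pushes it much higher, even though no individual test was calculated incorrectly.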

As a real-world example, many news outlets reported on problems found in studies by Cornell University researcher Brian Wansink, who announced his retirement shortly after JAMA, the flagship journal of the American Medical Association, and two affiliated journals retracted six of his papers in 2018.

Stephanie Lee, a science reporter at BuzzFeed News, described emails between Wansink and his collaborators at the Cornell Food and Brand Lab showing they “discussed and even joked about exhaustively mining datasets for impressive-looking results.”

3. Researchers face intense pressure to produce statistically significant results.

Researchers build their careers largely on how often their work is published and the prestige of the academic journals that publish it. “‘Publish or perish’ is tattooed on the mind of every academic,” Ione Fine, a psychology professor at the University of Washington, and Alicia Shen, a doctoral student there, write in a March 2018 article in The Conversation. “Like it or loathe it, publishing in high-profile journals is the fast track to positions in prestigious universities with illustrious colleagues and lavish resources, celebrated awards and plentiful grant funding.”

Because academic journals often prioritize research with statistically significant results, researchers often focus their efforts in that direction. Multiple studies suggest journals are more likely to publish papers featuring statistically significant findings.

For example, a paper published in Science in 2014 finds “a strong relationship between the results of a study and whether it was published.” Of the 221 papers examined, about half were published. Only 20% of studies without statistically significant results were published.

The authors learned that most studies without statistically significant findings weren’t even written up, sometimes because researchers, predicting their results would not be published, abandoned their work.

“When researchers fail to find a statistically significant result, it’s often treated as exactly that — a failure,” science writer Jon Brock writes in a 2019 article for Nature Index. “Non-significant results are difficult to publish in scientific journals and, as a result, researchers often choose not to submit them for publication.”

4. Many people — even researchers — make errors when trying to explain statistical significance to a lay audience.

“With its many technicalities, significance testing is not inherently ready for public consumption,” Jeffrey Spence and David Stanley, associate professors of psychology at the University of Guelph in Canada, write in the journal Frontiers in Psychology. “Properly understanding technically correct definitions is challenging even for trained researchers, as it is well documented that statistical significance is frequently misunderstood and misinterpreted by researchers who rely on it.”

Spence and Stanley point out three common misinterpretations, which journalists should look out for and avoid. Statistical significance, they note, does not mean:

  • “There is a low probability that the result was due to chance.”
  • “There is less than a 5% chance that the null hypothesis is true.”
  • “There is a 95% chance of finding the same result in a replication.”
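The second misinterpretation can be checked with simple arithmetic. The sketch below is an illustration, not something Spence and Stanley provide. It asks what share of statistically significant results come from studies where the null hypothesis is actually true, given two assumed inputs: how often the null is true among the hypotheses being tested, and how often a study detects a real effect (its statistical power). The numbers used are invented.

```python
def share_of_true_nulls_among_significant(prior_null, power, alpha=0.05):
    """Fraction of significant results for which the null hypothesis is true.

    prior_null: assumed share of tested hypotheses where the null is true
    power: assumed chance a study detects an effect that really exists
    alpha: significance threshold
    """
    false_positives = alpha * prior_null
    true_positives = power * (1.0 - prior_null)
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    # Assumed scenario: 90% of tested ideas are wrong, studies have 50% power.
    share = share_of_true_nulls_among_significant(prior_null=0.9, power=0.5)
    print(f"Share of significant results with a true null: {share:.0%}")
```

Under these assumptions, nearly half of the significant results involve a true null hypothesis, far more than 5%. The figure changes with the inputs, which is exactly why a p-value on its own cannot say how likely the null hypothesis is to be true.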

Spence and Stanley offer two suggestions for describing statistical significance. Although both are concise, many journalists (or their editors) might consider them too vague to use in news stories.

If all study results are significant, Spence and Stanley suggest writing either:

  • “All of the results were statistically significant (indicating that the true effects may not be zero).”
  • “All of the results were statistically significant (which suggests that there is reason to doubt that the true effects are zero).”

5. The academic community has debated for years whether to stop checking for and reporting statistical significance.

Scholars for decades have written about the problems associated with determining and reporting statistical significance. In 2019, the academic journal Nature published a letter, signed by more than 800 researchers and other professionals from fields that rely on statistical modelling, that called “for the entire concept of statistical significance to be abandoned.”

The same year, The American Statistician, a journal of the American Statistical Association, published “Statistical Inference in the 21st Century: A World Beyond p < 0.05,” a special edition featuring 43 papers dedicated to the issue. Many propose alternatives to using p-values and designated thresholds to test for statistical significance.

“As we venture down this path, we will begin to see fewer false alarms, fewer overlooked discoveries, and the development of more customized statistical strategies,” three researchers write in an editorial that appears on the front page of the issue. “Researchers will be free to communicate all their findings in all their glorious uncertainty, knowing their work is to be judged by the quality and effective communication of their science, and not by their p-values.”

John Ioannidis, a Stanford Medicine professor and vice president of the Association of American Physicians, has argued against ditching the process. P-values and statistical significance can provide valuable information when used and interpreted correctly, he writes in a 2019 letter published in JAMA. He acknowledges improvements are needed, such as better and “less gameable filters” for gauging significance. He also notes “the statistical numeracy of the scientific workforce requires improvement.”

Professors Deborah Mayo of Virginia Tech and David Hand of Imperial College London assert that “recent recommendations to replace, abandon, or retire statistical significance undermine a central function of statistics in science.” Researchers need, instead, to call out misuse and avoid it, they write in their May 2022 paper, “Statistical Significance and Its Critics: Practicing Damaging Science, or Damaging Scientific Practice?”

“The fact that a tool can be misunderstood and misused is not a sufficient justification for discarding that tool,” they write.

Need more help interpreting research? Check out the “Know Your Research” section of our website. We provide tips and explainers on topics such as the peer-review process, covering scientific consensus and avoiding mistakes in news headlines about health and medical research.

The Journalist’s Resource would like to thank Ivan Oransky, who teaches medical journalism at New York University’s Carter Journalism Institute and is co-founder of Retraction Watch, and Regina Nuzzo, a science journalist and statistics professor at Gallaudet University, for reviewing this tip sheet and offering helpful feedback.

About The Author

Denise-Marie Ordway

How does diabetes start? A new study suggests it begins in the gut

By studying the different bacteria and viruses in the bowels of diabetes patients, we could develop new treatments

By Matthew Rozsa

Diabetes has been a well-known condition since ancient times, described 1500 years before Christ was born, in the Egyptian medical text the Ebers papyrus. Modern doctors thought they knew how it manifested: when the pancreas struggles to process insulin and therefore your blood glucose (or blood sugar) becomes too high.

But over the last several years, scientists have started looking at the gut microbiome (the menagerie of bacteria, fungi, viruses and other microbes that live in our bowels and impact our health) for clues to how diabetes develops. A recent study in the journal Nature Medicine reports that diabetes could be due to changes in the microbiome, and that the changes a body goes through when it develops the condition may even start there.

As diabetes rates continue to rise in the United States, the medical field continues to seek effective treatments for the debilitating disease. Patients with type 2 diabetes, for example, experience symptoms such as fatigue, thirst, frequent urination, tingling sensations and regular infections. If left untreated, patients with type 2 diabetes can suffer from kidney damage, eye damage, heart attacks and strokes.

For decades, doctors have treated conditions like type 2 diabetes with medicines like metformin and SGLT2 inhibitors or through insulin injections.

What this surprising new research suggests is that treatment for diabetes could extend beyond the blood or pancreas, instead focusing on the microorganisms that reside in our guts.

"Although our study is mainly hypothesis-generating and cannot be seen as direct evidence for causal inference, our detailed analysis (including many sensitivity analyses) supports that our findings of microbial features of diabetes are unlikely due to reverse causality — that is, the pathological changes of diabetes cause microbial changes," Dr. Daniel (Dong) Wang told Salon.

The study he co-authored includes the largest and most diverse analysis of gut microbiomes ever created for people with type 2 diabetes (T2D), prediabetes and healthy glucose status. In the process, the researchers from Brigham and Women’s Hospital, the Broad Institute of MIT and Harvard, and Harvard T.H. Chan School of Public Health discovered both specific viruses and genetic variants in specific bacteria which correspond to T2D risk.

"Therefore, we are confident that the observed changes in the gut microbiome happen first and that diabetes develops later, not the other way around," Wang said. "However, future prospective or interventional studies are needed to prove causation firmly."

That said, there are some things which the researchers determined for sure. First, there are 19 phylogenetically diverse species of microorganisms that live in human guts which are associated with T2D, including enriched Clostridium bolteae and depleted Butyrivibrio crossotus. Additionally, "our study identifies within-species phylogenetic diversity for strains of 27 species that explain inter-individual differences in T2D risk, such as Eubacterium rectale," the authors explain.

Perhaps the paper's most important contribution to understanding T2D is that it firmly establishes that different species of microbes are linked to varying levels of diabetes risk. Even though scientists have yet to establish exactly why these microbes are associated with diabetes, simply knowing for sure that this is the case is an important first step. Think of it as a sort of police lineup: it'll be easier to determine what causes diabetes in the future if we know what the potential "criminals" look like.

"Different microbial strains, even within the same microbial species, are associated with different diabetes risks," Wang said. "The differences in the association can be explained by different genetic makeups and, therefore, functions of the strains."

When medical researchers apply the findings from the latest study, Wang believes they can use microbial features as biomarkers in order to help patients predict their risk for developing diabetes. That is only the beginning.

"If future mechanistic studies can confirm specific microbial strains are causally related to diabetes risk, we could develop intervention measures, such as dietary supplements or pharmacological approaches that target the specific microbial strains to prevent and treat diabetes," Wang said.

The last few years have seen an explosion of research into humans' gut microbiomes. Scientists have learned about the gut-brain axis, in which the gut biome helps control our cravings and may also be linked to neurological diseases. Technologies like fecal transplants are being considered to treat conditions like ulcerative colitis and, yes, diabetes. In addition to helping us fight diseases and decide what we eat, gut microbiota are also believed to play an essential role in helping humans digest food that their digestive tracts cannot process on their own.

"Evidence suggests that gut microbes and their human host share much of the same metabolic machinery, with bacteria influencing which dietary components and how much energy their human host is able to extract from its diet," the Institute of Medicine (US) Food Forum said in 2013. "What we eat and drink, in turn, influences the microbiome, with significant implications for disease risk. This growing understanding of the role of diet in microbiome-human interactions is driving interest and investment in probiotic and prebiotic food products as a means to help build and maintain health."

In this respect, Wang suggests that the new Nature Medicine study may truly pioneer new ways of understanding humans' gut microbes.

"It offers the most comprehensive evidence to date of the gut microbiome’s involvement in the pathogenesis of T2D from the population study perspective," Wang said. "These results lay the groundwork for future mechanistic studies. Additionally, we provide a more nuanced understanding of the biology and pathogenicity of microorganisms by studying the genetic makeup and characteristics of microbial strains, bringing us one step closer to causality. Our findings provide evidence for the gut microbiome’s potential functional role in the pathogenesis of T2D, and highlight the identification of taxonomic and functional biomarkers for future diagnostic applications."

Matthew Rozsa is a staff writer at Salon. He received a Master's Degree in History from Rutgers-Newark in 2012 and was awarded a science journalism fellowship from the Metcalf Institute in 2022.

A Critical Analysis of All-Cause Deaths during COVID-19 Vaccination in an Italian Province

Alessandria, M.; Malatesta, G.M.; Berrino, F.; Donzelli, A. A Critical Analysis of All-Cause Deaths during COVID-19 Vaccination in an Italian Province. Microorganisms 2024, 12, 1343. https://doi.org/10.3390/microorganisms12071343

COMMENTS

  1. Guide to critical thinking, research, data and theory: Overview for

    In daily journalism, we are often content to quote relevant sources or officials, and let them do the "explaining." ... Hypothesis: A conjectured relationship between two phenomena. Like laws, hypotheses can be causal ("I surmise that 'A' causes 'B' ") and non-causal ("I surmise that 'A' and 'B' are caused by 'C ...

  2. The journalistic method: Five principles for blending analysis and

    Forming a hypothesis. It's healthier to admit to yourself that you have one than to go into a story with the idea that you have no presuppositions at all - that would be impossible. ... Nicholas Lemann is the Joseph Pulitzer II and Edith Pulitzer Moore Professor of Journalism and Dean Emeritus at the Columbia University Graduate School of ...

  3. Investigative journalism: Hypothesis-based investigations

    Investigative journalism, like science, is about coming up with hypotheses, testing them, and trying to prove them. The best examples of investigative journalism are rooted in a hypothesis that allows them to work out what happened, how it happened, and why it happened. In investigative journalism, a hypothesis is a proposed explanation that ...

  4. The Riemann Hypothesis, the Biggest Problem in Mathematics, Is a Step

    The Riemann hypothesis is the most important open question in number theory—if not all of mathematics. It has occupied experts for more than 160 years. ... On supporting science journalism. If ...

  5. Advancing Journalism and Communication Research: New Concepts, Theories

    Most importantly, globalization has exposed the Western bias of much of the field's theoretical and conceptual work (Gunaratne, 2010; Willems, 2014), which privileges and universalizes Western media, journalism practices, and politics. With Western-centrism reproduced over generations of scholars, the inequality between "the West" and "the rest" has divided our disciplinary viewpoint ...

  6. Theories of Journalism

    Theories of journalism, as Löffelholz (2008) observed, come from diverse perspectives, beginning with early normative concerns leading to more empirical analysis of how journalists work. Adding a systems perspective attempted to position the individual as part of a larger system (e.g., Rühl, 1969) and to understand news as a cultural product.

  7. Confirmation bias in journalism: What it is and strategies to avoid it

    In journalism, confirmation bias can influence a reporter's assessment of whether a story is worth pitching and an editor's decision to greenlight a story pitch. If the pitch is accepted, it can determine the questions the reporter decides to ask — or declines to ask — while investigating the story. It can affect an editor's choice to ...

  8. Digital Journalism and Epistemologies of News Production

    Data journalism has mainstreamed hypothesis-testing and data-driven logics within journalism, although epistemological tensions still emerge when traditional journalists work alongside their more data-oriented counterparts (Borges-Rey, 2020). However, although the production of data journalism marks an epistemological shift from traditional ...

  9. Full article: Clarifying Journalism's Quantitative Turn

    Journalism appears to be taking, as Petre (2013) puts it, "a quantitative turn." This wave of quantitatively oriented journalism has deep democratic roots; various forms of it are tied to open government advocacy (Parasie and Dagiral 2013) and the public-service tradition of investigative journalism (Cox 2000).

  10. Building a hypothesis for your next data story

    Introduction. 2020 pulled data journalism in two drastically different directions. On the one hand, the Black Lives Matter movement forced the data journalism community to question equity in the field: who is data journalism produced by, for and about? On the other hand, the pandemic offered a plethora of opportunities to channel the firehouse of coronavirus into shiny, often impersonal ...

  11. Investigative journalism: Handling data and gathering evidence

    In Part 5 of our series on investigative journalism, we look at different methods of gathering evidence. More than 130 countries have adopted laws encouraging the sharing of information and 76 of them have become part of the Open Government Partnership. This has improved the flow of data from governments to the media and onto the internet.

  12. Investigative journalism: How to develop and manage your sources

    Investigative journalism, Part 1: How to decide what to investigate. Investigative journalism, Part 2: Hypothesis-based investigations. Investigative journalism in the digital age. Journalists should make every effort to establish why a source is requesting anonymity.

  13. What Is Investigative Journalism?

    Veteran trainers note that the best investigative journalism employs a careful methodology, with heavy reliance on primary sources, forming and testing a hypothesis, and rigorous fact-checking. The dictionary definition of "investigation" is "systematic inquiry," which typically cannot be done in a day or two; a thorough inquiry ...

  14. Testing the inadvertency hypothesis: Incidental news exposure and

    The inadvertency hypothesis predicts that people encounter political difference in social media spaces not by design, but rather as a by-product of social media's affordances and cultural logics. ... Journalism 19(5): 632-648. Crossref. ISI. Google Scholar. Bakshy E, Messing S, Adamic L (2015) Exposure to ideologically diverse news and ...

  15. Journalism & Fact-Checking : Hypothesis

    Event: Journalism & Fact-checking at I Annotate 2017. Fact-checkers and journalists came together at the fifth-annual I Annotate conference to talk about how annotation is enriching the ways we produce and consume both news and facts.

  16. Journalists' Use of Social Media to Infer Public Opinion: The Citizens

    Hypothesis 4 (the more an individual self-censors, the less likely they are to positively perceive journalistic use of social media data) is not consistently supported by our data. Notably, the regression using our factor has a higher adjusted R² and the largest difference from Steps 1 to 2 (about 4.4 percentage points).

  17. Knowledge gap hypothesis

    The knowledge gap hypothesis is a mass communication theory based on how a member in society processes information from mass media differently based on education level and socioeconomic status (SES). The gap in knowledge exists because a member of society with higher socioeconomic status has access to higher education and technology whereas a member of society who has a lower socioeconomic ...

  18. Hypothesis for Publishers : Hypothesis

    Hypothesis works at the center of the open annotation community. We created the space, built the first framework, and drove the standards that now define it. In 2015, we launched the Annotating All Knowledge coalition that now counts over 70 of the largest scholarly publishers and platforms. We also host I Annotate, the industry conference on ...

  19. 5 things journalists need to know about statistical significance

    Statistical significance, they note, does not mean: "There is a low probability that the result was due to chance." "There is less than a 5% chance that the null hypothesis is true." "There is a 95% chance of finding the same result in a replication." Spence and Stanley offer two suggestions for describing statistical significance.

  20. Story-based inquiry: a manual for investigative journalists

    The story is based on a necessary minimum of information and can be very short. The story is based on the obtainable maximum of information, and can be very long. The declarations of sources can substitute for documentation. The reportage requires documentation to support or deny the declarations of sources.

  21. Full article: How Do Investigative Journalists Initiate Their Stories

    Introduction. Investigative journalism has attracted the attention of both professionals and academic researchers in recent years (Carson 2020, 5). In a context of widespread transformation in the media field, investigative journalism is perceived both as a way for journalism to survive (Carson 2020; Hamilton 2016; Knobel 2018) and as an endangered species ...

  22. Revisiting the Knowledge Gap Hypothesis: Education, Motivation, and

    Abstract. The findings of this study support the significance of motivational variables and media use in modifying the relationship between education and knowledge acquisition. People's behavioral involvement in the 1992 presidential campaign influenced the knowledge gap between education groups such that the gap was significantly smaller among ...

  23. How does diabetes start? A new study suggests it begins in the gut

    "Although our study is mainly hypothesis-generating and cannot be seen as direct evidence for causal inference, our detailed analysis (including many sensitivity analyses) supports that our ...

  24. Microorganisms

    Immortal time bias (ITB) is common in cohort studies and distorts the association estimates between the treated and untreated. We used data from an Italian study on COVID-19 vaccine effectiveness, with a large cohort, long follow-up, and adjustment for confounding factors, affected by ITB, with the aim to verify the real impact of the vaccination campaign by comparing the risk of all-cause ...

  25. Mass Media Flow and Differential Distribution of Politically Disputed

    These changes address epistemological and micro-level critiques of the knowledge gap hypothesis while extending it in new directions. Evidence supported the hypotheses that ideology would be a better predictor than education of beliefs about the existence of global warming, but not its causes, and that the "belief gap" between conservatives ...