Fighting the Fake: A Forensic Linguistic Analysis to Fake News Detection

International Journal for the Semiotics of Law - Revue internationale de Sémiotique juridique, Apr 2022

Fake news has been the focus of debate, especially since the election of Donald Trump (2016), and remains a topic of concern in democratic countries worldwide, given (a) the threat it poses to democratic systems and (b) the difficulty of detecting it. Despite the deployment of sophisticated computational systems to identify fake news, as well as the streamlining of fact-checking methods, appropriate fake news detection mechanisms have not yet been found. In fact, technological approaches are likely to be inefficient, given that fake news is based mostly on partisanship and identity politics, and not necessarily on outright deception. However, as disinformation is inherently expressed linguistically, it offers a privileged space for forensic linguistic analysis. This article builds upon a forensic linguistic analysis of fake news pieces published in English and in Portuguese, which have been collected since 2019 from acknowledged fake news outlets. The preliminary empirical analysis reveals that fake news pieces employ particular linguistic features, e.g. at the levels of typography, orthography and spelling, and morphosyntax. The systematic identification of these features, which will allow the mapping of linguistic resources and patterns used in those contexts, contributes to scholarship, not only by enabling a streamlined development of computational detection systems, but more importantly by permitting the forensic linguistics expert to assist criminal investigations and give evidence in court.

Int J Semiot Law
https://doi.org/10.1007/s11196-022-09901-w

Rui Sousa-Silva (1, 2)

Accepted: 6 April 2022
© The Author(s), under exclusive licence to Springer Nature B.V. 2022
Keywords: Forensic linguistics · Cybercrime · Language crimes · Fake news · Fact-checking · Disinformation

* Rui Sousa-Silva

1 Faculty of Arts and Humanities, CLUP - Centre for Linguistics of the University of Porto, University of Porto, Porto, Portugal
2 Faculdade de Letras, Universidade do Porto, Via Panorâmica, s/n, 4150-564 Porto, Portugal

1 Introduction

Over the last two years, the world has faced the serious COVID-19 pandemic, which resulted in millions of deaths, long-lasting global lockdowns, and limitations on free movement due to, e.g., travel bans. At the time of writing, the world is also witnessing a war triggered by Russia’s invasion of Ukraine, which has given rise to polarized views and positions. Alongside this literal health pandemic and this war, the world has also been struck by two other inter-related, technologically motivated, metaphorical wars and pandemics: cybercrime and disinformation. Although neither of these phenomena emerged during the COVID-19 pandemic, they are known to have skyrocketed during it. In the UK, for instance, not only did the number of cybercriminal activities increase during the pandemic, but it was also especially high during the period in which lockdown policies and measures were strictest [7]. Among the most common categories of cybercrime that were subject to this increase, the authors found fraud related to online shopping and auctions, and hacking of social media and email, targeting, in particular, individual victims. At the same time, the uncertainty underlying COVID-19 treatments and public policies, as well as the polarized views of the Russia–Ukraine war, enabled the exploitation of platforms such as those offered by online and social media to spread disinformation and conspiracy theories, which are among the main threats to public health [19] and, ultimately, to democracy [2].
However, neither cybercrime nor disinformation is a new phenomenon that emerged as a result of the COVID-19 pandemic or the Russia–Ukraine war. Indeed, the technological developments of the last decades have brought along new communication possibilities across the world and, with them, new cybercriminal activities and structured disinformation campaigns. These activities, despite taking place online, are neither merely virtual nor unrelated to offline crime; cybercrime has been found to mimic and adapt reality [40] by using technology-enabled (online) possibilities. Cybercriminals thus tend to use sophisticated technological methods and techniques to perform their criminal activities anonymously, including: cybertrespass, e.g. unauthorized access to passwords, identity theft, or destruction of sensitive information; cyberporn, e.g. illegal use of pornographic content, unauthorized use of nudity, sexual exploitation, extortion and ‘revenge porn’ (in which content such as nudes is disseminated publicly without permission); cyberviolence, e.g. defamation, cyberthreats, dissemination of dangerous/harmful content, online harassment, cyber-bullying/cyber-stalking and hate speech, often leading to physical/emotional trauma or death; and cyber-deception/theft, such as illegal access to information/materials online, and theft of intellectual property online/digital piracy [74]. The latter group, in addition, includes new cybercriminal activities, such as ‘doxing’ (in which someone else’s email address and real name are revealed online against their will) and fake news, disinformation and misinformation activities, which have been considered to pose threats to democracy worldwide [75].

The recent technological developments did not only have an impact on the more evident cybercriminal activities. Journalism, too, has changed dramatically lately, largely because the new information and communication technologies, and in particular the sharing possibilities offered by social media, now allow information to be disseminated immediately and widely without much effort. Alongside these developments, participatory journalism principles and policies, which have been promoted in schools of communication throughout the world over the last decades, gave ordinary citizens the power to collect, report, analyse and, in particular, disseminate news and information, actively and in a timely manner [5]. Of course, participatory (or citizen) journalism is not nefarious in itself, rather the opposite; its advantages are obvious. For society in general, the active involvement of ordinary citizens in news-gathering results in easier and faster access to newsworthy information; for media conglomerates, involving citizens in media communication meant that they could release part of their human (and consequently financial) resources, and thus significantly improve their financial sustainability prospects and, in turn, eventually guarantee the independence of media outlets; for citizens, the main advantage of participatory journalism is that it gives them an opportunity to have a say on what information they value most. Importantly, however, participatory journalism does not come without disadvantages, (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007/s11196-022-09901-w.pdf
Article home page: https://link.springer.com/article/10.1007/s11196-022-09901-w

Sousa-Silva, Rui. Fighting the Fake: A Forensic Linguistic Analysis to Fake News Detection, International Journal for the Semiotics of Law - Revue internationale de Sémiotique juridique, 2022, pp. 1-25, DOI: 10.1007/s11196-022-09901-w