The limitations of irony detection in Dutch social media

Language Resources and Evaluation, Jul 2023

In this paper, we explore the feasibility of irony detection in Dutch social media. To this end, we investigate both transformer models with embedding representations and traditional machine learning classifiers with extensive feature sets. Our feature-based methodology draws on a variety of information sources, including lexical, semantic, syntactic and sentiment features, as well as two new data-driven features to model common sense. Based on patterns in the syntactic structure of tweets, we aim to model the presence of contrasting sentiments, a phenomenon that is known to be indicative of verbal irony and sarcasm. Feature selection and voting ensemble techniques were implemented to enhance classification performance. The final systems reach F1-scores of up to 0.79, which is a promising result for a task as difficult as irony detection. Besides a quantitative analysis, this paper also describes a thorough qualitative analysis of the system output. Although lexical cues appear to be very important for expressing irony, our analysis also revealed the need for more advanced modeling of common-sense knowledge to detect more subtle examples of irony.

https://doi.org/10.1007/s10579-023-09656-1

ORIGINAL PAPER

Aaron Maladry · Els Lefever · Cynthia Van Hee · Véronique Hoste

LT3, Ghent University, Groot-Brittanniëlaan 45, Ghent 9000, Belgium

Accepted: 27 March 2023
© The Author(s) 2023

Keywords Irony detection · Sarcasm detection · Implicit sentiment modeling · Computational linguistics · Natural language processing · Machine learning · Neural networks · Language models · Social media
1 Introduction

Although direct, clear and unambiguous language is highly praised in scientific literature, people are not always straightforward in their day-to-day communication and social interactions. People use figurative language for all kinds of creative purposes, be it to nuance or emphasize what they are saying, disguise their intentions or for sheer fun. By willfully violating Grice’s conversational maxims (Grice, 1975) and providing literally wrong or useless information, people indicate that an utterance should be interpreted figuratively because there is a different implicit meaning. Figurative language is especially common on social media, where people are at liberty to express their thoughts as they like, for instance by using figurative speech. Although irony is a well-known and common example of figurative language, recognizing and understanding it remains a complex task. Hence, it has been a popular research topic in the domains of linguistics, psycho-linguistics and, over the past decade, also in natural language processing (del Pilar Salas-Zárate et al., 2020).

When people say something ironically, they do not intend to convey the literal meaning of an utterance, but rather something else (usually the exact opposite). Observe the following example:

Example 1 Aaah, don’t you just love that awesome feeling when you stub your stupid toe against the table :)

The lexical cues at the start of the utterance (“aaah, don’t you just love”) already give away that the tweet is intended ironically. However, even without these cues, one can still tell that nobody could be genuinely happy about stubbing their toe. This is a typical case of verbal irony, where irony as a figure of speech is realized in text¹. In verbal irony, a person often expresses an exaggerated positive sentiment about an unpleasant or painful situation. This negative situation can be considered the “target” of the ironic evaluation; hence, from now on, we will use the term “irony targets” to refer to these situations. Such a contrast between the sentiment of an evaluation on the one hand, and the implied or underlying sentiment of the target on the other hand, is known to be an important indicator of irony (Riloff et al., 2013). In some cases, this sentiment contrast is clear in the text because the target is described with words that are inherently linked to a clear sentiment (such as “stupid” in this case). In many other cases, such explicit sentiment words are not necessarily present in the text and the reader needs common-sense knowledge to understand the implicit sentiment of the target in order to determine the sentiment contrast and consequently recognize the irony.

¹ Based on previous research (Wilson & Sperber, 2012; Sulis et al., 2016) we consider sarcasm to be a specification of verbal irony, where sarcasm has a stronger negative connotation and is intended to ridicule, insult or hurt someone.

As humans, we know which situations or events are pleasant or not because we likely experienced them ourselves. However, connecting this common-sense knowledge to a string of text is not trivial from a computational point of view, as language grows and continuously adapts to reflect our society or culture and because word connotations may be context-dependent. Combining strings into a longer sequence, for instance, can alter the meaning of its constituents: “walking your dog” generally has a positive connotation, while a longer target such as “walking your dog in the rain” becomes negative instead.

In the previous paragraphs, we explained how irony is verbalized from an intuitive human perspective. But to what extent can automatic systems detect irony in Dutch social media texts and which challenges still remain?
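The sentiment contrast between an evaluation and its target, as described above, can be illustrated with a small sketch. This is only a toy illustration and not the method used in the paper: the two lexicons, their entries and the function names are all hypothetical, and the target polarities stand in for the common-sense knowledge that a real system would have to learn from data. The longest matching target phrase wins, which mirrors the observation that “walking your dog in the rain” flips the connotation of “walking your dog”.

```python
import re

# Hypothetical toy lexicons, invented for this sketch (not from the paper).
# Explicit sentiment words that express an evaluation.
EVALUATION_POLARITY = {"love": 1, "awesome": 1, "great": 1, "hate": -1, "awful": -1}
# Prototypical (implicit) sentiment of situations, standing in for common sense.
TARGET_POLARITY = {
    "walking your dog": 1,
    "walking your dog in the rain": -1,
    "stubbing your toe": -1,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def evaluation_polarity(text):
    """Return +1, -1 or 0 from the summed polarity of explicit sentiment words."""
    score = sum(EVALUATION_POLARITY.get(token, 0) for token in tokenize(text))
    return (score > 0) - (score < 0)


def target_polarity(text):
    """Return the polarity of the longest known target phrase found in the text."""
    lowered = text.lower()
    matches = [phrase for phrase in TARGET_POLARITY if phrase in lowered]
    if not matches:
        return 0
    return TARGET_POLARITY[max(matches, key=len)]


def has_sentiment_contrast(text):
    """True when the evaluation and the target have opposite polarities."""
    return evaluation_polarity(text) * target_polarity(text) < 0


if __name__ == "__main__":
    for tweet in ("I just love walking your dog in the rain",
                  "I love walking your dog"):
        print(tweet, "->", has_sentiment_contrast(tweet))
```

A lookup of this kind only fires for phrases that happen to be listed, which is exactly why modeling implicit sentiment for unseen targets is the hard part of the task.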
Related research on irony detection almost exclusively focuses on English, but recently, the scope has extended to include more languages, such as French, German, Italian (Cignarella et al., 2020) and Arabic (Farha et al., 2022). However, it seems that the state of the art for low-resource languages such as Dutch still lags behind. Focusing on Dutch not only allows us to investigate to what extent methodologies for English can be ported to Dutch, but it also helps diversify the pool of researched languages and allows for more comparative future research.

To answer our main research question, we conducted an exhaustive set of experiments for Dutch irony detection using transformer-based architectures relying on text embeddings (Sect. 4) and SVM classifiers with a wide variety of features (Sect. 5). Both architectures are optimized and combined into an ensemble (Sect. 6). In addition, we present a novel approach to model implicit sentiment by detecting syntactic structures as irony targets and predicting their prototypical sentiment (Sect. 7). Besides a quantitative analysis, this paper also presents a thorough qualitative analysis of the output of the systems (Sect. 8). Through this manual evaluation, we gained insights into the performance, strengths and weaknesses of the different models. Finally, we summarize our find (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007/s10579-023-09656-1.pdf
Article home page: https://link.springer.com/article/10.1007/s10579-023-09656-1

Maladry, Aaron, Lefever, Els, Van Hee, Cynthia, Hoste, Véronique. The limitations of irony detection in Dutch social media, Language Resources and Evaluation, 2023, pp. 1-32, DOI: 10.1007/s10579-023-09656-1