The limitations of irony detection in Dutch social media
Language Resources and Evaluation
https://doi.org/10.1007/s10579-023-09656-1
ORIGINAL PAPER
Aaron Maladry1 · Els Lefever1 · Cynthia Van Hee1 · Véronique Hoste1
Accepted: 27 March 2023
© The Author(s) 2023
Abstract
In this paper, we explore the feasibility of irony detection in Dutch social media. To
this end, we investigate both transformer models with embedding representations
and traditional machine learning classifiers with extensive feature sets. Our
feature-based methodology draws on a variety of information sources, including lexical, semantic, syntactic and sentiment features, as well as two new data-driven
features to model common sense. Based on patterns in the syntactic structure of
tweets, we aim to model the presence of contrasting sentiments, a phenomenon that
is known to be indicative of verbal irony and sarcasm. Feature selection and
voting ensemble techniques were implemented to enhance the classification performance. The final systems reach F1-scores up to 0.79, which are promising results for
a task as difficult as irony detection. Besides a quantitative analysis, this paper also
describes a thorough qualitative analysis of the system output. Although lexical cues
appear to be very important for expressing irony, our analysis also revealed the need for
more advanced modeling of common-sense knowledge to detect more subtle examples of irony.
Keywords Irony detection · Sarcasm detection · Implicit sentiment modeling ·
Computational linguistics · Natural language processing · Machine learning · Neural
networks · Language models · Social media
* Aaron Maladry
Els Lefever
Cynthia Van Hee
Véronique Hoste
1 LT3, Ghent University, Groot‑Brittanniëlaan 45, Ghent 9000, Belgium
1 Introduction
Although direct, clear and unambiguous language is highly praised in scientific literature, people are not always straightforward in their day-to-day communication and
social interactions. People use figurative language for all kinds of creative purposes,
be it to nuance or emphasize what they are saying, disguise their intentions or for
sheer fun. By willfully violating Grice’s conversational maxims (Grice, 1975) and
providing literally wrong or useless information, people indicate that an utterance
should be interpreted figuratively because there is a different implicit meaning. Figurative language is especially common on social media, where people are at liberty to
express their thoughts as they like. Although
irony is a well-known and common example of figurative language, recognizing and
understanding it remains a complex task. Hence it has been a popular research topic
in the domains of linguistics, psycho-linguistics and, over the past decade, also in
natural language processing (del Pilar Salas-Zárate et al., 2020).
When people say something ironically, they do not intend to convey the literal
meaning of an utterance, but rather something else (usually the exact opposite).
Observe the following example:
Example 1 Aaah, don’t you just love that awesome feeling when you stub your stupid toe against the table :)
The lexical cues at the start of the utterance (“aaah, don’t you just love”) already
give away that the tweet is intended ironically. However, even without these cues,
one can still tell that nobody could be genuinely happy about stubbing their toe.
This is a typical case of verbal irony, where irony as a figure of speech is realized
in text.¹ In verbal irony, a person often expresses an exaggerated positive sentiment
about an unpleasant or painful situation. This negative situation can be considered
the “target” of the ironic evaluation, hence from now on, we will use the term “irony
targets” to refer to these situations. Such a contrast between the sentiment of an evaluation on the one hand, and the implied or underlying sentiment of the target on the
other hand, is known to be an important indicator of irony (Riloff et al., 2013).
¹ Based on previous research (Wilson & Sperber, 2012; Sulis et al., 2016), we consider sarcasm to be a more specific form of verbal irony, where sarcasm has a stronger negative connotation and is intended to ridicule, insult or hurt someone.
In some cases, this sentiment contrast is clear in the text because the target is
described with words that are inherently linked to a clear sentiment (such as “stupid”
in Example 1). In many other cases, such explicit sentiment words are not necessarily
present in the text and the reader needs common-sense knowledge to understand the
implicit sentiment of the target in order to determine the sentiment contrast and, consequently, recognize the irony. As humans, we know which situations or events are
pleasant or not because we have likely experienced them ourselves. However, connecting
this common-sense knowledge to a string of text is not trivial from a computational
point of view, as language grows and continuously adapts to reflect our society or
culture and because word connotations may be context-dependent. Combining
strings into a longer sequence, for instance, can alter the meaning of its constituents:
“walking your dog” generally has a positive connotation, while a longer target
such as “walking your dog in the rain” becomes negative instead.
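The sentiment contrast described above can be illustrated with a minimal sketch. It assumes two small, hand-made lexicons: one for explicit evaluation words and one for the prototypical sentiment of target phrases. The dictionaries `EVALUATION_POLARITY` and `TARGET_POLARITY`, their values and all function names are hypothetical and chosen for illustration only; they are not the features or resources used in this paper. Matching the longest known target phrase is one simple way to let “walking your dog in the rain” override the positive connotation of “walking your dog”.

```python
# Toy illustration only: both lexicons are hypothetical, hand-picked values,
# not resources or features from the paper.
EVALUATION_POLARITY = {"love": 1, "awesome": 1, "great": 1, "hate": -1, "awful": -1}

# Assumed prototypical (implicit) sentiment of candidate irony targets.
TARGET_POLARITY = {
    "walking your dog": 1,
    "walking your dog in the rain": -1,
    "stub your stupid toe": -1,
}


def evaluation_sentiment(text):
    """Sum the polarity of explicit evaluation words found in the text."""
    tokens = [tok.strip(".,!?:;()\"'") for tok in text.lower().split()]
    return sum(EVALUATION_POLARITY.get(tok, 0) for tok in tokens)


def target_sentiment(text):
    """Return the polarity of the longest known target phrase (0 if none matches)."""
    lowered = text.lower()
    matches = [phrase for phrase in TARGET_POLARITY if phrase in lowered]
    if not matches:
        return 0
    # Prefer the longest match so that a longer target overrides its sub-phrase.
    return TARGET_POLARITY[max(matches, key=len)]


def has_sentiment_contrast(text):
    """True when the evaluation and the target sentiment have opposite signs."""
    return evaluation_sentiment(text) * target_sentiment(text) < 0
```

With these assumed lexicons, `has_sentiment_contrast("I just love walking your dog in the rain")` flags a contrast, whereas the same sentence without “in the rain” does not. A fixed phrase list like this cannot generalize to unseen targets, which is why connecting common-sense knowledge to text is hard.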
In the previous paragraphs, we explained how irony is verbalized from an intuitive human perspective. But to what extent can automatic systems detect irony in
Dutch social media texts, and which challenges remain? Research on
irony detection has focused almost exclusively on English, but recently the scope has
extended to include more languages, such as French, German, Italian (Cignarella
et al., 2020) and Arabic (Farha et al., 2022). However, it seems that the state of the
art for low-resource languages such as Dutch still lags behind. Focusing on Dutch
not only allows us to investigate to what extent methodologies for English can be
ported to Dutch, but it also helps diversify the pool of researched languages and
allows for more comparative future research.
To answer our main research question, we conducted an exhaustive set of experiments for Dutch irony detection using transformer-based architectures relying
on text embeddings (Sect. 4) and SVM classifiers with a wide variety of features
(Sect. 5). Both architectures are optimized and combined into an ensemble (Sect. 6).
In addition, we present a novel approach to model implicit sentiment by detecting syntactic structures as irony targets and predicting their prototypical sentiment
(Sect. 7). Besides a quantitative analysis, this paper also presents a thorough qualitative analysis of the output of the systems (Sect. 8). Through this manual evaluation,
we gained insights on the performance, strengths and weaknesses of the different
models. Finally, we summarize our find (...truncated)