Detecting Patterns of Intimate Partner Violence Using Qualitative Analyses and Machine Learning Algorithms
Prevention Science
https://doi.org/10.1007/s11121-026-01923-1
Ying Zhang1,2 · Jun Fang3,4 · Ambika Krishnakumar2
Received: 24 February 2025 / Accepted: 13 April 2026
© The Author(s) 2026
Abstract
Intimate partner violence (IPV) survivors increasingly use social media platforms to share their experiences and to seek help
and support for their IPV-related concerns. IPV evidence extracted from social media platforms can provide valuable information and complement data obtained from conventional data sources (e.g., self-reports and interviews), thereby enhancing our
understanding of IPV victimization. This study addressed three research questions: (1) What range of IPV behaviors emerge
through qualitative coding? (2) To what extent do machine learning (ML) based text classifications yield results comparable
to qualitative coding of IPV behaviors? and (3) Do the conceptualizations that emerge from unsupervised ML capture additional behaviors or contextual information not identified through qualitative analyses? We analyzed 400 posts from women
on IPV-related online forums using qualitative content analysis and two ML approaches: supervised text classification and
unsupervised topic modeling (Latent Dirichlet Allocation). Supervised learning approaches, notably Random Forest and
Neural Networks, proved effective in classifying IPV subtypes with high accuracy (F1 scores of .62–.85). A comparison of findings from the qualitative and topic modeling approaches supported the presence of distinct characteristics of
IPV: physical and sexual violence, psychological/emotional abuse, and coercive control. The ML model revealed vocabulary
patterns consistent with relational and child-related contexts, temporal and frequency indicators of violence, references to
legal system engagement, and spatial contexts, elements that were less fully captured through thematic qualitative coding alone.
The consistency of findings across qualitative and ML approaches points to the potential of leveraging ML techniques when
analyzing qualitative data, thus enabling the development of timely and effective IPV interventions.
Keywords Intimate partner violence (IPV) · Machine learning (ML) · Text mining · Social media analysis
Introduction
Intimate partner violence (IPV) is a serious public health
problem that constitutes a violation of fundamental human
rights and threatens the health and well-being of individuals,
families, and communities (UN General Assembly, 1993).
Violence by an intimate partner or spouse in marital, dating,
cohabiting, or other romantic relationships can include a
range of abusive acts such as physical violence, sexual coercion, psychological abuse, and coercive controlling behaviors (Krug et al., 2002). Globally, 1 in 3 women aged
15–49 years report physical and/or sexual violence from a current or former intimate partner (Sardinha
et al., 2022). Although IPV is an issue of grave concern that
disproportionately impacts women, its negative effects cut
across race, age, socioeconomic status, gender, sexual identity, and
relationship status (Renner & Whitney, 2010). Despite the
health, social, economic, and quality of life costs for women,
many women victims underreport, misreport, or deny their
experiences of violence to service providers, medical professionals, and law enforcement officers because of fear of reprisal,
shame, stigma, and/or an inability to identify
their experiences as violence (Boethius & Åkerström, 2020;
Overstreet & Quinn, 2013).
* Ying Zhang
1 Department of Psychology, Clarkson University, Potsdam, NY, USA
2 Department of Human Development and Family Science, Syracuse University, Syracuse, NY, USA
3 Department of Psychology, Syracuse University, Syracuse, NY, USA
4 Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA
IPV researchers have mostly depended on information
from surveys, court records, shelters, medical records, and
police reports to document the prevalence and characteristics of IPV (Ellsberg et al., 2001). Data from these sources
have mostly been analyzed using qualitative and quantitative
approaches (Li et al., 2022; Sardinha et al., 2022; Testa
et al., 2011). Although qualitative methods yield particularly
rich information, the demands of qualitative data collection (e.g., focus groups,
open-ended survey responses, interview transcripts, observations), along with other extensive research procedures (e.g.,
transcribing, reading extensive and complex textual data,
developing meaningful codes and themes, assessing interrater reliability, and writing in-depth descriptive reports),
have proven to be labor-intensive for researchers (Sutton &
Simons, 2015; Thomas et al., 2014).
In recent years, with the advent of social media (e.g.,
Facebook, Twitter, Reddit), researchers have gained access to
the narratives posted by IPV survivors about their experiences of violence (Guo et al., 2023). In this paper, we intend
to use narrative data from social media posts by IPV survivors to first conduct a qualitative analysis and then apply
supervised and unsupervised ML algorithms (Arias &
Fabian, 2021; Chiong et al., 2021) (Supervised Text Classification Process) to access and process IPV narratives. A
“text mining” approach that utilizes natural language processing techniques to analyze the semantic, grammatical,
linguistic, and syntactic patterns of transcripts will be used
to prepare the data for analysis (Allahyari et al., 2017). Next,
after the best-performing classifier has been identified using the supervised text classification process, unsupervised ML (topic modeling with the Latent Dirichlet Allocation [LDA] algorithm)
will be used to automate the classification process. Finally, findings from both approaches (qualitative and unsupervised ML) will be compared to assess the
potential of machine learning approaches for analyzing narratives from social media posts.
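The first two steps of the workflow outlined above (preparing text for analysis, then scoring a classifier against qualitative codes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the example post, the labels, and the helper names `tokenize`, `bag_of_words`, and `f1_score` are all hypothetical, and only the standard library is used. The sketch shows how a post may be reduced to term counts and how predictions for a single IPV subtype could be compared with human codes using the F1 score.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption, not the authors').
STOP_WORDS = {"a", "an", "and", "he", "i", "me", "my", "the", "to", "was"}


def tokenize(post: str) -> list:
    """Lowercase a post, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(post: str) -> Counter:
    """Represent a post as term counts, a common input for text classifiers."""
    return Counter(tokenize(post))


def f1_score(human_codes: list, predictions: list) -> float:
    """F1 for one binary label (1 = subtype present); human codes are the reference."""
    pairs = list(zip(human_codes, predictions))
    tp = sum(1 for h, p in pairs if h == 1 and p == 1)  # true positives
    fp = sum(1 for h, p in pairs if h == 0 and p == 1)  # false positives
    fn = sum(1 for h, p in pairs if h == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative input only; not data from the study.
features = bag_of_words("He took my phone and he checks my phone every day")
human = [1, 1, 0, 1, 0]      # hypothetical qualitative codes for five posts
predicted = [1, 0, 0, 1, 1]  # hypothetical classifier output for the same posts
score = f1_score(human, predicted)
```

In practice, term counts such as `features` would be passed to a trained classifier and, separately, to a topic model, and library implementations would normally replace these hand-written helpers.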
Literature Review
The conceptualization of IPV used in this study was derived
from the extant IPV research literature and includes the definition outlined by the World Health Organization (WHO),
which characterizes physical violence as encompassing
behaviors like “slapping, hitting, kicking, and beating,” sexual violence that includes behaviors such as “forced sexual
intercourse and other forms of sexual coercion,” and psychological/emotional violence that includes actions such as
“insults, belittling, constant humiliation, intimidation (e.g.,
destroying things), threats of harm, and threats to take away
children” (World Health Organization, 2012). Our conceptualization was also
informed by IPV measures such as the Revised Conflict
Tactics Scale (CTS2; Straus et al., 1996).
In (...truncated)