Detecting Patterns of Intimate Partner Violence Using Qualitative Analyses and Machine Learning Algorithms



Prevention Science
https://doi.org/10.1007/s11121-026-01923-1

Ying Zhang¹,² · Jun Fang³,⁴ · Ambika Krishnakumar²

Received: 24 February 2025 / Accepted: 13 April 2026
© The Author(s) 2026

Abstract

Intimate partner violence (IPV) survivors increasingly use social media platforms to share their experiences and to seek help and support for their IPV-related concerns. IPV evidence extracted from social media platforms can provide valuable information and complement data obtained from conventional sources (e.g., self-reports and interviews), thereby enhancing our understanding of IPV victimization. This study addressed three research questions: (1) What range of IPV behaviors emerges through qualitative coding? (2) To what extent do machine learning (ML) based text classifications yield results comparable to qualitative coding of IPV behaviors? and (3) Do the conceptualizations that emerge from unsupervised ML capture additional behaviors or contextual information not identified through qualitative analyses? We analyzed 400 posts from women on IPV-related online forums using qualitative content analysis and two ML approaches: supervised text classification and unsupervised topic modeling (Latent Dirichlet Allocation). Supervised learning approaches, notably Random Forest and Neural Networks, proved effective in classifying IPV subtypes with high accuracy (F1 scores of .62–.85). A comparison of findings from the qualitative and topic modeling approaches supported the presence of distinct characteristics of IPV: physical and sexual violence, psychological/emotional abuse, and coercive control.
The ML model revealed vocabulary patterns consistent with relational and child-related contexts, temporal and frequency indicators of violence, references to legal system engagement, and spatial contexts, elements that were less fully captured through thematic qualitative coding alone. The consistency of findings across qualitative and ML approaches points to the potential of leveraging ML techniques when analyzing qualitative data, thus enabling the development of timely and effective IPV interventions.

Keywords
Intimate partner violence (IPV) · Machine learning (ML) · Text mining · Social media analysis

* Ying Zhang
¹ Department of Psychology, Clarkson University, Potsdam, NY, USA
² Department of Human Development and Family Science, Syracuse University, Syracuse, NY, USA
³ Department of Psychology, Syracuse University, Syracuse, NY, USA
⁴ Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, USA

Introduction

Intimate partner violence (IPV) is a serious public health problem that constitutes a violation of fundamental human rights and threatens the health and well-being of individuals, families, and communities (UN General Assembly, 1993). Violence by an intimate partner or spouse in marital, dating, cohabiting, or other romantic relationships can include a range of abusive acts such as physical violence, sexual coercion, psychological abuse, and coercive controlling behaviors (Krug et al., 2002). Globally, 1 in 3 women between the ages of 15 and 49 years report physical and/or sexual violence from a current or former intimate partner (Sardinha et al., 2022). Although IPV is an issue of grave concern that disproportionately impacts women, its negative effects cut across race, age, socioeconomic status, gender, sexual identity, and relationship status (Renner & Whitney, 2010).
Despite the health, social, economic, and quality-of-life costs for women, many women victims underreport, misreport, or deny their experiences of violence to service providers, medical professionals, and law enforcement officers because of fear of reprisal, shame, stigma, and/or an inability to identify their experiences as violence (Boethius & Åkerström, 2020; Overstreet & Quinn, 2013).

IPV researchers have mostly depended on information from surveys, court records, shelters, medical records, and police reports to document the prevalence and characteristics of IPV (Ellsberg et al., 2001). Data from these sources have mostly been analyzed using qualitative and quantitative approaches (Li et al., 2022; Sardinha et al., 2022; Testa et al., 2011). Although qualitative methods provide rich information, qualitative data collection (e.g., focus groups, open-ended survey responses, interview transcripts, observations) and the accompanying research procedures (e.g., transcribing, reading extensive and complex textual data, developing meaningful codes and themes, assessing interrater reliability, and writing in-depth descriptive reports) have proven to be labor-intensive for researchers (Sutton & Simons, 2015; Thomas et al., 2014). In recent years, with the advent of social media (e.g., Facebook, Twitter, Reddit), researchers have gained access to the narratives posted by IPV survivors about their experiences of violence (Guo et al., 2023). In this paper, we use narrative data from social media posts by IPV survivors, first conducting a qualitative analysis and then applying supervised and unsupervised ML algorithms (Arias & Fabian, 2021; Chiong et al., 2021) (Supervised Text Classification Process) to access and process IPV narratives.
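Agreement between a supervised classifier and qualitative coding is commonly summarized per category with precision, recall, and F1, the metric reported in the abstract. The following is a minimal sketch of that calculation, not the authors' evaluation code. The label names and the six toy posts are hypothetical, and the sketch assumes each post carries exactly one code, which may not hold for real IPV narratives.

```python
from collections import Counter


def per_label_f1(coded, predicted):
    """Return {label: (precision, recall, f1)} for single-label predictions.

    `coded` holds the qualitative (human) codes and `predicted` the
    classifier output for the same posts, in the same order.
    """
    if len(coded) != len(predicted):
        raise ValueError("coded and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(coded, predicted):
        if truth == guess:
            tp[truth] += 1   # classifier agrees with the human code
        else:
            fp[guess] += 1   # label assigned where the coder chose another
            fn[truth] += 1   # human code the classifier missed

    scores = {}
    for label in sorted(set(coded) | set(predicted)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical toy data: six posts, three illustrative categories.
human_codes = ["physical", "physical", "psychological",
               "coercive_control", "psychological", "physical"]
model_codes = ["physical", "psychological", "psychological",
               "coercive_control", "psychological", "physical"]

scores = per_label_f1(human_codes, model_codes)
for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting F1 per category rather than a single overall accuracy shows which IPV subtypes a classifier recovers well and which it confuses, which matters when categories are unevenly represented in the coded posts.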
A “text mining” approach that uses natural language processing techniques to analyze the semantic, grammatical, linguistic, and syntactic patterns of transcripts will be used to prepare the data for analysis (Allahyari et al., 2017). Next, after identifying the best-performing classifier using the supervised text classification process, unsupervised ML (topic modeling with the Latent Dirichlet Allocation (LDA) algorithm) will be used to automate the classification process. Finally, findings from both approaches (qualitative and unsupervised ML) will be compared to assess the potential of machine learning approaches to analyze narratives from social media posts.

Literature Review

The conceptualization of IPV used in this study was derived from the extant IPV research literature and includes the definition outlined by the World Health Organization (WHO), which characterizes physical violence as encompassing behaviors like “slapping, hitting, kicking, and beating,” sexual violence as including behaviors such as “forced sexual intercourse and other forms of sexual coercion,” and psychological/emotional violence as including actions such as “insults, belittling, constant humiliation, intimidation (e.g., destroying things), threats of harm, and threats to take away children” (World Health Organization, 2012). We were also informed by various measures used to assess IPV, such as the Revised Conflict Tactics Scale (CTS2) (Straus et al., 1996). In (...truncated)