Dual-channel feature fusion network for sheep diseases question classification


RESEARCH ARTICLE

Gulizada Haisa ☯ *, Gulimila Kezierbieke ☯

College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi, China

☯ These authors contributed equally to this work.
*

OPEN ACCESS

Citation: Haisa G, Kezierbieke G (2026) Dual-channel feature fusion network for sheep diseases question classification. PLoS One 21(3): e0343990. https://doi.org/10.1371/journal.pone.0343990

Editor: Yasin ALTAY, Eskisehir Osmangazi University: Eskisehir Osmangazi Universitesi, TÜRKIYE

Received: October 7, 2025

Accepted: February 14, 2026

Published: March 30, 2026

Copyright: © 2026 Haisa, Kezierbieke. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data availability statement: Data cannot be shared publicly because the data are part of an ongoing veterinary question-answering research project and the complete dataset is still being curated. Data are available from the research team at Xinjiang Agricultural University (contact via ) for researchers who meet the criteria for access to confidential data.

Funding: This study was financially supported by the Xinjian Tianchi Elite Project in the form of a grant awarded to GH (6661045/2225ZZQRCXM). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding was received for this study.

Competing interests: Not applicable.

PLOS One | https://doi.org/10.1371/journal.pone.0343990 March 30, 2026 1 / 19

Abstract

To address the challenges of feature sparsity, semantic ambiguity, and insufficient feature extraction in sheep disease question classification, this paper proposes a novel model named Dual-Channel Feature Fusion Network for Sheep Diseases Question Classification (DFF-SDQC). The model leverages the CINO pre-trained model to generate dynamic word embeddings, thereby enriching semantic representations. Subsequently, global textual features are captured through BiLSTM, while deeper local contextual features are extracted using an attention mechanism. To further enhance the robustness and generalization of the model, a question-word attention mechanism is introduced, enabling the attention matrix to better capture the intentions expressed by interrogative words, thus strengthening the overall feature representation of the question. Finally, dual-channel feature information is fused to obtain the final textual representation. Experimental results on the D-SDQC and D-TQC datasets show that DFF-SDQC achieves an F1-score of 93.18% on D-SDQC, improving 2.22 percentage points over the strongest baseline, demonstrating the effectiveness of the dual-channel fusion and attention design.

1. Introduction

Question classification plays a pivotal role in question answering (QA) systems, as it enables mapping user queries to predefined categories for efficient answer retrieval [1]. Existing research on question classification has primarily focused on real-time queries such as temporal, entity-based, and descriptive questions. For example, the TREC conference [2] has long emphasized fact-based question classification. In contrast, domain-specific QA tasks involve not only factual queries but also large amounts of professional, knowledge-intensive questions. In high-resource languages such as English [3] and Chinese [4], numerous studies have been proposed to improve the effectiveness of question classification models. While existing research has achieved considerable progress in high-resource languages such as English and Chinese, low-resource languages like Kazakh remain underexplored, facing challenges such as data sparsity, morphological complexity, and insufficient corpus resources.

To address these challenges, we propose the Dual-Channel Feature Fusion Network for Sheep Diseases Question Classification (DFF-SDQC), tailored for the veterinary domain. Unlike traditional approaches that struggle with sparse and ambiguous short-text representations, DFF-SDQC integrates dynamic word embeddings from a pre-trained CINO model with bidirectional sequence features from BiLSTM. In addition, a novel question-word attention mechanism is designed to capture the intent expressed by interrogatives, while an enhanced convolutional layer strengthens global and contextual feature representations. Furthermore, we construct a domain-specific dataset for sheep disease question classification in Kazakh, covering 32 distinct question types. Experimental results show that DFF-SDQC consistently outperforms competitive baselines, highlighting its robustness and generalization ability in both low-resource and specialized domains.

The remainder of this paper is structured as follows. The subsequent section provides an overview of related work. Section 3 presents our proposed model. Section 4 outlines our experimental setup, including datasets, baselines, implementation details, experimental results, and analysis. Finally, Section 5 concludes the paper and discusses future directions.

2. Related work

There are a lot of question classification tasks and approaches, and we briefly review the most widely used methods in this paper. Question classification is a fundamental text classification task with significant applications in natural language processing (NLP) fields such as question answering (QA) and dialogue systems. The earliest approaches were rule-based, where a predefined set of rules guided the extraction of semantic information from text.
While these methods could achieve satisfactory classification results, they required extensive handcrafted rules and exhibited poor generalization. For example, Hovy et al. [5] employed rule-based strategies to represent text with handcrafted rules for classification, and Brill et al. [6] applied regular expressions for text classification. However, such approaches were inherently limited by subjective human judgments.

Traditional machine learning methods alleviated the dependence on handcrafted rules by leveraging larger corpora or iterative optimization, though they still required large-scale annotated datasets. Metzler et al. [7] applied radial basis kernel functions combined with multiple feature fusion techniques for English question classification, while Zhang et al. [8] introduced tree kernels to allow support vector machines to exploit syntactic structures of questions. Nevertheless, these methods still relied heavily on manual feature engineering, which constrained their scalability and efficiency.

With the rapid advancement of NLP, many scholars have turned to deep learning techniques to enhance the performance of question classification [9]. Kim et al. [10] proposed a convolutional neural network (CNN)-based sentence classific (...truncated)