Tibetan interrogative sentence recognition and classification based on phrase features

MATEC Web of Conferences, Jan 2021

Recognizing Tibetan interrogative sentences is a basic task in natural language processing, with wide application value in Tibetan syntactic analysis, semantic analysis, intelligent question answering, search engines and other research fields. Taking interrogative pronouns as an entry point and analyzing the phrase features before and after them, the paper designs a phrase-feature-based model for recognizing and classifying Tibetan interrogative sentences. Experimental results show that the recognition accuracy, recall rate and F value of this method are 98.21%, 100.00% and 99.10%, respectively, and that the average classification accuracy, recall rate and F value are 96.98%, 100.00% and 98.39%, respectively.
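The reported recognition figures can be cross-checked with a short calculation. The sketch below assumes that "accuracy" in the abstract plays the role of precision and that the F value is the balanced harmonic mean of precision and recall; the source does not state the formula, and the function name `f_value` is illustrative only.

```python
def f_value(precision: float, recall: float) -> float:
    """Balanced F measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recognition figures from the abstract, read as precision and recall (assumption).
recognition_f = f_value(0.9821, 1.0)
print(f"{recognition_f:.2%}")  # prints 99.10%
```

Under the same assumption, plugging in the averaged classification figures (96.98%, 100.00%) gives about 98.47% rather than the reported 98.39%. This would be expected if the reported value is an average of per-class F values, since that average need not equal the F value of the averaged inputs.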


MATEC Web of Conferences 336, 06017 (2021)
CSCNS2020
https://doi.org/10.1051/matecconf/202133606017

Mabao Ban1,3,4, Zhijie Cai1,2,3,4*, Rangzhuoma Cai1,2,3,4, and Rangjia Cai1,3,4

1College of Computer Science and Technology, Qinghai Normal University, Qinghai Xining 810016, China
2School of Computer Science and Technology, Southwest Minzu University, Sichuan Chengdu 610041, China
3Tibetan Information Processing and Machine Translation Key Laboratory of Qinghai Province, Qinghai Xining 810008, China
4Key Laboratory of Tibetan Information Processing, Ministry of Education, Qinghai Xining 810008, China
* Corresponding author: Zhijie Cai

1 Introduction

With the development of computer technology, research on Tibetan natural language processing has gradually moved from the word level to the sentence level. The interrogative sentence is a common Tibetan sentence pattern, and its recognition and classification is one of the key technologies in Tibetan syntactic analysis, semantic analysis, intelligent question answering, search engines and other tasks.
Commonly used methods for recognizing sentences and sentence patterns include rule-based methods, statistical methods, and combinations of the two. There is a large literature on Chinese sentence pattern recognition: [1-4] employ different methods to identify and classify Chinese subjective sentences, explanatory opinion sentences, opinion sentences, and graceful sentences, all with good experimental results. For Tibetan, because sentences have no obvious boundary symbol, current research mainly focuses on sentence boundary recognition technology [5-14], which provides a theoretical basis for the study of Tibetan sentence boundary recognition. No research on Tibetan sentence pattern recognition and classification technology has been reported. Research shows that identifying and classifying different sentence patterns can improve the performance of question answering systems. This paper therefore analyzes the phrase features before and after interrogative pronouns.

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).

2 Tibetan interrogative sentence recognition and classification based on phrase features

2.1 Tibetan interrogative sentence recognition and classification model

In written Tibetan, each interrogative sentence contains at least one interrogative pronoun with distinct structural features. Taking interrogative pronouns as the starting point, this paper designs a Tibetan interrogative sentence recognition and classification model based on phrase features, as shown in Fig.1.
[Fig.1. Tibetan interrogative sentence recognition and classification model based on phrase features. Diagram labels: Tibetan sentence bank; phrase feature analysis module (interrogative word recognition: ry1, ry2, ry3; analysis of phrase features: Feature1, Feature2, …, Feature8); interrogative sentence recognition module (TIS1, TIS2, …, TIS8).]

The model consists of a phrase feature analysis module and an interrogative sentence recognition module. The phrase feature analysis module has two parts: interrogative word recognition and phrase feature analysis. The interrogative word recognition part identifies interrogative pronouns and labels them ry1, ry2, or ry3. The phrase feature analysis part obtains the phrase feature (Feature1, Feature2, ..., or Feature8) of the corresponding interrogative sentence by analyzing the phrases before and after the interrogative pronoun ry. The interrogative sentence recognition module then uses these phrase features to recognize and classify Tibetan interrogative sentences.

2.2 An analysis of the features of Tibetan interrogative sentences

The Tibetan interrogative sentence is a sentence pattern classified according to the mood of the sentence: it asks others about the type and nature of the things in question [15-18]. Compared with declarative, imperative and exclamatory sentences, Tibetan interrogative sentences differ clearly in mood and emotional color. However, current technology cannot identify interrogative sentences by mood and emotional color. By analyzing the structural features of Tibetan interrogative sentences, we find that each one contains at least one interrogative word (tagged as interrogative pronoun ry in the part-of-speech tag set, and referred to as an interrogative pronoun below). Tibetan interrogative pronouns are clearly defined and limited in number.
In order to analyze the features of interrogative sentences, we divide interrogative pronouns into three categories, as shown in Table 1.

Table 1. Classification of Tibetan interrogative pronouns.

Serial number    Type    Interrogative pronouns
1                ry1     གམ་ངམ་དམ་ནམ་བམ་མམ་འམ་རམ་ལམ་སམ
2                ry2     ཅི་ཇི་�་གང་�་ནམ
3                ry3     ཨེ

In Table 1, every interrogative pronoun except "ནམ" belongs to exactly one type, so there is no multi-category problem. The type of "ནམ" can be judged from its position and context: when it appears after a verb, adjective or auxiliary verb, it belongs to ry1; otherwise it belongs to ry2. Employing interrogative pronouns as an entry point, we analyze the grammatical structure and structural characteristics of Tibetan interrogative sentences. According to the different combination characteristics of interrogative pronouns and their contexts, we can divide Tibetan interrogative sentences into general interrogative sentence (TIS1), emphatic interrogative sentence (TIS2), specific interrogative sentence (...truncated)
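The lookup in Table 1, together with the positional rule for "ནམ", can be sketched as a small function. This is a minimal illustration and not the authors' implementation: the function name `classify_pronoun`, the input format (a token plus the part-of-speech tag of the preceding word) and the tag names `v`, `a` and `vu` are all assumptions, since the source names only the `ry` tag. The two ry2 entries that are garbled in the source text are left out rather than guessed.

```python
from typing import Optional

# Inventory copied from Table 1. The two ry2 entries that are unreadable in
# the source text are omitted on purpose.
RY1 = {"གམ", "ངམ", "དམ", "ནམ", "བམ", "མམ", "འམ", "རམ", "ལམ", "སམ"}
RY2 = {"ཅི", "ཇི", "གང", "ནམ"}
RY3 = {"ཨེ"}

# Hypothetical POS tags for verb, adjective and auxiliary verb (assumption).
VERBAL_TAGS = {"v", "a", "vu"}

AMBIGUOUS = "ནམ"  # the only pronoun listed under both ry1 and ry2


def classify_pronoun(token: str, prev_tag: Optional[str]) -> Optional[str]:
    """Return 'ry1', 'ry2' or 'ry3' for an interrogative pronoun, else None.

    prev_tag is the POS tag of the word immediately before the token,
    or None when the token starts the sentence.
    """
    if token == AMBIGUOUS:
        # Positional rule from the text: after a verb, adjective or
        # auxiliary verb the pronoun is ry1, otherwise ry2.
        return "ry1" if prev_tag in VERBAL_TAGS else "ry2"
    if token in RY1:
        return "ry1"
    if token in RY2:
        return "ry2"
    if token in RY3:
        return "ry3"
    return None


print(classify_pronoun("ནམ", "v"))  # ry1
print(classify_pronoun("ནམ", "n"))  # ry2
```

A sentence for which this function returns None on every token contains no listed interrogative pronoun and so, by the observation in Section 2.2, would not be treated as an interrogative sentence. Deciding among TIS1 to TIS8 additionally requires the phrase features, which this sketch does not cover.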


This is a preview of a remote PDF: https://www.matec-conferences.org/articles/matecconf/pdf/2021/05/matecconf_cscns20_06017.pdf
Article home page: https://doaj.org/article/65d74645bd264708a70db4f0eff4e297

Ban Mabao, Cai Zhijie, Cai Rangzhuoma, Cai Rangjia. Tibetan interrogative sentence recognition and classification based on phrase features, MATEC Web of Conferences, 2021, pp. 06017, Issue 336, DOI: 10.1051/matecconf/202133606017