Dependency-based Siamese long short-term memory network for learning sentence representations

PLOS ONE, March 2018

Textual representations play an important role in the field of natural language processing (NLP). The efficiency of NLP tasks, such as text comprehension and information extraction, can be significantly improved with proper textual representations. As neural networks are gradually applied to learn the representation of words and phrases, fairly efficient models of learning short text representations have been developed, such as the continuous bag of words (CBOW) and skip-gram models, and they have been extensively employed in a variety of NLP tasks. Because of the complex structure generated by longer texts, such as sentences, algorithms appropriate for learning short textual representations are not applicable for learning long textual representations. One method of learning long textual representations is the Long Short-Term Memory (LSTM) network, which is suitable for processing sequences. However, the standard LSTM does not adequately address the primary sentence structure (subject, predicate and object), which is an important factor for producing appropriate sentence representations. To resolve this issue, this paper proposes the dependency-based LSTM model (D-LSTM). The D-LSTM divides a sentence representation into two parts: a basic component and a supporting component. The D-LSTM uses a pre-trained dependency parser to obtain the primary sentence information and generate supporting components, and it also uses a standard LSTM model to generate the basic sentence components. A weight factor that can adjust the ratio of the basic and supporting components in a sentence is introduced to generate the sentence representation. Compared with the representation learned by the standard LSTM, the sentence representation learned by the D-LSTM contains a greater amount of useful information. The experimental results show that the D-LSTM is superior to the standard LSTM on the Sentences Involving Compositional Knowledge (SICK) dataset.
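To make the architecture sketched in the abstract concrete, the following PyTorch fragment is a minimal, hedged sketch of how a basic component (an LSTM over the whole word sequence) and a supporting component (an LSTM over only the words selected by a dependency parse) could be blended by a weight factor. It is not the authors' implementation; the scalar gate g, the dimensions and the way the core words are fed in are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class DLSTMSketch(nn.Module):
    """Illustrative sketch only (not the paper's code): blends a 'basic'
    sentence vector (last hidden state of an LSTM over all words) with a
    'supporting' vector (an LSTM over the subject/predicate/object words
    selected by a dependency parser) via a scalar weight factor g."""

    def __init__(self, embed_dim=300, hidden_dim=50, g=0.7):
        super().__init__()
        self.basic_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.support_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.g = g  # assumed ratio of basic vs. supporting component

    def forward(self, word_embeds, core_embeds):
        # word_embeds: (batch, sentence_len, embed_dim) -- the whole sentence
        # core_embeds: (batch, core_len, embed_dim) -- parser-selected core words
        _, (h_basic, _) = self.basic_lstm(word_embeds)
        _, (h_support, _) = self.support_lstm(core_embeds)
        # Weighted blend into one fixed-length sentence representation
        return self.g * h_basic[-1] + (1.0 - self.g) * h_support[-1]
```

In a Siamese setup, the same encoder would be applied to both sentences and the two resulting vectors compared, for example with a Manhattan-distance similarity exp(-||h1 - h2||_1) as is common in Siamese LSTM similarity models; the exact comparison function used in the paper is not reproduced here.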



Wenhao Zhu, Tengjun Yao, Jianyue Ni, Baogang Wei, Zhiguo Lu

Affiliations: School of Computer Engineering and Science, Shanghai University, Shanghai, China; College of Computer Science and Technology, Zhejiang University, Zhejiang, China; Library of Shanghai University, Shanghai University, Shanghai, China

Editor: Xuchu Weng, Hangzhou Normal University, China

Funding: This work was supported by the National Natural Science Foundation of China (No. 61572434 and No. 61303097) to WZ. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Learning textual representations is a vital part of natural language processing (NLP) and is important for subsequent NLP tasks. Recently, the study of representations of phrases and sentences has attracted the attention of many researchers, who have achieved a degree of success [1]. Studies of short textual representations have produced a number of achievements; Mikolov's continuous bag of words (CBOW) model and the skip-gram model (continuous skip-gram model) are among the best known. The word representations learned by these models perform relatively well in many NLP tasks, including word analogies [2, 3].
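As a concrete illustration of the word-level models mentioned above, the snippet below trains a skip-gram model with gensim (used here only as a stand-in toolkit; it is not referenced in the paper) and issues a word-analogy query. The toy corpus and all hyperparameters are placeholders.

```python
from gensim.models import Word2Vec

# Toy corpus: in practice this would be a large tokenised text collection.
corpus = [
    ["the", "king", "rules", "the", "country"],
    ["the", "queen", "rules", "the", "country"],
    ["a", "man", "and", "a", "woman", "walk", "home"],
]

# sg=1 selects the skip-gram model, sg=0 the CBOW model
# (gensim >= 4.0 API; older versions use `size` instead of `vector_size`).
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)

# Word-analogy style query (king - man + woman); a realistic corpus is
# needed for a meaningful answer -- shown only for the API shape.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"]))
```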
Recently, interest has shifted towards extending these ideas beyond the individual word level to larger bodies of text, such as sentences. Researchers have tried to learn sentence representations directly as the sum or average of word representations, and this has achieved satisfactory results for certain simple NLP tasks [4]. Because of the variable length and complex structure of sentences, however, these simple algorithms cannot handle complex tasks (such as evaluating the similarity between two sentences). To resolve this problem, Kiros, Tai and Le have proposed methods of learning fixed-length sentence representations [5–7]. Among all models for learning sentence representations, recurrent neural network (RNN) models, especially the Long Short-Term Memory (LSTM) model [8], are among the most appropriate for processing sentences, and they have achieved substantial success in text categorization [9] and machine translation [10]. Therefore, this paper also builds on LSTM networks, proposing a dependency-based Siamese LSTM model (D-LSTM) for better performance. In this paper, a sentence representation is composed of two parts, namely, the basic component and the supporting component. We have improved upon the traditional method, which employs a standard LSTM to learn sentence representations, and proposed the D-LSTM, which uses sentence dependency information to learn sentence representations. The D-LSTM can read sentences of different lengths and generate fixed-length representations. The basic component, which contains (...truncated)
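The introduction above refers to a pre-trained dependency parser that supplies the primary sentence information (subject, predicate, object). The paper does not say which parser was used; the sketch below uses spaCy as a stand-in to show what that extraction step could look like, with the set of dependency labels chosen for illustration.

```python
import spacy

# spaCy serves here only as an example pre-trained dependency parser.
nlp = spacy.load("en_core_web_sm")

def primary_tokens(sentence):
    """Return the words a dependency parse marks as the sentence core:
    the root predicate, its (passive) subject and its direct object."""
    doc = nlp(sentence)
    keep = {"ROOT", "nsubj", "nsubjpass", "dobj"}
    return [tok.text for tok in doc if tok.dep_ in keep]

print(primary_tokens("A man is playing a guitar in the park."))
# e.g. ['man', 'playing', 'guitar'] -- the exact output depends on the parser model
```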


This is a preview of a remote PDF: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0193919&type=printable

Wenhao Zhu, Tengjun Yao, Jianyue Ni, Baogang Wei, Zhiguo Lu. Dependency-based Siamese long short-term memory network for learning sentence representations, PLOS ONE, 2018, Volume 13, Issue 3, DOI: 10.1371/journal.pone.0193919