Sentiment analysis of MOOC reviews via ALBERT-BiLSTM model


MATEC Web of Conferences 336, 05008 (2021)
CSCNS2020
https://doi.org/10.1051/matecconf/202133605008

Cheng Wang1, Sirui Huang2, and Ya Zhou1,*

1 Guangxi Key Lab of Trusted Software, Guilin University of Electronic Technology, 541004 Guilin Guangxi, China
2 Electronic and Electrical Engineering Department, University College London, WC1E 6BT London, UK

Abstract. Accurately mining the sentiment information in comments on Massive Open Online Courses (MOOCs) plays an important role in improving course quality and promoting the sustainable development of MOOC platforms. At present, most sentiment analyses of MOOC course comments are broad in scope, while relatively little attention is paid to finer-grained issues such as polysemous words and familiar words that have acquired new meanings. As a result, the models used to identify the true sentiment tendency of course comments have low accuracy. For this reason, this paper proposes an ALBERT-BiLSTM model for sentiment analysis of MOOC course comments. First, ALBERT is used to generate word vectors dynamically. Second, contextual feature vectors are obtained from the forward and backward sequences of a BiLSTM, combined with an attention mechanism that calculates the weight of each word in a sentence. Finally, the BiLSTM output vectors are fed into Softmax to classify sentiment and predict the sentiment tendency. The experiment was performed on a real data set of MOOC course comments. The results show that the proposed model achieves higher accuracy than existing models.

1 Introduction

With the rapid development of Internet technology, the Massive Open Online Course (MOOC) online learning platform has attracted wide attention.
Many learners leave comments in the comment section when taking courses. These comments not only evaluate course quality but also give direct feedback on technical problems with the MOOC platform. Because it is extremely difficult to compile statistics on and analyze such a large volume of comments manually, it is better to apply sentiment analysis to obtain the sentiment tendency of comment texts and to extract and explore the information in massive amounts of comment data. This helps learners choose courses and helps platform administrators identify problems.

* Corresponding author:
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).

The existing sentiment analyses of MOOC course comments can be roughly divided into three categories. The method based on a sentiment dictionary is an important way of analyzing textual sentiment. Araque et al. [1] indicated that the result is influenced to a certain extent by choosing vocabulary from data sets in different fields. Kim et al. [2] improved the accuracy of textual sentiment analysis by extending an existing sentiment dictionary with manually collected sentiment vocabulary. However, sentiment dictionaries suffer from poor generality and poor timeliness.

In traditional machine learning, sentiment analysis is accomplished through feature engineering. Cai et al.
[3] first structured the text to be processed using a sentiment dictionary and then applied a Gradient Boosting Decision Tree (GBDT) model for training and prediction, achieving better results than a single model. However, extracting features manually with traditional machine learning methods inevitably incurs a high labor cost, so this approach is not suitable for exploring and analyzing massive amounts of course comment data.

Currently, deep learning has become the mainstream technology for textual sentiment analysis. Long et al. [4] analyzed comments on social media, overcame the deficiencies of traditional sentiment analysis models, and achieved good results on numerous corpora from different fields. Devlin et al. [5] proposed the BERT pre-trained language model, which performs well on text classification tasks. It abandons the traditional convolutional and recurrent neural network (RNN) structures and builds the overall network model from the Transformer structure. During pre-training, it learns the grammar and semantics of the language incrementally. However, when encoding contextual information, the BERT pre-trained model relies only on the attention mechanism and does not consider part of speech, so it can make misjudgments. Google released the ALBERT model in 2019. Compared with BERT, it uses fewer parameters and takes up less memory, which greatly improves training speed and accuracy. At present, few studies use the ALBERT pre-trained model for sentiment analysis of MOOC course comments. On this basis, we propose the ALBERT-BiLSTM model for sentiment analysis of MOOC course comments.

2 Related work

BERT applies a Transformer encoder with a self-attention mechanism throughout the pre-training process.
As a multi-task model, it captures the bidirectional relationships in sentences more thoroughly and learns bidirectional language representations in all layers. However, BERT requires a large number of parameters and consumes huge resources, so ALBERT, a lightweight version of BERT, is adopted in this paper. To a certain extent, ALBERT addresses BERT's large parameter count and heavy resource usage through three methods: factorization of the embedding parameters, cross-layer parameter sharing, and an inter-sentence coherence loss.

In sequence modeling, an RNN can capture long-term dependencies and obtain word vectors with global contextual information. As a variant of the RNN, LSTM alleviates the vanishing gradient problem to a certain extent, but it can only process sequence data in the forward direction, whereas processing the sequence in the backward direction is also very important for textual sentiment classification. BiLSTM, whose basic component is still the LSTM, consists of a forward LSTM and a backward LSTM. It therefore uses two mutually independent hidden layers to process the data in the forward and backward directions simultaneously and so obtains complete semantic information.
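
The bidirectional processing described for BiLSTM can be illustrated with a small plain-Python sketch. This is not the authors' implementation: the class LSTMCell, the helper functions, the random weight initialization, and the tiny dimensions are all assumptions made for illustration, and the hand-written token vectors stand in for the ALBERT word vectors that the paper feeds into the BiLSTM. The sketch only shows how a forward LSTM and an independent backward LSTM each read the sequence and how their hidden states are concatenated per token.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


class LSTMCell:
    """Hypothetical minimal LSTM cell with input, forget and output gates."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and one bias vector per gate:
        # i (input), f (forget), o (output), g (candidate state).
        self.W = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                   for _ in range(hidden_size)]
            for gate in ("i", "f", "o", "g")
        }
        self.b = {gate: [0.0] * hidden_size for gate in ("i", "f", "o", "g")}

    def step(self, x, h, c):
        z = x + h  # list concatenation of the input and the previous hidden state
        pre = {
            gate: [a + b for a, b in zip(matvec(self.W[gate], z), self.b[gate])]
            for gate in self.W
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        # New cell state: keep part of the old state, add part of the candidate.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, sequence):
    """Run one LSTM over the sequence and return the hidden state per step."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    outputs = []
    for x in sequence:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def bilstm(forward_cell, backward_cell, sequence):
    """Concatenate forward and backward hidden states for every token."""
    forward_out = run_lstm(forward_cell, sequence)
    # The backward LSTM reads the reversed sequence; its outputs are
    # reversed again so that index t lines up with token t.
    backward_out = run_lstm(backward_cell, sequence[::-1])[::-1]
    return [fw + bw for fw, bw in zip(forward_out, backward_out)]


rng = random.Random(0)
forward_cell = LSTMCell(input_size=3, hidden_size=2, rng=rng)
backward_cell = LSTMCell(input_size=3, hidden_size=2, rng=rng)
tokens = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5], [0.4, 0.4, -0.2]]
features = bilstm(forward_cell, backward_cell, tokens)
print(len(features), len(features[0]))
```

With a hidden size of 2 in each direction, every one of the three tokens receives a 4-dimensional feature vector. The first half of each vector depends only on the tokens up to that position and the second half only on the tokens from that position onward, which is why the two halves together carry context from both sides.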