Artificial Intelligence-Generated Text in Higher Education – Usage and Detection in the Literature

https://indecs.eu/2024/indecs2024-pp238-245.pdf

Interdisciplinary Description of Complex Systems 22(3), 238-245, 2024

ARTIFICIAL INTELLIGENCE-GENERATED TEXT IN HIGHER EDUCATION – USAGE AND DETECTION IN THE LITERATURE

László Berek*

Óbuda University, University Library
Budapest, Hungary

DOI: 10.7906/indecs.22.3.1
Regular article
Received: 20 May 2024.
Accepted: 15 June 2024.

ABSTRACT

Since the launch of ChatGPT in November 2022, artificial intelligence has become increasingly widespread in all areas of life. Generative applications of artificial intelligence are proliferating in a wide range of fields. The technology has great potential for applications such as machine translation, voice recognition, education, or content creation, but it also raises concerns about misuse, ethical use, and plagiarism. As texts generated by artificial intelligence tools continue to improve, the detection tools on the market will require additional effort to keep pace. This article uses data from the Scopus and Web of Science databases to map the current usability of detectors of AI-generated text in higher education and academia. One aim of the article is to provide insight into experiences with currently available detectors of AI-generated text in higher education.

KEY WORDS
artificial intelligence, AI-generated text detector, academic integrity, plagiarism, higher education

CLASSIFICATION
ACM: I.2.0, I.2.6, I.2.7
APA: 3550
JEL: I21

*Corresponding author: +36 (1) 666-55976; Óbuda University, University Library, Bécsi út 96/B, H-1034 Budapest, Hungary

INTRODUCTION

Artificial intelligence (AI) and Natural Language Processing (NLP) have made significant progress over the past decade, and in the last few years the solutions and opportunities offered by the new technology have spread to all areas of life.
In addition to many other areas, generative AI for textual content is making huge strides forward. The new technology offers great potential for applications such as machine translation, voice recognition, education, or content creation, but it also raises concerns about misuse, ethical use, and plagiarism.

In recent decades, higher education institutions have made great strides towards detecting plagiarism by students and researchers, with the help of the steadily improving plagiarism detection systems available on the market. In many universities, plagiarism checks are a required part of the education system for students' midterm papers, theses, and dissertations. At Óbuda University, for example, a plagiarism detector has been part of the institutional repository under the control of the University Library since 2011. It is used not only to check students' theses but also to check for plagiarism in the university's journals and other publications [1, 2].

In the literature, the use of AI-generated text is commonly conflated with plagiarism or treated as part of the concept of plagiarism. This is understandable to the extent that it can be seen as a kind of unconscious plagiarism: the generative AI creates the text using available, previously published works, but in most cases the sources used by the AI are not visible in the final result. There are, of course, exceptions: platforms and systems in which inserting the appropriate reference is a function of the software. In research on AI-generated writing, the phenomenon is often referred to as patchwriting or cryptomnesia. That research focuses on the conceptual definition of the phenomenon [3].

To create an AI-generated text, systems use huge amounts of text and other data available online (online content, books, journals, web pages, etc.).
By recognising and further learning language patterns, relations, and contexts, they can evolve to create content similar to the original human-written texts in their datasets. This is where the problem begins: these generated texts are often not easily identifiable as such to the human eye. With the rapid advances in technology and in the learning process, it can be expected that generated texts will continue to improve in the future.

The rise of generative AI in higher education is illustrated by the BestColleges survey conducted in autumn 2023. The survey included 1000 respondents who were studying at a university or college in the US at the time. Students were asked several questions related to AI use. 56% of students reported that they had already used an AI tool to complete assignments. In addition, 54% of respondents agreed with the statement that using AI to complete assignments is cheating or plagiarism [4]. The distribution of responses to this question is shown in Figure 1. A survey conducted six months earlier (March 2023), also by BestColleges, likewise asked whether students use AI tools to solve problems. The rapid growth in the use of AI tools is shown by the fact that six months earlier only 22% of students answered yes to the question [5].

Continued growth of artificial intelligence, and in particular of generative AI, is predicted for the coming years and decades. Bloomberg's autumn 2023 forecast shows the evolution of the generative AI market between 2020 and 2032. According to these figures, the market grew from US$14 billion in 2020 to US$900 billion in 2023. The forecast is shown in Figure 2 [6].

[Figure 1: pie chart with the response categories Yes, No, and Neutral; extracted values: 25%, 54%, 21%.]
Figure 1. Using AI Tools to Complete Assignments or Exams is Cheating or Plagiarism, BestColleges 2023 [4].

Figure 2. Generative AI revenue worldwide from 2020 with forecast until 2032 (in billion U.S. dollars) [6].
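The two BestColleges figures on AI tool use (22% in March 2023 and 56% in autumn 2023, the latter from 1000 respondents) can be put in perspective with some simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the margin of error assumes a simple random sample, which the surveys cited here do not state.

```python
import math


def percentage_point_change(earlier_pct: float, later_pct: float) -> float:
    """Absolute change between two percentages, in percentage points."""
    return later_pct - earlier_pct


def relative_change(earlier_pct: float, later_pct: float) -> float:
    """How many times larger the later percentage is than the earlier one."""
    return later_pct / earlier_pct


def margin_of_error(pct: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error (in percentage points) for a proportion.

    ASSUMPTION: simple random sampling; the source does not describe
    the sampling design of the survey.
    """
    p = pct / 100.0
    return z * math.sqrt(p * (1.0 - p) / n) * 100.0


# Figures quoted in the text.
march_2023_pct = 22.0   # share answering yes in March 2023 [5]
autumn_2023_pct = 56.0  # share answering yes in autumn 2023 [4]
autumn_respondents = 1000

if __name__ == "__main__":
    print("Change (percentage points):",
          percentage_point_change(march_2023_pct, autumn_2023_pct))
    print("Relative change (ratio):",
          round(relative_change(march_2023_pct, autumn_2023_pct), 2))
    print("Margin of error, autumn 2023 (percentage points):",
          round(margin_of_error(autumn_2023_pct, autumn_respondents), 2))
```

Under these assumptions the increase of 34 percentage points is far larger than the sampling uncertainty of the autumn figure, which supports reading the change as real growth in use rather than survey noise.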
The literature review focuses on the role of generative AI in higher education institutions and academia. A review of research results is presented to explore the effectiveness of detectors of AI-generated text. The research also addresses the regulation of generative AI in higher education.

MATERIALS AND METHODS

The two major scientific databases used for the bibliographic search were Scopus and Web of Science. Zotero reference management software was used for data collection and further processing. Rayyan software was used for the deduplication of publications and for the screening and selection stage.

CRITERIA AND LIMITATIONS

The main data source for the study was Scopus; the data collected were supplemented by the results of a search of the Web of Science database. Additional data, mainly statistical, were collected from the Statista database. Searches were run in May 2024 in both the Scopus and Web of Science databases. The searches did not exclude conference proceedings or book chapters. All content indexed in these databases was included in the analysis. Several keywords were specified in order to identify relev (...truncated)