Social media platforms usually contain several modified versions of an image. This proliferation of versions calls the trustworthiness of social media images into question. We propose a novel framework to find modified versions of social media images using only their metadata. We consider several aspects to determine if an image is a modified version of another image. These aspects include topic of...
The expansion of Graph Neural Networks (GNNs) has highlighted the importance of evaluating their performance in real-world scenarios. However, existing evaluation frameworks often overlook the integration of causality, a component that is essential for more robust evaluation of GNNs. To address this gap, we present a benchmark study that systematically compares standard...
As databases become larger and less structured, the JavaScript Object Notation (JSON) data format has risen in usage compared to other data formats like XML. At the same time, while extracting data from these large datasets efficiently is of obvious importance, there has been far less research on the optimization of JSON queries than there has been on the querying of...
In the Web3.0 era, which does not rely on any centralized organization and emphasizes user control, security, trustworthiness, and the importance of data privacy, blockchain plays a key role. Its decentralization, security, trustworthiness, and other characteristics have become Building the infrastructure of trusted interconnection and value interconnection in the Web3.0 era has...
Code-mixing and code-switching (CMCS) are prevalent phenomena observed in social media conversations and various other modes of communication. When developing applications such as sentiment analysers and hate-speech detectors that operate on this social media data, CMCS text poses challenges. Recent studies have demonstrated that prompt-based learning of pre-trained language...
Android malware has become an increasingly important threat to individuals, organizations, and society, posing significant risks to data security, privacy, and infrastructure. As malware evolves in sophistication and complexity, the detection and mitigation of these malicious software instances have become more challenging and time-consuming since the required number of features...
The growing need for data publication and the escalating concerns regarding data privacy have led to a surge in interest in Privacy-Preserving Data Publishing (PPDP) across research, industry, and government sectors. Despite its significance, PPDP remains a challenging NP-hard problem, particularly when dealing with complex datasets, often rendering traditional traversal search...
The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training and model parameters, the capability of large language models has been dramatically improved, leading to human-like performance in understanding, language synthesis, common-sense reasoning, etc. Such a major leap forward in general AI...
Identifying cases of intellectual property violation in multimedia files poses significant challenges for the Internet infrastructure, especially when dealing with extensive document collections. Typically, techniques used to tackle such issues can be categorized into one of two groups: proactive and reactive approaches. This article introduces an approach combining both...
Cloud service providers offer application providers virtually infinite storage and computing resources, while providing cost-efficiency and various other quality of service (QoS) properties through a storage-as-a-service (StaaS) approach. Organizations also use multi-cloud or hybrid solutions by combining multiple public and/or private cloud service providers to avoid vendor...
For modern information systems, robust access control mechanisms are vital in safeguarding data integrity and ensuring the entire system’s security. This paper proposes a novel semi-supervised learning framework that leverages heterogeneous graph neural network-based embedding to encapsulate both the intricate relationships within the organizational structure and interactions...
Humanity needs to deliberate effectively at scale about highly complex and contentious problems. Current online deliberation tools—such as email, chatrooms, and forums—are, however, plagued by levels of discussion toxicity that deeply undercut the willingness and ability of the participants to engage in thoughtful, meaningful deliberations. This has led many organizations to...
Recommending medications with electronic health records (EHRs) is a challenging task for data-driven clinical decision support systems. Most existing models learn representations for medical concepts based on EHRs and make recommendations with the learnt representations. However, most medications appear in EHR datasets only a limited number of times (the frequency distribution of medications...
Coverage Centrality is an important metric to evaluate vertex importance in road networks. However, current solutions have to compute the coverage centrality of all the vertices together, which wastes resources, especially when the centrality of only some vertices is required. In addition, they adapt poorly to dynamic scenarios because of their computational inefficiency. In...
Leveraging previously untapped data sources offers significant potential for value creation in the manufacturing sector. However, asset-heavy shop floors, extended machine replacement cycles, and equipment diversity necessitate considerable investments for achieving smart manufacturing, which can be particularly challenging for small businesses. Retrofitting presents a viable...
The privacy-preserving data publishing (PPDP) problem has gained substantial attention from research communities, industries, and governments due to the increasing requirements for data publishing and concerns about data privacy. However, achieving a balance between preserving privacy and maintaining data quality remains a challenging task in PPDP. This paper presents an...
With Bitcoin universally recognized as the most popular cryptocurrency, more Bitcoin transactions are expected to be submitted to the Bitcoin blockchain system. As a result, many transactions can encounter different confirmation delays. Given this, it becomes vital to help a user understand (if possible) how long it may take for a transaction to be confirmed in...
Entity alignment plays an essential role in the integration of knowledge graphs (KGs) as it seeks to identify entities that refer to the same real-world objects across different KGs. Recent research has primarily centred on embedding-based approaches. Among these approaches, there is a growing interest in graph neural networks (GNNs) due to their ability to capture complex...
Predicting the severity of an illness is crucial in intensive care units (ICUs) if a patient’s life is to be saved. The existing prediction methods often fail to provide sufficient evidence for time-critical decisions required in dynamic and changing ICU environments. In this research, a new method called MM-RNN (multi-task memory-fused recurrent neural network) was developed to...
Event extraction is an important but challenging task. Many existing techniques decompose it into event and argument detection/classification subtasks, which are complex structured prediction problems. Generation-based extraction techniques lessen the complexity of the problem formulation and are able to leverage the reasoning capabilities of large pretrained language models...
In recent years, preventing the widespread transmission of infectious diseases in communities has been a research hot spot. Tracing close contact with infected individuals is one of the most pressing problems. In this work, we present a model called Follower Prediction Graph Network (FPGN) to identify high-risk visitors, a task known as follower prediction. The model is...
Given a user dataset $\boldsymbol{U}$ and an object dataset $\boldsymbol{I}$, a kNN join query in high-dimensional space returns the $\boldsymbol{k}$ nearest neighbors of each object in dataset $\boldsymbol{U}$ from the object dataset $\boldsymbol{I}$. The kNN join is a basic and necessary operation in many applications, such as databases, data mining, computer vision, multi-media...
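The kNN join defined in the abstract above can be illustrated with a minimal brute-force sketch. This is not the method the paper proposes; it only shows what the query returns. The function name `knn_join`, the use of Euclidean distance, and the sample points are all assumptions made for illustration.

```python
import heapq
import math


def knn_join(user_points, item_points, k):
    """Brute-force kNN join (illustrative baseline, not the paper's method).

    For every point in user_points, return the indices of its k nearest
    points in item_points under Euclidean distance (an assumed metric).
    """
    result = []
    for u in user_points:
        # Pair each distance with the item index so ties break by index.
        dists = ((math.dist(u, v), j) for j, v in enumerate(item_points))
        nearest = heapq.nsmallest(k, dists)
        result.append([j for _, j in nearest])
    return result


# Hypothetical toy data: two "users" and four "objects" in 2-D.
user_points = [(0.0, 0.0), (5.0, 5.0)]
item_points = [(1.0, 0.0), (0.0, 2.0), (4.0, 5.0), (9.0, 9.0)]
neighbors = knn_join(user_points, item_points, k=2)
print(neighbors)
```

This baseline computes every pairwise distance, so its cost grows with the product of the two dataset sizes, which is why dedicated kNN join algorithms are studied for high-dimensional data.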