World Wide Web

World Wide Web: Internet and Web Information Systems (WWW) is an international, archival, peer-reviewed journal that covers all aspects of the Web, including ...

List of Papers (Total 176)

Determining modified versions of social media images

Social media platforms usually contain several modified versions of an image. This proliferation of versions calls the trustworthiness of social media images into question. We propose a novel framework to find modified versions of social media images using only their metadata. We consider several aspects to determine if an image is a modified version of another image. These aspects include topic of...

Causal integration in graph neural networks toward enhanced classification: benchmarking and advancements for robust performance

The expansion of Graph Neural Networks (GNNs) has highlighted the importance of evaluating their performance in real-world scenarios. However, existing evaluation frameworks often overlook the integration of causality, a critical component that is essential for more robust evaluation of GNNs. To address this gap, we present a benchmark study that systematically compares standard...

JAL: an algebra for JSON query optimization

As databases become larger and less structured, the JavaScript Object Notation (JSON) data format has risen in usage compared to other data formats like XML. At the same time, while extracting data from these large datasets efficiently is of obvious importance, there has been far less research regarding the optimization of JSON queries than there has relating to the querying of...

Integration and innovation of blockchain in Web3.0: current status and standardization prospects

In the Web3.0 era, which does not rely on any centralized organization and emphasizes user control, security, trustworthiness, and data privacy, blockchain plays a key role. Its decentralization, security and trustworthiness and other characteristics have become Building the infrastructure of trusted interconnection and value interconnection in the Web3.0 era has...

Use of prompt-based learning for code-mixed and code-switched text classification

Code-mixing and code-switching (CMCS) are prevalent phenomena observed in social media conversations and various other modes of communication. When developing applications such as sentiment analysers and hate-speech detectors that operate on this social media data, CMCS text poses challenges. Recent studies have demonstrated that prompt-based learning of pre-trained language...

FSSDroid: Feature subset selection for Android malware detection

Android malware has become an increasingly important threat to individuals, organizations, and society, posing significant risks to data security, privacy, and infrastructure. As malware evolves in sophistication and complexity, the detection and mitigation of these malicious software instances have become more challenging and time-consuming since the required number of features...

Hierarchical adaptive evolution framework for privacy-preserving data publishing

The growing need for data publication and the escalating concerns regarding data privacy have led to a surge in interest in Privacy-Preserving Data Publishing (PPDP) across research, industry, and government sectors. Despite its significance, PPDP remains a challenging NP-hard problem, particularly when dealing with complex datasets, often rendering traditional traversal search...

When large language models meet personalization: perspectives of challenges and opportunities

The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training and model parameters, the capability of large language models has been dramatically improved, leading to human-like performance in understanding, language synthesis, common-sense reasoning, etc. Such a major leap forward in general AI...

Using knowledge graphs for audio retrieval: a case study on copyright infringement detection

Identifying cases of intellectual property violation in multimedia files poses significant challenges for the Internet infrastructure, especially when dealing with extensive document collections. Typically, techniques used to tackle such issues can be categorized into one of two groups: proactive and reactive approaches. This article introduces an approach combining both...

Cloud storage cost: a taxonomy and survey

Cloud service providers offer application providers virtually infinite storage and computing resources, while providing cost-efficiency and various other quality of service (QoS) properties through a storage-as-a-service (StaaS) approach. Organizations also use multi-cloud or hybrid solutions by combining multiple public and/or private cloud service providers to avoid vendor...

A heterogeneous graph-based semi-supervised learning framework for access control decision-making

For modern information systems, robust access control mechanisms are vital in safeguarding data integrity and ensuring the entire system’s security. This paper proposes a novel semi-supervised learning framework that leverages heterogeneous graph neural network-based embedding to encapsulate both the intricate relationships within the organizational structure and interactions...

The medium is the message: toxicity declines in structured vs unstructured online deliberations

Humanity needs to deliberate effectively at scale about highly complex and contentious problems. Current online deliberation tools (such as email, chatrooms, and forums) are, however, plagued by levels of discussion toxicity that deeply undercut the willingness and ability of the participants to engage in thoughtful, meaningful deliberations. This has led many organizations to...

OntoMedRec: Logically-pretrained model-agnostic ontology encoders for medication recommendation

Recommending medications with electronic health records (EHRs) is a challenging task for data-driven clinical decision support systems. Most existing models learn representations for medical concepts based on EHRs and make recommendations with the learnt representations. However, most medications appear in EHR datasets only a limited number of times (the frequency distribution of medications...

Efficient processing of coverage centrality queries on road networks

Coverage centrality is an important metric for evaluating vertex importance in road networks. However, current solutions have to compute the coverage centrality of all the vertices together, which wastes resources, especially when the centrality of only some vertices is required. In addition, they adapt poorly to dynamic scenarios because of their computational inefficiency. In...

Adaptive retrofitting for industrial machines: utilizing WebAssembly and peer-to-peer connectivity on the edge

Leveraging previously untapped data sources offers significant potential for value creation in the manufacturing sector. However, asset-heavy shop floors, extended machine replacement cycles, and equipment diversity necessitate considerable investments for achieving smart manufacturing, which can be particularly challenging for small businesses. Retrofitting presents a viable...

Privacy-preserving data publishing: an information-driven distributed genetic algorithm

The privacy-preserving data publishing (PPDP) problem has gained substantial attention from research communities, industries, and governments due to the increasing requirements for data publishing and concerns about data privacy. However, achieving a balance between preserving privacy and maintaining data quality remains a challenging task in PPDP. This paper presents an...

Enhancing Bitcoin transaction confirmation prediction: a hybrid model combining neural networks and XGBoost

With Bitcoin universally recognized as the most popular cryptocurrency, more transactions are expected to be submitted to the Bitcoin blockchain system. As a result, many transactions can encounter different confirmation delays. It is therefore vital to help a user understand (if possible) how long it may take for a transaction to be confirmed in...

Entity alignment via graph neural networks: a component-level study

Entity alignment plays an essential role in the integration of knowledge graphs (KGs) as it seeks to identify entities that refer to the same real-world objects across different KGs. Recent research has primarily centred on embedding-based approaches. Among these approaches, there is a growing interest in graph neural networks (GNNs) due to their ability to capture complex...

Death comes but why: A multi-task memory-fused prediction for accurate and explainable illness severity in ICUs

Predicting the severity of an illness is crucial in intensive care units (ICUs) if a patient’s life is to be saved. The existing prediction methods often fail to provide sufficient evidence for time-critical decisions required in dynamic and changing ICU environments. In this research, a new method called MM-RNN (multi-task memory-fused recurrent neural network) was developed to...

KC-GEE: knowledge-based conditioning for generative event extraction

Event extraction is an important but challenging task. Many existing techniques decompose it into event and argument detection/classification subtasks, which are complex structured prediction problems. Generation-based extraction techniques lessen the complexity of the problem formulation and are able to leverage the reasoning capabilities of large pretrained language models...

FPGN: follower prediction framework for infectious disease prevention

In recent years, preventing the widespread transmission of infectious diseases in communities has been a research hot spot. Tracing close contacts of infected individuals is one of the most pressing problems. In this work, we present a model called Follower Prediction Graph Network (FPGN) to identify high-risk visitors, which is known as follower prediction. The model is...

Efficient continuous kNN join over dynamic high-dimensional data

Given a user dataset $U$ and an object dataset $I$, a kNN join query in high-dimensional space returns the $k$ nearest neighbors of each object in dataset $U$ from the object dataset $I$. The kNN join is a basic and necessary operation in many applications, such as databases, data mining, computer vision, multi-media...
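
The kNN join definition in the abstract above can be illustrated with a minimal brute-force sketch. This is not the paper's method, which targets continuous queries over dynamic data. The function name `knn_join`, the use of Euclidean distance, and the sample points are assumptions made for illustration only.

```python
import heapq
import math


def knn_join(users, items, k):
    """Brute-force kNN join (illustrative only, not the paper's algorithm).

    For each point in `users`, return the indices of its k nearest
    points in `items`, closest first. Euclidean distance is assumed.
    """
    result = []
    for u in users:
        # nsmallest keeps only the k closest item indices for this user,
        # ordered by increasing distance.
        nearest = heapq.nsmallest(
            k, range(len(items)), key=lambda j: math.dist(u, items[j])
        )
        result.append(nearest)
    return result


# Hypothetical toy data: two user points and four object points in 2D.
users = [(0.0, 0.0), (5.0, 5.0)]
items = [(1.0, 0.0), (4.0, 4.0), (0.0, 2.0), (6.0, 5.0)]
joined = knn_join(users, items, 2)
print(joined)
```

This baseline compares every user point with every object point, so its cost grows with the product of the two dataset sizes. That cost is the usual motivation for indexed or incremental approaches when the data is high-dimensional or changes over time.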