Medical image segmentation has attracted increasing attention due to its practical clinical requirements. However, the prevalence of small targets still poses great challenges for accurate segmentation. In this paper, we propose a novel locally enhanced transformer network (LET-Net) that combines the strengths of transformer and convolution to address this issue. LET-Net utilizes...
The video services that account for the majority of global network traffic consume significant amounts of electricity and network resources to meet the large-scale demand of users. Variations in user interest and social influence lead to high maintenance costs for achieving a dynamic balance between supply and demand, which negatively impacts the sustainable development of video...
Unsupervised 2D image-based 3D model retrieval aims to retrieve 3D models from a gallery given 2D query images. Despite the encouraging progress made in this task, two significant limitations remain: (1) Feature alignment between 2D images and the 3D model gallery is still difficult due to the large gap between the two modalities. (2) The important view information...
Point-level temporal action localization (PTAL) aims to locate action instances in untrimmed videos with only one timestamp annotation for each action instance. Existing methods adopt the localization-by-classification paradigm, locating action boundaries in the temporal class activation map (TCAM) by thresholding; these are known as TCAM-based methods. However, TCAM-based methods are...
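To illustrate the thresholding step mentioned in this abstract, here is a minimal sketch. It assumes the TCAM for one action class is a one-dimensional array of per-snippet scores. The function name `tcam_segments` and the default threshold of 0.5 are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def tcam_segments(scores, threshold=0.5):
    """Return (start, end) index pairs, end exclusive, for each run of
    consecutive snippets whose score exceeds the threshold.

    Hypothetical helper: a generic thresholding sketch, not the
    procedure of any specific TCAM-based method.
    """
    scores = np.asarray(scores, dtype=float)
    active = scores > threshold
    # Pad with False on both sides so runs touching the edges are closed.
    padded = np.concatenate(([False], active, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)   # False -> True transitions
    ends = np.flatnonzero(change == -1)    # True -> False transitions
    return [(int(s), int(e)) for s, e in zip(starts, ends)]

example = tcam_segments([0.1, 0.8, 0.9, 0.2, 0.7, 0.6, 0.1])
```

Each returned pair marks a candidate action segment. The result depends directly on the chosen threshold, which makes the boundaries sensitive to that single value.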
As a critical component of ensuring the safe and stable operation of trains, the detection of bird’s nests on the rail catenary has always been essential. Low-resolution images and the lack of labelled data, however, make it difficult to detect smaller bird’s nests (those occupying few pixels in the input image). Previous solutions rely on manual online patrol or offline video...
Humor can be induced by various signals in the visual, linguistic, and vocal modalities emitted by humans. Finding humor in videos is an interesting but challenging task for an intelligent system. Previous methods predict humor at the sentence level given some text (e.g., speech transcript), sometimes together with other modalities, such as videos and speech. Such methods ignore...
Medical graduates lack the procedural skills experience required to manage emergencies. Recent advances in virtual reality (VR) technology enable the creation of highly immersive learning environments representing easy-to-use and affordable solutions for training with simulation. However, the feasibility in compulsory teaching, possible side effects of immersion, perceived stress...
Coronavirus disease 2019 (COVID-19), initially named 2019-nCoV, was declared a global pandemic by the World Health Organization in March 2020. Because of the growing number of COVID patients, the world’s health infrastructure has collapsed, and computer-aided diagnosis has become a necessity. Most of the models proposed for COVID-19 detection in chest X-rays do image...
The infectious disease COVID-19 continues to have a catastrophic effect on the lives of human beings all over the world. To combat this deadly disease, it is essential to screen the affected people quickly and inexpensively. Radiological examination is considered the most feasible step toward attaining this objective; however, chest X-ray (CXR) and computed...
With the introduction of concepts for virtual interaction and digital doubles, a rich scenario has been created for embodied avatars to thrive. These avatars, more recently referred to as digital humans, have become a popular area of research, resulting in various techniques and methods that focus on improving the perception of their realism, fidelity, empathic response, and...
The World Health Organization (WHO) declared a pandemic in response to the coronavirus COVID-19 in 2020, which resulted in numerous deaths worldwide. Although the disease appears to have lost its impact, millions of people have been affected by this virus, and new infections still occur. Identifying COVID-19 requires a reverse transcription-polymerase chain reaction test (RT-PCR...
Advances in human face recognition (FR) systems have achieved remarkable success in automatic and secure authentication across diverse domains. Although traditional methods have been overshadowed by their modern face recognition counterparts during this progress, computer vision has gained rapid traction, and modern accomplishments address problems of real-world complexity. However...
With the latest developments in deep neural networks, the convolutional neural network (CNN) has made considerable progress in the area of foreground detection. However, the top-ranked background subtraction algorithms for foreground detection still have many shortcomings. It is challenging to extract the true foreground against a complex background. To tackle this bottleneck, we...
The pandemic caused by SARS-CoV-2, which originated in 2019, continues to wreak serious havoc on the global population’s health, economy, and livelihood. A critical way to suppress and restrain this pandemic is the early detection of COVID-19, which will help to control the virus. Chest X-rays are one of the more straightforward ways to detect the COVID-19 virus compared to the...
Unlike deep learning, which requires large training datasets, correlation filter-based trackers like the Kernelized Correlation Filter (KCF) use implicit properties of tracked images (circulant structure) for training in real time. Despite its popularity in tracking applications, the tracker has significant drawbacks in cases like occlusions and out-of-view scenarios...
With the innovation and development of advanced video editing technology and the widespread use of video information and services in our society, it is increasingly necessary to maintain the reliability of video information. As a result, sensitive video contents in various fields such as surveillance, medical, and others should be secured against attempts to alter them because...
The growth in the use of social media platforms such as Facebook and Twitter over the past decade has significantly facilitated and improved the way people communicate with each other. However, the information that is available and shared online is not always credible. These platforms provide a fertile ground for the rapid propagation of breaking news along with other misleading...
Deep learning has demonstrated remarkable performance in the medical domain, with accuracy that rivals or even exceeds that of human experts. However, a significant problem is that these models are “black-box” structures, which means they are opaque, non-intuitive, and difficult for people to understand. This creates a barrier to the application of deep learning models in...