Spatial transcriptomics has emerged as a powerful tool in biomedical research because of its ability to capture both the spatial context and the abundance of the complete RNA transcript profile in organs of interest. However, limitations of the technology, such as its relatively low resolution and comparatively insufficient sequencing depth, make it difficult to reliably extract real...
Identifying potential associations among food, gut microbiota and disease is fundamental for elucidating interaction mechanisms and advancing personalized healthy dietary strategies. While computational methods have been extensively applied to predict microbiota–disease associations, methods for predicting food–microbiota relationships remain limited, particularly regarding higher...
In bottom-up proteomics using data-independent acquisition mass spectrometry (DIA-MS), quantitative measurements are obtained following multiple steps of protein fragmentation and ionization, which introduces cumulative errors and impairs the effectiveness of classical statistical methods. This study proposes an alternative statistical approach for testing group mean differences...
Small nucleolar RNAs (snoRNAs), a class of non-coding RNAs broadly distributed in eukaryotes, are emerging as pivotal regulators in the field of epigenomics. In addition to guiding 2’-O-methylation and pseudouridylation modifications at specific rRNA sites to maintain ribosomal stability and support protein synthesis, snoRNAs have been increasingly implicated in epigenetic...
Multimodal learning for classification tasks has recently gained significant attention in bioinformatics. Current approaches primarily concentrate on devising efficient deep learning architectures to capture features within and across modalities. However, they typically assume that each modality contributes equally to the classification objective, overlooking inherent biases...
Understanding how the molecules in our body respond to the co-occurrence of two diseases in an individual (comorbidity) could lead to mechanistic insights into novel treatments for comorbid conditions. Studies have shown, for instance, that responses of our immune system to comorbid conditions could be more complex than the union of immune responses to each disease occurring...
Drug–drug interactions (DDIs) frequently occur in combination therapy and may cause adverse effects or reduced efficacy. Existing computational approaches often fail to capture both the semantic information in drug sequences and the structural properties of drug molecules, limiting predictive power. We propose MDG-DDI, a deep learning framework that integrates a Frequent...
Sentence-transformers is a library that provides easy methods for generating embeddings for sentences, paragraphs, and images. Sentiment analysis, retrieval, and clustering are among the applications made possible by the embedding of texts in a vector space where similar texts are located close to one another. This study fine-tunes a sentence transformer model designed for...
In several contexts involving large collections of sets of biological sequences, a relevant problem is that of selecting significant groups of k-mers that characterize one set with regard to the others in the same collection. Here, a software framework is proposed that implements a novel methodology for the extraction of k-mer dictionaries from multiple sets of biological sequences...
Vasculogenic mimicry (VM) is the phenomenon whereby non-vascular tumor cells develop vascular-like structures. VM is linked to more aggressive tumor phenotypes including higher rates of metastasis and invasion and is potentially resistant to anti-angiogenic cancer therapies. VM is investigated in vitro using 3D assays with microscopy images capturing the resulting VM structures...
Understanding the interplay between diseases and genes is crucial for gaining deeper insights into disease mechanisms and optimizing therapeutic strategies. In recent years, various computational methods have been developed to uncover potential disease-gene associations. However, existing computational approaches for disease-gene association prediction still face two major...
Virtual Screening (VS) has become an essential tool in drug discovery, enabling the rapid and cost-effective identification of potential bioactive molecules. Among recent advancements, Graph Neural Networks (GNNs) have gained prominence for their ability to model complex molecular structures using graph-based representations. However, the integration of explainable methods to...
An important challenge in flow cytometry (FCM) data analysis is comparing corresponding cell populations across multiple FCM samples. An interesting solution is to create a statistical mixture model for multiple samples simultaneously, as such a multi-sample model can characterize a heterogeneous set of samples and facilitates direct comparison of cell populations...
In this paper, we introduce an image analysis approach for spatiotemporal segmentation, quantification, and visualization of movement or contraction patterns in 2D+t and 3D+t microscopy recordings of biological tissues. The development of this pipeline was motivated by the observation of contraction waves in the extra-embryonic membranes of the red flour beetle Tribolium...
Vasculature is an essential part of all tissues and organs and is involved in a wide range of different diseases. However, available software for blood vessel image analysis is often limited: Some only process two-dimensional data, others lack batch processing, putting a time burden on the user, while still others require tightly defined culturing methods and experimental...
The annotation of protein functions constitutes a key connection between genetic sequences, molecular conformations, and biochemical roles, driving progress in biomedical studies. Traditional experimental methods are time-consuming and resource-intensive, making it difficult to meet the demand for functional annotation of a vast number of proteins in the post-genomic era. The...
Identifying Non-Alcoholic Steatohepatitis (NASH), which can cause morbidity through liver failure, remains a challenging research problem because there is not yet a confirmed and effective approach for its early and accurate diagnosis. A large amount of medical data is collected to diagnose NASH, most of which is redundant. This paper initially focuses on selecting the...
Allele-specific expression (ASE) analyses from RNA-Seq data provide quantitative insights into genomic imprinting and the genetic variants that affect transcription. Robust ASE analysis requires the integration of multiple computational steps, including read alignment, read counting, data visualization, and statistical testing. This complexity creates challenges for...
Multimodal visualizations are essential for identifying and interpreting complex relationships in diverse, high-dimensional biological datasets. However, existing visualization tools often lack native capabilities for embedding explicit statistical and computational annotations, hindering effective quantitative interpretation. We introduce MultiModalGraphics, an R package...
The emergent dynamics of complex gene regulatory networks govern various cellular processes. However, understanding these dynamics is challenging due to the difficulty of parameterizing the computational models for these networks, especially as the network size increases. Here, we introduce a simulation library, Gene Regulatory Interaction Network Simulator (GRiNS), to address...
There is a need for computational approaches to compare small organic molecules based on chemical similarity or to evaluate biochemical transformations. No tool currently exists to generate global molecular alignments for small organic molecules. This study introduces a new approach to molecular alignment in the Simplified Molecular Input Line Entry System (SMILES) format. This...
Pattern matching is a fundamental challenge in bioinformatics, especially in the fields of genomics, transcriptomics and proteomics. Efficient indexing structures, such as suffix arrays, are critical for searching large datasets. A sparse suffix array (SSA) retains only suffixes at every k-th position in the text, where k is the sparseness factor. While sparse suffix arrays offer...
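The sparse suffix array definition in the abstract above can be illustrated with a minimal sketch. It is not the method of the paper: the function names (`build_sparse_suffix_array`, `find_occurrences`) are hypothetical, construction is done by naive sorting, and the search uses the simple textbook strategy of looking up each of the k pattern suffixes and verifying the skipped prefix against the text. That strategy assumes the pattern is at least k characters long.

```python
def build_sparse_suffix_array(text: str, k: int) -> list[int]:
    """Return the sampled positions 0, k, 2k, ... ordered by their suffixes."""
    if k < 1:
        raise ValueError("sparseness factor k must be >= 1")
    # Naive construction (illustration only): sort sampled positions
    # by the suffix of the text that starts at each position.
    return sorted(range(0, len(text), k), key=lambda i: text[i:])


def _lower_bound(text: str, ssa: list[int], sub: str) -> int:
    """Index of the first sampled suffix whose len(sub)-prefix is >= sub."""
    lo, hi = 0, len(ssa)
    m = len(sub)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[ssa[mid]:ssa[mid] + m] < sub:
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_occurrences(text: str, ssa: list[int], k: int, pattern: str) -> list[int]:
    """Return all start positions of pattern in text using the sparse array."""
    if len(pattern) < k:
        # Assumption of this sketch: shorter patterns are not supported.
        raise ValueError("pattern must be at least k characters long")
    hits = set()
    for j in range(k):
        # An occurrence may begin up to k - 1 characters before a sampled
        # position, so search for the pattern with its first j characters cut.
        sub = pattern[j:]
        i = _lower_bound(text, ssa, sub)
        while i < len(ssa) and text[ssa[i]:ssa[i] + len(sub)] == sub:
            start = ssa[i] - j
            # Verify the j skipped characters directly against the text.
            if start >= 0 and text.startswith(pattern, start):
                hits.add(start)
            i += 1
    return sorted(hits)


if __name__ == "__main__":
    text = "banana"
    ssa = build_sparse_suffix_array(text, 2)
    print(ssa)                                    # [0, 4, 2]
    print(find_occurrences(text, ssa, 2, "ana"))  # [1, 3]
```

With k = 2 the array stores three positions instead of six. The saving in memory is paid for at query time, since each pattern needs up to k binary searches plus a verification step.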
The barriers to effective data analysis are sometimes insurmountable. Concerns ranging from privacy and security to complexity can prevent researchers from using existing data analysis tools. JINet is a web browser-based platform intended to democratise access to advanced clinical and genomic data analysis software. It hosts numerous data analysis applications that are run in the...
The integration and analysis of multi-modal data are increasingly essential across various domains including bioinformatics. As the volume and complexity of such data grow, there is a pressing need for computational models that not only integrate diverse modalities but also leverage their complementary information to improve clustering accuracy and insights, especially when...
The identification of protein-protein interaction (PPI) plays a crucial role in understanding the mechanisms of complex biological processes. Current research in predicting PPI has shown remarkable progress by integrating protein information with PPI topology structure. Nevertheless, these approaches frequently overlook the dynamic nature of protein and PPI structures during...