HONeYBEE: enabling scalable multimodal AI in oncology through foundation model-driven embeddings

npj Digital Medicine, Oct 2025

Harmonized ONcologY Biomedical Embedding Encoder (HONeYBEE) is an open-source framework that integrates multimodal biomedical data for oncology applications. It processes clinical data (structured and unstructured), whole-slide images, radiology scans, and molecular profiles to generate unified patient-level embeddings using domain-specific foundation models and fusion strategies. These embeddings enable survival prediction, cancer-type classification, patient similarity retrieval, and cohort clustering. Evaluated on 11,400+ patients across 33 cancer types from The Cancer Genome Atlas (TCGA), clinical embeddings showed the strongest single-modality performance with 98.5% classification accuracy and 96.4% precision@10 in patient retrieval. They also achieved the highest survival prediction concordance indices across most cancer types. Multimodal fusion provided complementary benefits for specific cancers, improving overall survival prediction beyond clinical features alone. Comparative evaluation of four large language models revealed that general-purpose models like Qwen3 outperformed specialized medical models for clinical text representation, though task-specific fine-tuning improved performance on heterogeneous data such as pathology reports.

npj Digital Medicine | Article | (2025) 8:622
Published in partnership with Seoul National University Bundang Hospital
https://doi.org/10.1038/s41746-025-02003-4

Aakash Tripathi (1,2,4), Asim Waqas (1,3,4), Matthew B. Schabath (3), Yasin Yilmaz (2) & Ghulam Rasool (1,2)

1. Department of Machine Learning, Moffitt Cancer Center & Research Institute, Tampa, FL, USA.
2. Department of Electrical Engineering, University of South Florida, Tampa, FL, USA.
3. Departments of Cancer Epidemiology, Moffitt Cancer Center & Research Institute, Tampa, FL, USA.
4. These authors contributed equally: Aakash Tripathi, Asim Waqas.
e-mail: aakash.tripathi@moffitt.org

Recent advances in computational oncology have been fueled by the increasing digitization of diverse biomedical data, including structured clinical variables (such as demographics, tumor staging, and laboratory results), unstructured clinical narratives (such as pathology reports, radiology reports, and physician notes), medical imaging (radiology scans and whole-slide images or WSI), and high-dimensional molecular profiles [1–6]. This wealth of multimodal data offers unprecedented opportunities to improve patient stratification, predict treatment response, and model disease progression [2,3,6]. In parallel, the adaptation of deep learning techniques from computer vision and natural language processing has enabled powerful solutions in these domains [3,7]. However, a fundamental challenge remains: the absence of robust, generalizable methods for integrating these heterogeneous data sources into unified representations that capture the biological complexity of cancer and support predictive modeling [8].

Although large-scale biomedical data is increasingly available and actively analyzed in oncology, it remains fragmented across distinct modalities, such as clinical data (structured variables and unstructured narratives), radiological and pathological imaging, and molecular profiles, which are typically processed separately. This siloed approach limits the ability to integrate complementary information across modalities for unified, patient-centered analysis [8,9].

Availability of large-scale datasets and advances in self-supervised learning have enabled the development of foundation models (FMs) [1,10,11]. These models, pretrained on text, imaging, or molecular data, have advanced feature extraction within individual modalities by learning latent representations that capture domain-specific patterns. These modality-specific embeddings can be adapted for downstream oncology tasks such as cancer classification or overall survival (OS) prediction. However, in practice, these models are typically applied within single- or dual-modality workflows, leaving the complementary information across modalities underutilized [12]. While multimodal data availability continues to expand in oncology, a critical bottleneck remains: the absence of standardized, scalable frameworks that integrate modality-specific embeddings into unified, patient-level representations that capture multimodal patient similarity and support downstream oncology tasks.

We hypothesize that integrating FM-derived embeddings from multiple data modalities can yield richer and more clinically informative patient representations, particularly in settings where clinical data are incomplete or less structured. Rather than relying solely on model scaling or increasing parameter counts, we propose that fusing complementary information from diverse biomedical data types offers a powerful, orthogonal approach to enhance predictive performance in oncology.

To test this hypothesis, we present HONeYBEE, or Harmonized ONcologY Biomedical Embedding Encoder (https://lab-rasool.github.io/HoneyBee/). HONeYBEE is an open-source framework that generates individual patient-level embeddings from (i) structured and unstructured clinical data, (ii) pathology reports, (iii) radiologic images, (iv) WSIs, and (v) molecular profiles using modality-specific FMs.
HONeYBEE integrates these embeddings via concatenation, mean pooling, and Kronecker product fusion strategies to create unified, multimodal representations optimized for downstream oncology tasks, including cancer subtype classification, patient clustering, OS prediction, and patient similarity retrieval.

While numerous models and pipelines exist for analyzing clinical, imaging, and molecular data, most current tools remain modality-specific and lack the flexibility to support unified, end-to-end multimodal workflows [1,13]. Existing methods are typically implemented as isolated codebases with rigid dependencies, domain-specific interfaces, and limited extensibility, which complicates reproducibility and impedes multimodal experimentation [14]. Moreover, the absence of standardized pipelines for modality-specific embedding generation, harmonization, and flexible fusion introduces substantial technical barriers, slowing the development of clinically meaningful AI models [15]. Addressing these limitations requires not only access to multimodal data but also modular infrastructure capable of generating, integrating, and utilizing diverse patient-level embeddings in scalable, reproducible ways.

HONeYBEE directly addresses this gap by providing a modular, open-source framework for multimodal embedding generation and integration. Built around domain-specific FMs, HONeYBEE supports the standardized preprocessing and representation of five key oncology data modalities [8,16–20]. Each modality is processed through dedicated pipelines, producing modality-specific embe (...truncated)
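The three fusion strategies named in this section (concatenation, mean pooling, and Kronecker product) can be illustrated with a minimal sketch. This is not HONeYBEE's actual API: the function names, the dictionary layout, and the two-dimensional toy vectors are all assumptions made for illustration, and the sketch assumes that mean pooling is only applied when every modality embedding has the same length.

```python
from functools import reduce
from typing import Dict, List

Vector = List[float]


def fuse_concat(embeddings: Dict[str, Vector]) -> Vector:
    """Concatenate modality embeddings end to end (dimensions add up)."""
    fused: Vector = []
    for name in sorted(embeddings):  # fixed modality order keeps the output stable
        fused.extend(embeddings[name])
    return fused


def fuse_mean(embeddings: Dict[str, Vector]) -> Vector:
    """Element-wise mean across modalities (assumes equal-length embeddings)."""
    vectors = [embeddings[name] for name in sorted(embeddings)]
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("mean pooling needs equal-length embeddings")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kron(u: Vector, v: Vector) -> Vector:
    """Kronecker product of two vectors: every pairwise product u[i] * v[j]."""
    return [a * b for a in u for b in v]


def fuse_kronecker(embeddings: Dict[str, Vector]) -> Vector:
    """Chain the Kronecker product over all modalities (dimensions multiply)."""
    vectors = [embeddings[name] for name in sorted(embeddings)]
    return reduce(kron, vectors)


# Hypothetical toy patient with two tiny modality embeddings.
patient = {"clinical": [1.0, 2.0], "pathology": [3.0, 5.0]}

concat = fuse_concat(patient)        # length 2 + 2 = 4
mean = fuse_mean(patient)            # length 2
kronecker = fuse_kronecker(patient)  # length 2 * 2 = 4
```

The trade-off the sketch makes visible is dimensionality: concatenation grows additively with the number of modalities, mean pooling keeps a fixed size but requires a shared embedding dimension, and the Kronecker product captures pairwise interactions at the cost of multiplicative growth.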


This is a preview of a remote PDF: https://www.nature.com/articles/s41746-025-02003-4.pdf
Article home page: https://www.nature.com/articles/s41746-025-02003-4

Tripathi, Aakash, Waqas, Asim, Schabath, Matthew B., Yilmaz, Yasin, Rasool, Ghulam. HONeYBEE: enabling scalable multimodal AI in oncology through foundation model-driven embeddings, npj Digital Medicine, 2025, DOI: 10.1038/s41746-025-02003-4