Multicentric study on the reproducibility and robustness of PET-based radiomics features with a realistic activity painting phantom
PLOS ONE
RESEARCH ARTICLE
Piroska Kallos-Balogh1,2*, Norman Felix Vas1, Zoltan Toth3, Szabolcs Szakall4,
Peter Szabo5, Ildiko Garai1,5, Zita Kepes1, Attila Forgacs6, Lilla Szatmáriné Egeresi7,
Dahlbom Magnus8, Laszlo Balkay1,2
1 Division of Nuclear Medicine and Translational Imaging, Department of Medical Imaging, Faculty of
Medicine, University of Debrecen, Debrecen, Hungary, 2 Doctoral School of Molecular Medicine, Faculty of
Medicine, University of Debrecen, Debrecen, Hungary, 3 Medicopus Healthcare Provider and Public
Nonprofit Ltd., Somogy County Moritz Kaposi Teaching Hospital, Kaposvár, Hungary, 4 Pozitron-Diagnostics
Ltd., Budapest, Hungary, 5 Scanomed Ltd., Debrecen, Debrecen, Hungary, 6 Mediso Medical Imaging
Systems, Budapest, Hungary, 7 Division of Radiology and Imaging Science, Department of Medical Imaging,
Faculty of Medicine, University of Debrecen, Debrecen, Hungary, 8 Ahmanson Translational Theranostics
Division, Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, UCLA, Los
Angeles, California, United States of America
*
OPEN ACCESS
Citation: Kallos-Balogh P, Vas NF, Toth Z, Szakall
S, Szabo P, Garai I, et al. (2024) Multicentric study
on the reproducibility and robustness of PET-based
radiomics features with a realistic activity painting
phantom. PLoS ONE 19(10): e0309540. https://doi.org/10.1371/journal.pone.0309540
Editor: Fei Yang, University of Miami, UNITED
STATES OF AMERICA
Received: March 6, 2024
Accepted: August 13, 2024
Published: October 24, 2024
Peer Review History: PLOS recognizes the
benefits of transparency in the peer review
process; therefore, we enable the publication of
all of the content of peer review and author
responses alongside final, published articles. The
editorial history of this article is available here:
https://doi.org/10.1371/journal.pone.0309540
Copyright: © 2024 Kallos-Balogh et al. This is an
open access article distributed under the terms of
the Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: All relevant data are
within the manuscript and its Supporting
information files.
Abstract
Previously, we developed an "activity painting" tool for PET image simulation; however, it
could simulate heterogeneous patterns only in air. We aimed to improve this phantom
technique to simulate arbitrary lesions in a radioactive background and thereby enable relevant multicenter radiomic analysis. We conducted measurements by moving a 22Na point source in a 20-liter background volume filled with 5 kBq/mL activity, using a robotic
system controlled adequately to prevent surging of the water. Three different lesion patterns were "activity-painted" in five PET/CT cameras, resulting in 8 different reconstructions. We calculated 46
radiomic indices (RI) for each lesion and imaging setting, applying absolute and relative discretization. Reproducibility and reliability were determined by the inter-setting coefficient of
variation (CV) and the intraclass correlation coefficient (ICC). Hypothesis tests were used to
compare RI between lesions. By simulating precisely the same lesions, we confirmed that
the reconstructed voxel size and the spatial resolution of different PET cameras were critical
for higher-order RIs. Among conventional RIs, SUVpeak and SUVmean proved the
most reliable (CV < 10%). CVs above 25% were more common for higher-order RIs, but we
also found that a low CV does not necessarily indicate a robust parameter; it often indicates an insensitive RI. Based on the hypothesis tests, most RIs could clearly distinguish between the various lesions when absolute resampling was used. ICC analysis also revealed that most RIs were more
reproducible with absolute discretization. The activity painting method in a real radioactive
environment proved suitable for precisely detecting the radiomic differences derived from
the different camera settings and texture characteristics. We also found that inter-setting CV
is not an appropriate metric for analyzing RI parameters’ reliability and robustness. Although
multicentric cohorts are increasingly common in radiomics analysis, realistic texture
phantoms can provide indispensable information on the sensitivity of an RI and how an individual RI parameter measures the texture.
Funding: The author(s) received no specific
funding for this work.
Competing interests: The authors have declared
that no competing interests exist.
Abbreviations: CV, Coefficient of Variation; FOV,
Field of View; GLCM, grey-level co-occurrence
matrix; GLRLM, grey-level run length matrix;
GLZLM, grey-level zone length matrix; ICC,
Intraclass Correlation Coefficient; NGLDM,
neighborhood grey-level difference matrix; RI,
Radiomics Indices; TLG, Total Lesion Glycolysis;
VOI, Volume of Interest.
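As a minimal sketch of two quantities the abstract relies on, the following Python shows one common way to compute an inter-setting CV for a single RI and to apply absolute (fixed bin width) versus relative (fixed bin number) discretization to the voxel values of a VOI. The function names, the 0.25 SUV bin width, and the 64-bin count are illustrative assumptions, not settings taken from this study.

```python
import numpy as np


def inter_setting_cv(values):
    """CV (%) of one radiomic index measured under several imaging settings.

    Uses the sample standard deviation (ddof=1) divided by the absolute mean.
    """
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / abs(values.mean())


def discretize_absolute(suv, bin_width=0.25, lower=0.0):
    """Absolute discretization: fixed bin width in SUV units (assumed 0.25)."""
    suv = np.asarray(suv, dtype=float)
    return np.floor((suv - lower) / bin_width).astype(int) + 1


def discretize_relative(suv, n_bins=64):
    """Relative discretization: fixed number of bins between VOI min and max."""
    suv = np.asarray(suv, dtype=float)
    lo, hi = suv.min(), suv.max()
    if hi == lo:
        # A perfectly uniform VOI falls into a single bin.
        return np.ones(suv.shape, dtype=int)
    bins = np.floor(n_bins * (suv - lo) / (hi - lo)).astype(int) + 1
    # The maximum voxel would land in bin n_bins + 1; clip it into the top bin.
    return np.minimum(bins, n_bins)


# Hypothetical values of one RI for the same lesion under three settings.
print(inter_setting_cv([9.0, 10.0, 11.0]))
```

With absolute discretization, a given SUV maps to the same bin in every image. With relative discretization, the bin edges depend on each VOI's own minimum and maximum, so the same SUV can fall into different bins under different camera settings.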
Background
Combined positron emission tomography (PET) and computed tomography (CT) is a valuable
means of quantitatively measuring the functional state of the human body in vivo, and this
hybrid modality therefore also supports comparative patient studies.
More recently, intensive focus has been placed on investigating quantitative pattern parameters to extract latent information from PET images. This endeavor, denoted as radiomics, is
gaining increasing attention in medical imaging fields, including nuclear medicine [1–5].
Despite several promising results, radiomics has yet to achieve a breakthrough in usefulness, as several previous publications have raised fundamental doubts,
mainly regarding methodology [2, 3, 6–12]. Additionally, published data indicate that the
reproducibility and robustness of radiomic indices remain unsatisfactory [13–17].
The diagnostically relevant radiomic features identified in individual clinical trials usually do not coincide, even for the same disease [17]. To assess and improve research quality
in the field of radiomics and machine learning, the “checklist for the evaluation of radiomics
research” (CLEAR) guideline and the “methodological radiomics score” (METRICS) quality
scoring tool have recently been published with the support of the European Society of Medical
Imaging Informatics (EuSoMII) [18, 19]. CLEAR aims to set a standard that
improves the quality, and subsequently the reproducibility, of radiomics research reporting
[18]. METRICS, in contrast, is a scoring tool comprising nine major categories that is used to evaluate
the methodological quality of radiomics studies [19]. The first four s (...truncated)