The performance of sonographic antenatal birth weight assessment assisted with artificial intelligence compared to that of manual examiners at term

Archives of Gynecology and Obstetrics, Apr 2025

Purpose: The aim of this study is to investigate the differences in the accuracy of sonographic antenatal fetal weight estimation at term with artificial intelligence (AI) compared to that of clinical sonographers at different levels of experience.

Methods: This is a prospective cohort study in which pregnant women at term scheduled for an imminent elective cesarean section were recruited. Three independent antenatal fetal weight estimations for each fetus were obtained in a blinded manner by an experienced resident physician with level I qualification from the German Society for Ultrasound in Medicine (group 1), a senior physician with level II qualification (group 2), and an AI-supported algorithm (group 3), all using Hadlock formula 3. The differences between the estimates of the three groups and the actual birth weight were examined with a paired t-test. An estimate within 10% of the birth weight was deemed accurate, and the diagnostic accuracies of groups 1 and 3 compared to group 2 were assessed using receiver operating characteristic (ROC) curves. The association between accuracy and potential influencing factors, including gestational age, fetal position, maternal age, maternal body mass index (BMI), twins, neonatal gender, placental position, gestational diabetes, and amniotic fluid index, was tested with univariate logistic regression. A sensitivity analysis was conducted by increasing each estimated weight by 25 grams (g) per day for the days between examination and birth.

Results: A total of 300 fetuses at a mean gestational age of 38.7 ± 1.1 weeks were included in this study and examined a median of 2 (2–4) days prior to delivery. Average birth weight was 3264.6 ± 530.7 g, and the mean difference of the sonographic estimated fetal weight compared to birth weight was −203.6 ± 325.4 g, −132.2 ± 294.1 g, and −338.4 ± 606.2 g for groups 1, 2, and 3 respectively. The estimated weight was accurate in 62% (56.2%, 67.5%), 70% (64.5%, 75.1%), and 48.3% (42.6%, 54.1%) for groups 1, 2, and 3 respectively. The diagnostic accuracy measures for groups 1 and 3 compared to group 2 were 55.7% (48.7%, 62.5%) and 68.6% (61.8%, 74.8%) sensitivity, 68.9% (58.3%, 78.2%) and 53.3% (42.5%, 63.9%) specificity, and 0.62 (0.56, 0.68) and 0.61 (0.55, 0.67) area under the ROC curve respectively. There was no association between accuracy and the investigated variables. Applying the sensitivity-analysis adjustment increased the accuracy to 68% (62.4%, 73.2%), 75% (69.7%, 79.8%), and 51.3% (45.5%, 57.1%), and changed the mean difference compared to birth weight to −136.1 ± 321.8 g, −64.7 ± 291.2 g, and −270.7 ± 605.2 g for groups 1, 2, and 3 respectively.

Conclusion: The antenatal weight estimation by experienced specialists with high-level qualifications remains the gold standard and provides the highest precision. Nevertheless, the accuracy of this standard is less than 80% even after adjusting for daily weight gain. The tested AI-supported method exhibits high variability and requires optimization and validation before it can be used reliably in clinical practice.
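
The accuracy criterion, the daily-gain adjustment, and the comparison against a reference examiner described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the boundary case (a deviation of exactly 10%) is assumed to count as accurate, a "positive" is assumed to be an estimate flagged as accurate, and the numbers in the example are invented for illustration and are not study data.

```python
TOLERANCE = 0.10      # "within 10% of birth weight" (from the abstract)
DAILY_GAIN_G = 25.0   # daily gain used in the sensitivity analysis (from the abstract)


def is_accurate(efw_g, birth_weight_g, tolerance=TOLERANCE):
    """True when the estimate lies within `tolerance` of the birth weight.

    Assumption: a deviation of exactly 10% still counts as accurate.
    """
    return abs(efw_g - birth_weight_g) <= tolerance * birth_weight_g


def adjust_for_interval(efw_g, days_to_birth, daily_gain_g=DAILY_GAIN_G):
    """Add the assumed daily gain for each day between scan and birth."""
    return efw_g + daily_gain_g * days_to_birth


def sensitivity_specificity(test_flags, reference_flags):
    """Sensitivity and specificity of `test_flags` against `reference_flags`.

    Assumption: the reference examiner's accurate/inaccurate flag is treated
    as the reference standard. Both classes must occur in the reference,
    otherwise a division by zero is raised.
    """
    pairs = list(zip(test_flags, reference_flags))
    tp = sum(t and r for t, r in pairs)
    fn = sum((not t) and r for t, r in pairs)
    tn = sum((not t) and (not r) for t, r in pairs)
    fp = sum(t and (not r) for t, r in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative values only, not study data.
birth_weights = [3250.0, 3400.0, 2900.0, 3600.0]
reference_efw = [3100.0, 3350.0, 2500.0, 3500.0]  # hypothetical reference examiner
test_efw = [2900.0, 3300.0, 2850.0, 3100.0]       # hypothetical method under test

reference_flags = [is_accurate(e, b) for e, b in zip(reference_efw, birth_weights)]
test_flags = [is_accurate(e, b) for e, b in zip(test_efw, birth_weights)]
sens, spec = sensitivity_specificity(test_flags, reference_flags)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

As a plausibility check on the reported figures, the adjustment shifts the mean difference of groups 1 and 2 by 67.5 g (for example, from −203.6 g to −136.1 g), which at 25 g per day would correspond to an average interval of about 2.7 days between examination and birth.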

Archives of Gynecology and Obstetrics
https://doi.org/10.1007/s00404-025-08042-2

RESEARCH

Alex Horky (1) · Marita Wasenitz (1) · Carlotta Iacovella (1) · Franz Bahlmann (1) · Ammar Al Naimi (1, 2)

Received: 10 March 2025 / Accepted: 22 April 2025
© The Author(s) 2025

Keywords: Artificial intelligence · Sonography · Fetal weight assessment · Sonographer's experience

* Ammar Al Naimi
(1) Department of Obstetrics and Gynecology, Buergerhospital - Dr. Senckenberg Foundation, Nibelungenallee 37-41, 60318 Frankfurt, Hessen, Germany
(2) Department of Obstetrics and Prenatal Medicine, Goethe University, University Hospital of Frankfurt, Hessen, Germany

What does this study add to the clinical work

The tested AI method is not valid for estimating fetal weight at term, and adjustments are required before it replaces the gold standard estimations by experienced specialists with high-level qualifications.

Introduction

Fetal growth reflects the intrauterine health of the fetus and is thus a key predictor of perinatal outcomes. Deviation from physiological growth trajectories, whether as fetal growth restriction (FGR) or as excessive growth resulting in a large-for-gestational-age (LGA) fetus, is associated with increased cardiovascular, metabolic, and perinatal risks [1–4]. Fetal weight is influenced by a multitude of factors, including genetic and anthropometric characteristics, such as familial predisposition and ethnic background [5], as well as maternal conditions such as diabetes or hypertension [6, 7]. Accurate prenatal weight estimation is essential for identifying abnormalities, initiating therapeutic interventions, optimizing birth management, and improving both fetal and maternal outcomes.

Ultrasound represents the gold standard for fetal weight estimation and relies primarily on biometric measurements, such as the head circumference, the abdominal circumference, and the length of the femur. The Hadlock formula combines these three measurements to derive an estimated fetal weight [3, 8]. However, external factors including maternal body mass index (BMI), fetal position, amniotic fluid volume, and the experience level of the examiner can affect the accuracy of these sonographic measurements and subsequently the reliability of the estimated weight [9–11].

The application of artificial intelligence (AI) has gained increasing significance in the medical field [12–15]. AI-based systems have the potential to enhance the objectivity and the precision of sonographic fetal weight estimation, thereby minimizing subjective errors and improving clinical outcomes. We hypothesize that AI-assisted weight estimation is as accurate as the estimation of experienced sonographic experts, and that it could be resistant to external influencing factors. The main aim of this study is to evaluate the accuracy of sonographic antenatal fetal weight estimation with AI compared to sonographers at novice and expert levels of experience. Moreover, we aim to examine the association of potential influencing factors with the accuracy of estimated weights.

Methods

This is a prospective observational cohort study at a tertiary prenatal center in which pregnant women with singletons or twins at term scheduled for an imminent elective cesarean section between 1 May and 31 December 2024 were recruited. Inclusion criteria were age over 18 years, informed consent, and a planned cesarean section within 1–4 days. Women without consent or with onset of labor, rupture of membranes, or known fetal malformations were excluded. Recruitment took place during standardized preoperative consultation dates, and three independent antenatal fetal weight estimations for each fetus were measured by an experienced resident physician with level I qualification from the German Society (...truncated)
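
The introduction notes that the Hadlock formula combines head circumference, abdominal circumference, and femur length into one weight estimate. The sketch below shows a widely cited Hadlock regression for these three measurements (all in centimetres, result in grams). The numbering of Hadlock formulas differs between publications and ultrasound systems, so treating this as the "Hadlock formula 3" used in the study is an assumption; the function name and the example measurements are hypothetical.

```python
def hadlock_hc_ac_fl(hc_cm, ac_cm, fl_cm):
    """Estimated fetal weight in grams from HC, AC, and FL in centimetres.

    Assumed regression (HC/AC/FL variant attributed to Hadlock et al.):
    log10(EFW) = 1.326 - 0.00326*AC*FL + 0.0107*HC + 0.0438*AC + 0.158*FL
    """
    log10_efw = (
        1.326
        - 0.00326 * ac_cm * fl_cm
        + 0.0107 * hc_cm
        + 0.0438 * ac_cm
        + 0.158 * fl_cm
    )
    return 10 ** log10_efw


# Hypothetical term biometry, for illustration only.
efw = hadlock_hc_ac_fl(hc_cm=33.0, ac_cm=34.0, fl_cm=7.2)
print(f"Estimated fetal weight: {efw:.0f} g")
```

Because the regression is fitted on the logarithm of weight, a small measurement error in the abdominal circumference or femur length translates into a multiplicative error in the estimate, which is one reason examiner experience can matter for the final figure.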


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007/s00404-025-08042-2.pdf
Article home page: https://link.springer.com/article/10.1007/s00404-025-08042-2

Horky, Alex, Wasenitz, Marita, Iacovella, Carlotta, Bahlmann, Franz, Al Naimi, Ammar. The performance of sonographic antenatal birth weight assessment assisted with artificial intelligence compared to that of manual examiners at term, Archives of Gynecology and Obstetrics, 2025, pp. 1-7, DOI: 10.1007/s00404-025-08042-2