Subcortical Segmentation of the Fetal Brain in 3D Ultrasound using Deep Learning
Linde S Hesse1
Moska Aliasi2
Felipe Moser1
the INTERGROWTH-21st Consortium
Monique C Haak2
Weidi Xie3
Mark Jenkinson4,5,6
Ana IL Namburete1

1 Ultrasound NeuroImage Analysis Group, University of Oxford
2 Department of Obstetrics and Fetal Medicine, Leiden University Medical Center, The Netherlands
3 Visual Geometry Group, Department of Engineering Science, University of Oxford, United Kingdom
4 Wellcome Centre for Integrative Neuroimaging, University of Oxford
5 Australian Institute for Machine Learning (AIML), University of Adelaide
6 South Australian Health and Medical Research Institute (SAHMRI)



The quantification of subcortical volume development from 3D fetal ultrasound can provide important diagnostic information during pregnancy monitoring. However, manual segmentation of subcortical structures in ultrasound volumes is time-consuming and challenging due to low soft tissue contrast, speckle and shadowing artifacts. For this reason, we developed a convolutional neural network (CNN) for the automated segmentation of the choroid plexus (CP), lateral posterior ventricle horns (LPVH), cavum septum pellucidum et vergae (CSPV), and cerebellum (CB) from 3D ultrasound. As ground-truth labels are scarce and expensive to obtain, we applied few-shot learning, in which only a small number of manual annotations (n = 9) are used to train a CNN. We compared training a CNN with only a few individually annotated volumes versus many weakly labelled volumes obtained from atlas-based segmentations. This showed that segmentation performance close to intra-observer variability can be obtained with only a handful of manual annotations. Finally, the trained models were applied to a large number (n = 278) of ultrasound image volumes of a diverse, healthy population, obtaining novel US-specific growth curves of the respective structures during the second trimester of gestation.


We train a multi-label CNN to predict 3D segmentations of subcortical structures using only a small number of manual annotations (N=9).

Segmentation Pipeline:

We compared training a CNN with N individually annotated 3D images (expert labels) versus training with many weakly labelled images (atlas labels) obtained from annotating N template images.

Atlas label generation

Schematic overview of atlas label generation. Standard whole-brain templates (top row) were constructed for each gestational week (GW), and all structures (CB, CP, CSPV and LPVH) were annotated in these templates. Cluster-based template construction (bottom row) was performed only for the LPVH.


We obtained excellent segmentation performance using only 9 manual annotations for training:

We also compared segmentation performance for 3D images in their original acquisition orientation (unaligned) versus images aligned to the same coordinate system as a preprocessing step (aligned):

Resulting performance values after post-processing for a CNN trained with expert labels (θexp) and a CNN trained with atlas labels (θatl)

We then applied our trained networks to a large group (N = 278) of healthy fetuses to obtain US-specific growth curves:

Estimated structural volumes for subcortical structures as a function of gestational age (GA). Volumes were fitted with a linear or quadratic fit (black), in which the quadratic term was only added if it was significant for both networks (per structure). The 95% prediction confidence intervals were also computed and are shown with red dashed lines. For each structure, samples were colored based on their residual for θexp, and the same colors (per sample) were used for θatl. Relative volume predictions can be found in the paper.
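The fitting rule in the caption above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code. It assumes an ordinary least-squares fit, a two-sided t-test on the quadratic coefficient at a 0.05 level, and that the dashed bands are prediction intervals for a new observation. The function names (`fit_polynomial`, `quadratic_p_value`, `predict_with_interval`) are hypothetical, and gestational age is centred purely for numerical stability.

```python
import numpy as np
from scipy import stats


def _design(ga, centre, degree):
    """Polynomial design matrix [1, x, x^2, ...] in centred gestational age."""
    x = np.asarray(ga, dtype=float) - centre
    return np.column_stack([x**p for p in range(degree + 1)])


def fit_polynomial(ga, volume, degree):
    """Ordinary least-squares fit of volume against GA (assumed model)."""
    ga = np.asarray(ga, dtype=float)
    volume = np.asarray(volume, dtype=float)
    centre = ga.mean()
    X = _design(ga, centre, degree)
    beta, *_ = np.linalg.lstsq(X, volume, rcond=None)
    resid = volume - X @ beta
    dof = len(volume) - (degree + 1)
    return {
        "beta": beta,
        "centre": centre,
        "degree": degree,
        "dof": dof,
        "sigma2": float(resid @ resid) / dof,  # residual variance estimate
        "xtx_inv": np.linalg.inv(X.T @ X),
    }


def quadratic_p_value(ga, volume):
    """Two-sided p-value of the quadratic coefficient (t-test)."""
    fit = fit_polynomial(ga, volume, degree=2)
    se = np.sqrt(fit["sigma2"] * fit["xtx_inv"][2, 2])
    t_stat = fit["beta"][2] / se
    return float(2 * stats.t.sf(abs(t_stat), fit["dof"]))


def predict_with_interval(fit, ga_new, level=0.95):
    """Fitted mean and prediction interval for new observations."""
    X = _design(ga_new, fit["centre"], fit["degree"])
    mean = X @ fit["beta"]
    # x0^T (X^T X)^-1 x0 for every row of the new design matrix
    leverage = np.einsum("ij,jk,ik->i", X, fit["xtx_inv"], X)
    t_crit = stats.t.ppf(0.5 + level / 2, fit["dof"])
    half = t_crit * np.sqrt(fit["sigma2"] * (1 + leverage))
    return mean, mean - half, mean + half
```

To mirror the per-structure rule, one would compute `quadratic_p_value` on the volumes predicted by each network and pass `degree=2` to `fit_polynomial` only if both p-values fall below the chosen level, otherwise `degree=1`.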


LH acknowledges the support of the UK Engineering and Physical Sciences Research Council (EPSRC) Doctoral Training Award. FM acknowledges the support and funding from the Engineering and Physical Sciences Research Council (EPSRC) and Medical Research Council (MRC) (EP/L016052/1), as well as the support from University College Oxford and its Oxford-Radcliffe benefaction. WX is supported by the UK Engineering and Physical Sciences Research Council (EPSRC) Programme Grant Seebibyte (EP/M013774/1) and Grant Visual AI (EP/T028572/1). MJ is supported by the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC), and this research was funded by the Wellcome Trust [215573/Z/19/Z]. The Wellcome Centre for Integrative Neuroimaging is supported by core funding from the Wellcome Trust [203139/Z/16/Z]. AN is grateful for support from the UK Royal Academy of Engineering under the Engineering for Development Research Fellowships scheme, and to St Hilda’s College, Oxford. We would also like to acknowledge the INTERGROWTH-21st Consortium for collecting the data and making it available to us.
