Publications | Satsuma

June 2024 Computerized Medical Imaging and Graphics

Enhancing cancer prediction in challenging screen-detected incident lung nodules using time-series deep learning

Lung cancer screening (LCS) using annual computed tomography (CT) scanning significantly reduces mortality by detecting cancerous lung nodules at an earlier stage. Deep learning algorithms can improve nodule malignancy risk stratification. However, they have typically been used to analyse single time point CT data when detecting malignant nodules on either baseline or incident CT LCS rounds. Deep learning algorithms have the greatest value in two aspects. These approaches have great potential in assessing nodule change across time-series CT scans where subtle changes may be challenging to identify using the human eye alone. Moreover, they could be targeted to detect nodules developing on incident screening rounds, where cancers are generally smaller and more challenging to detect confidently. Here, we show the performance of our Deep learning-based Computer-Aided Diagnosis model integrating Nodule and Lung imaging data with clinical Metadata Longitudinally (DeepCAD-NLM-L) for malignancy prediction. DeepCAD-NLM-L showed improved performance (AUC = 88%) against models utilizing single time-point data alone. DeepCAD-NLM-L also demonstrated comparable and complementary performance to radiologists when interpreting the most challenging nodules typically found in LCS programs. It also demonstrated similar performance to radiologists when assessed on an out-of-distribution imaging dataset. The results emphasize the advantages of using time-series and multimodal analyses when interpreting malignancy risk in LCS.

Ashkan Pakzad, Tony Cheung, Coline H. M. Van Moorsel, Kin Quan, Nesrin Mogulkoc, Brian J. Bartholmai, Hendrik W. Van Es, Alper Ezircan, Frouke Van Beek, Marcel Veltkamp, Ronald Karwoski, Tobias Peikert, Ryan D. Clay, Finbar Foley, Cassandra Braun, Recep Savas, Carole Sudre, Tom Doel, Daniel C. Alexander, Peter Wijeratne, David Hawkes, Yipeng Hu, John R Hurst, Joseph Jacob

March 2024 Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization

Evaluation of automated airway morphological quantification for assessing fibrosing lung disease

Patients with lung damage caused by scarring typically have abnormally large airways. This difference in airway size between patients and healthy individuals can be seen with 3D X-rays, known as a CT scan. We present a new software program called AirQuant to measure all airways in a patient’s CT scan. We compare measurements between 14 healthy individuals and 14 diseased patients. We found that airway disease was worse in the lower lungs and that the patients’ airways were also more twisted. The measurements used by AirQuant have potential as new imaging markers to help us understand how bad a patient’s disease is.

Shahab Aslani, Joseph Jacob

November 2023 Medical Imaging and Computer-Aided Diagnosis (MICAD) 2022

Optimising Chest X-Rays for Image Analysis by Identifying and Removing Confounding Factors

During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions. This variation is seen in the CXR projections used, image annotations added and in the inspiratory effort and degree of rotation of clinical images. The image analysis community has attempted to ease the burden on overstretched radiology departments during the pandemic by developing automated COVID-19 diagnostic algorithms, the input for which has been CXR imaging. Large publicly available CXR datasets have been leveraged to improve deep learning algorithms for COVID-19 diagnosis. Yet the variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance. COVID-19 diagnosis may be inferred by an algorithm from non-anatomical features on an image such as image labels. These imaging shortcuts may be dataset-specific and limit the generalisability of AI systems. Understanding and correcting key potential biases in CXR images is therefore an essential first step prior to CXR image analysis. In this study, we propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases. We perform ablation studies to show the impact of each individual step. The results suggest that using our proposed pipeline could increase accuracy of the baseline COVID-19 detection algorithm by up to 13%.

Ahmed H. Shahin, An Zhao, Alexander C. Whitehead, Daniel C. Alexander, Joseph Jacob, David Barber

October 2023 Medical Image Analysis

CenTime: Event-conditional modelling of censoring in survival analysis

Survival analysis is a valuable tool for estimating the time until specific events, such as death or cancer recurrence, based on baseline observations. This is particularly useful in healthcare to prognostically predict clinically important events based on patient data. However, existing approaches often have limitations; some focus only on ranking patients by survivability, neglecting to estimate the actual event time, while others treat the problem as a classification task, ignoring the inherent time-ordered structure of the events. Additionally, the effective utilisation of censored samples (data points where the event time is unknown) is essential for enhancing the model’s predictive accuracy. In this paper, we introduce CenTime, a novel approach to survival analysis that directly estimates the time to event. Our method features an innovative event-conditional censoring mechanism that performs robustly even when uncensored data is scarce. We demonstrate that our approach forms a consistent estimator for the event model parameters, even in the absence of uncensored data. Furthermore, CenTime is easily integrated with deep learning models with no restrictions on batch size or the number of uncensored samples. We compare our approach to standard survival analysis methods, including the Cox proportional-hazard model and DeepHit. Our results indicate that CenTime offers state-of-the-art performance in predicting time-to-death while maintaining comparable ranking performance. Our implementation is publicly available at https://github.com/ahmedhshahin/CenTime.
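
To illustrate how censored samples can still contribute to training, here is a minimal sketch of the classical right-censored likelihood over a discrete time-to-event distribution. It is not CenTime's event-conditional censoring mechanism; the function name `censored_nll` and the toy probabilities are assumptions made for illustration only.

```python
import math


def censored_nll(probs, time, observed):
    """Negative log-likelihood of one sample under a discrete event-time distribution.

    probs[t] is the predicted probability that the event occurs at time index t.
    If observed is True, `time` is the event time itself. Otherwise `time` is the
    censoring time, and all we know is that the event occurs after it.
    """
    if observed:
        likelihood = probs[time]
    else:
        # Censored sample: sum the probability mass strictly after the censoring time.
        likelihood = sum(probs[time + 1:])
    # Clamp to avoid log(0) when the model assigns no mass to the outcome.
    return -math.log(max(likelihood, 1e-12))


# Hypothetical predicted distribution over four time steps.
probs = [0.1, 0.2, 0.3, 0.4]
uncensored = censored_nll(probs, time=2, observed=True)
censored = censored_nll(probs, time=1, observed=False)
```

In this sketch the uncensored sample is scored on the mass at its event time, while the censored sample is scored on the total mass placed after its censoring time, so both kinds of sample produce a usable training signal.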

Yaozhi Lu, Shahab Aslani, An Zhao, Ahmed H. Shahin, David Barber, Mark Emberton, Daniel C. Alexander, Joseph Jacob

August 2023 Heliyon

A hybrid CNN-RNN approach for survival analysis in a Lung Cancer Screening study

In this study, we present a hybrid CNN-RNN approach to investigate long-term survival of subjects in a lung cancer screening study. Subjects who died of cardiovascular and respiratory causes were identified. The CNN model was used to capture imaging features in the CT scans and the RNN model was used to investigate time series and thus global information. To account for heterogeneity in patients’ follow-up times, two different variants of LSTM models were evaluated, each incorporating different strategies to address irregularities in follow-up time. The models were trained on subjects who died of cardiovascular and respiratory causes and a control cohort matched to participant age, gender, and smoking history. The combined model can achieve an AUC of 0.76 which outperforms humans at cardiovascular mortality prediction. The corresponding F1 and Matthews Correlation Coefficient are 0.63 and 0.42 respectively. The generalisability of the model is further validated on an ‘external’ cohort. The same models were applied to survival analysis with the Cox Proportional Hazard model. It was demonstrated that incorporating the follow-up history can lead to improvement in survival prediction. The Cox neural network can achieve an IPCW C-index of 0.75 on the internal dataset and 0.69 on an external dataset. Delineating subjects at increased risk of cardiorespiratory mortality can alert clinicians to request further, more detailed functional or imaging studies to improve the assessment of cardiorespiratory disease burden. Such strategies may uncover unsuspected and under-recognised pathologies thereby potentially reducing patient morbidity.
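
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's unweighted concordance index. It deliberately omits the inverse-probability-of-censoring weights used by the IPCW variant in the abstract; the function name and the toy data are assumptions for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly by risk.

    times[i]  : follow-up time of subject i
    events[i] : True if the event was observed, False if censored
    risks[i]  : predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort of four subjects; the third is censored.
times = [2, 4, 6, 8]
events = [True, True, False, True]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the 0.75 and 0.69 figures quoted in the abstract.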

Tony Cheung, Ashkan Pakzad, Nesrin Mogulkoc, Sarah Needleman, Bojidar Rangelov, Eyjolfur Gudmundsson, An Zhao, Mariam Abbas, Davina McLaverty, Dimitrios Asimakopoulos, Robert Chapman, Recep Savas, Sam M. Janes, Yipeng Hu, Daniel C. Alexander, John R. Hurst, Joseph Jacob

July 2023 European Radiology

Automated airway quantification associates with mortality in idiopathic pulmonary fibrosis

AirQuant generates measures of intersegmental tapering and segmental tortuosity which associate with mortality in IPF independent of established measures of disease severity.

Eyjolfur Gudmundsson, Joseph Jacob

May 2023 ERJ OPEN RESEARCH

Delineating associations of progressive pleuroparenchymal fibroelastosis in patients with pulmonary fibrosis

Computerised pleuroparenchymal fibroelastosis (PPFE) progression was found to associate with mortality in idiopathic pulmonary fibrosis and fibrotic hypersensitivity pneumonitis but PPFE did not correlate strongly with measures of fibrosis progression.

Mou-Cheng Xu, Yu Kun Zhou, Chen Jin, Marius de Groot, Daniel C. Alexander, Neil P. Oxtoby, Joseph Jacob

May 2023 IEEE Transactions on Medical Imaging (TMI)

MisMatch: Calibrated Segmentation via Consistency on Differential Morphological Feature Perturbations with Limited Labels

Semi-supervised learning (SSL) is a promising machine learning paradigm to address the ubiquitous issue of label scarcity in medical imaging. The state-of-the-art SSL methods in image classification utilise consistency regularisation to learn unlabelled predictions which are invariant to input level perturbations. However, image level perturbations violate the cluster assumption in the setting of segmentation. Moreover, existing image level perturbations are hand-crafted which could be sub-optimal. In this paper, we propose MisMatch, a semi-supervised segmentation framework based on the consistency between paired predictions which are derived from two differently learnt morphological feature perturbations. MisMatch consists of an encoder and two decoders. One decoder learns positive attention for foreground on unlabelled data thereby generating dilated features of foreground. The other decoder learns negative attention for foreground on the same unlabelled data thereby generating eroded features of foreground. We normalise the paired predictions of the decoders along the batch dimension. A consistency regularisation is then applied between the normalised paired predictions of the decoders. We evaluate MisMatch on four different tasks. Firstly, we develop a 2D U-net based MisMatch framework and perform extensive cross-validation on a CT-based pulmonary vessel segmentation task and show that MisMatch statistically outperforms state-of-the-art semi-supervised methods. Secondly, we show that 2D MisMatch outperforms state-of-the-art methods on an MRI-based brain tumour segmentation task. We then further confirm that 3D V-net based MisMatch outperforms its 3D counterpart based on consistency regularisation with input level perturbations on two different tasks: left atrium segmentation from 3D CT images and whole brain tumour segmentation from 3D MRI images. Lastly, we find that the performance improvement of MisMatch over the baseline might originate from its better calibration. This also implies that our proposed AI system makes safer decisions than the previous methods.
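
The consistency term between the two decoders' predictions can be pictured with a very small sketch. The abstract does not state which distance is used, so the mean squared difference below is an assumption, as are the function name and the toy probability maps; the batch normalisation step is omitted.

```python
def consistency_loss(pred_a, pred_b):
    """Mean squared difference between two flattened foreground-probability maps.

    pred_a and pred_b stand in for the outputs of two decoders on the same
    unlabelled image. The loss is zero only when both decoders agree everywhere.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction maps must have the same number of pixels")
    squared_diffs = [(a - b) ** 2 for a, b in zip(pred_a, pred_b)]
    return sum(squared_diffs) / len(squared_diffs)


# Hypothetical outputs: one decoder over-segments, the other under-segments.
dilated_pred = [0.9, 0.8, 0.6, 0.1]
eroded_pred = [0.7, 0.8, 0.2, 0.1]
loss = consistency_loss(dilated_pred, eroded_pred)
```

Because no labels appear in this term, it can be computed on unlabelled images and added to the usual supervised loss on the labelled ones.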

Shahab Aslani, Joseph Jacob

February 2023 Clinical Radiology

Utilisation of deep learning for COVID-19 diagnosis

The COVID-19 pandemic that began in 2019 has resulted in millions of deaths worldwide. Over this period, the economic and healthcare consequences of COVID-19 infection in survivors of acute COVID-19 infection have become apparent. During the course of the pandemic, computer analysis of medical images and data has been widely used by the medical research community. In particular, deep-learning methods, which are artificial intelligence (AI)-based approaches, have been frequently employed. This paper provides a review of deep-learning-based AI techniques for COVID-19 diagnosis using chest radiography and computed tomography. Thirty papers published from February 2020 to March 2022 that used two-dimensional (2D)/three-dimensional (3D) deep convolutional neural networks combined with transfer learning for COVID-19 detection were reviewed. The review describes how deep-learning methods detect COVID-19, and several limitations of the proposed methods are highlighted.

Ashkan Pakzad, Mou-Cheng Xu, Tony Cheung, Marie Vermant, Tinne Goos, Laurens J De Sadeleer, Stijn E Verleden, Wim A Wuyts, John R Hurst, Joseph Jacob

October 2022 MICCAI Workshop on Deep Generative Models

Airway measurement by refinement of synthetic images improves mortality prediction in idiopathic pulmonary fibrosis

Several chronic lung diseases, like idiopathic pulmonary fibrosis (IPF), are characterised by abnormal dilatation of the airways. Quantification of airway features on computed tomography (CT) can help characterise disease severity and progression. Physics based airway measurement algorithms that have been developed have met with limited success, in part due to the sheer diversity of airway morphology seen in clinical practice. Supervised learning methods are not feasible due to the high cost of obtaining precise airway annotations. We propose synthesising airways by style transfer using perceptual losses to train our model: Airway Transfer Network (ATN). We compare our ATN model with a state-of-the-art GAN-based network (simGAN) using a) qualitative assessment; b) assessment of the ability of ATN and simGAN based CT airway metrics to predict mortality in a population of 113 patients with IPF. ATN was shown to be quicker and easier to train than simGAN. ATN-based airway measurements showed consistently stronger associations with mortality than simGAN-derived airway metrics on IPF CTs. Airway synthesis by a transformation network that refines synthetic data using perceptual losses is a realistic alternative to GAN-based methods for clinical CT analyses of idiopathic pulmonary fibrosis. Our source code, which is compatible with the existing open-source airway analysis framework AirQuant, can be found at https://github.com/ashkanpakzad/ATN.

Mou-Cheng Xu, Yu Kun Zhou, Chen Jin, Marius de Groot, Daniel C Alexander, Neil P Oxtoby, Yipeng Hu, Joseph Jacob

September 2022 Medical Image Computing and Computer Assisted Interventions (MICCAI)

Bayesian Pseudo Labels: Expectation Maximization for Robust and Efficient Semi Supervised Segmentation

This paper concerns pseudo labelling in segmentation. Our contribution is fourfold. Firstly, we present a new formulation of pseudo-labelling as an Expectation-Maximization (EM) algorithm for clear statistical interpretation. Secondly, we propose a semi-supervised medical image segmentation method purely based on the original pseudo labelling, namely SegPL. We demonstrate SegPL is a competitive approach against state-of-the-art consistency regularisation based methods on semi-supervised segmentation on a 2D multi-class MRI brain tumour segmentation task and a 3D binary CT lung vessel segmentation task. The simplicity of SegPL results in a lower computational cost compared to prior methods. Thirdly, we demonstrate that the effectiveness of SegPL may originate from its robustness against out-of-distribution noise and adversarial attacks. Lastly, under the EM framework, we introduce a probabilistic generalisation of SegPL via variational inference, which learns a dynamic threshold for pseudo labelling during the training. We show that SegPL with variational inference can perform uncertainty estimation on par with the gold-standard method Deep Ensemble.
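
As a rough picture of pseudo labelling on unlabelled pixels, the sketch below thresholds predicted foreground probabilities into hard labels and then scores the predictions against them. The fixed threshold of 0.5, the function names and the toy values are assumptions; the learned dynamic threshold from the variational extension is not shown.

```python
import math


def pseudo_label(probabilities, threshold=0.5):
    """Turn predicted foreground probabilities into hard pseudo labels (0 or 1)."""
    return [1 if p > threshold else 0 for p in probabilities]


def binary_cross_entropy(probabilities, labels, eps=1e-12):
    """Average binary cross-entropy between predictions and (pseudo) labels."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep the logarithm finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical predictions on four unlabelled pixels.
unlabelled_probs = [0.9, 0.4, 0.6, 0.1]
labels = pseudo_label(unlabelled_probs)
unsupervised_loss = binary_cross_entropy(unlabelled_probs, labels)
```

Minimising this loss pushes each prediction towards the side of the threshold it already falls on, which is why the choice of threshold matters and why a learned threshold is a natural extension.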

Yaozhi Lu, Shahab Aslani, Mark Emberton, Daniel C. Alexander, Joseph Jacob

March 2022 IEEE Access

Deep Learning Based Long Term Mortality Prediction in the National Lung Screening Trial

In this study, the long-term mortality in the National Lung Screening Trial (NLST) was investigated using a deep learning-based method. Binary classification of the non-lung-cancer mortality (i.e. cardiovascular and respiratory mortality) was performed using neural network models centered around a 3D-ResNet. The models were trained on a cohort matched for participant age, gender, and smoking history. Utilising both the 3D CT scan and clinical information, the models can achieve an AUC of 0.73 which outperforms humans at cardiovascular mortality prediction. By interpreting the trained models with 3D saliency maps, we examined the features on the CT scans that correspond to the mortality signal. The saliency maps can potentially assist clinicians and radiologists to identify regions of concern on the image that may indicate the need to adopt preventative healthcare management strategies to prolong the patients’ life expectancy.

Ashkan Pakzad, Joseph Jacob

March 2022 Clinics in Chest Medicine

Radiology of Bronchiectasis

Bronchiectasis is a radiological diagnosis made using computed tomographic (CT) imaging. Although visual CT assessment is necessary for the diagnosis of bronchiectasis, visual assessment of disease severity and progression is challenging. Computer tools offer the potential to improve the characterization of lung damage in patients with bronchiectasis. Newer imaging techniques such as MRI with hyperpolarized gas inhalation have the potential to identify early forms of disease and are without the constraints of requiring ionizing radiation exposure.

Ahmed H. Shahin, Joseph Jacob, Daniel C. Alexander, David Barber

February 2022 Medical Imaging with Deep Learning (MIDL)

Survival Analysis for Idiopathic Pulmonary Fibrosis using CT Images and Incomplete Clinical Data

Idiopathic Pulmonary Fibrosis (IPF) is an inexorably progressive fibrotic lung disease with a variable and unpredictable rate of progression. CT scans of the lungs inform clinical assessment of IPF patients and contain pertinent information related to disease progression. In this work, we propose a multi-modal method that uses neural networks and memory banks to predict the survival of IPF patients using clinical and imaging data. The majority of clinical IPF patient records have missing data (e.g. missing lung function tests). To this end, we propose a probabilistic model that captures the dependencies between the observed clinical variables and imputes missing ones. This principled approach to missing data imputation can be naturally combined with a deep survival analysis model. We show that the proposed framework yields significantly better survival analysis results than baselines in terms of concordance index and integrated Brier score. Our work also provides insights into novel image-based biomarkers that are linked to mortality.

Mou-Cheng Xu, Yu Kun Zhou, Chen Jin, Stefano B Blumberg, Frederick Wilson, Marius de Groot, Daniel C. Alexander, Neil P. Oxtoby, Joseph Jacob

February 2022 Medical Imaging with Deep Learning (MIDL)

Learning Morphological Feature Perturbation for Semi-Supervised Segmentation

We propose MisMatch, a novel consistency-driven semi-supervised segmentation framework which produces predictions that are invariant to learnt feature perturbations. MisMatch consists of an encoder and two decoder heads. One decoder learns positive attention to the foreground regions of interest (RoI) on unlabelled images thereby generating dilated features. The other decoder learns negative attention to the foreground on the same unlabelled images thereby generating eroded features. We then apply a consistency regularisation on the paired predictions. MisMatch outperforms state-of-the-art semi-supervised methods on a CT-based pulmonary vessel segmentation task and an MRI-based brain tumour segmentation task. In addition, we show that the effectiveness of MisMatch comes from better model calibration than its supervised learning counterpart.

Ashkan Pakzad, Tony Cheung, Kin Quan, Nesrin Mogulkoc, Coline H. M. Van Moorsel, Brian J. Bartholmai, Hendrik W. Van Es, Alper Ezircan, Frouke Van Beek, Marcel Veltkamp, Ronald Karwoski, Tobias Peikert, Ryan D. Clay, Finbar Foley, Cassandra Braun, Recep Savas, Carole Sudre, Tom Doel, Daniel C. Alexander, Peter Wijeratne, David Hawkes, Yipeng Hu, John R Hurst, Joseph Jacob

November 2021 arXiv

Evaluation of automated airway morphological quantification for assessing fibrosing lung disease

Patients with lung damage caused by scarring typically have abnormally large airways. This difference in airway size between patients and healthy individuals can be seen with 3D X-rays, known as a CT scan. We present a new software program called AirQuant to measure all airways in a patient’s CT scan. We compare measurements between 14 healthy individuals and 14 diseased patients. We found that airway disease was worse in the lower lungs and that the patients’ airways were also more twisted. The measurements used by AirQuant have potential as new imaging markers to help us understand how bad a patient’s disease is.

Tony Cheung, Robert Bell, Arjun Nair, Leon Menezies, Riyaz Patel, Simon Wan, Kacy Chou, Jiahang Chen, Ryo Torii, Rhodri Davies, James Moon, Daniel C. Alexander, Joseph Jacob

February 2021 medRxiv

A computationally efficient approach to segmentation of the aorta and coronary arteries using deep learning

A fully automatic two-dimensional Unet model is proposed to segment aorta and coronary arteries in computed tomography images. Two models are trained to segment two regions of interest, (1) the aorta and the coronary arteries or (2) the coronary arteries alone. Our method achieves 91.20% and 88.80% dice similarity coefficient accuracy on regions of interest 1 and 2 respectively. Compared with a semi-automatic segmentation method, our model performs better when segmenting the coronary arteries alone. The performance of the proposed method is comparable to existing published two-dimensional or three-dimensional deep learning models. Furthermore, the algorithmic and graphical processing unit memory efficiencies are maintained such that the model can be deployed within hospital computer networks where graphical processing units are typically not available.
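
The Dice similarity coefficient quoted above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch on flattened binary masks follows; the function name, the empty-mask convention and the toy masks are assumptions for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient between two flattened binary masks (0/1 values)."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


# Hypothetical five-pixel masks: two pixels overlap, three are marked in each.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
dice = dice_coefficient(pred_mask, true_mask)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all, so the reported 91.20% and 88.80% correspond to values of 0.912 and 0.888 on this scale.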

Eyjólfur Guðmundsson, An Zhao, Nesrin Mogulkoc, Iain Stewart, Mark G. Jones, Coline H. M. Van Moorsel, Recep Savas, Christopher J. Brereton, Hendrik W. Van Es, Omer Unat, Katarina Pontoppidan, Frouke Van Beek, Marcel Veltkamp, Bahareh Gholipour, Arjun Nair, Athol U. Wells, Sam M. Janes, Daniel C. Alexander, Joseph Jacob

January 2021 EClinicalMedicine

Pleuroparenchymal fibroelastosis in idiopathic pulmonary fibrosis: Survival analysis using visual and computer-based computed tomography assessment

Background Idiopathic pulmonary fibrosis (IPF) and pleuroparenchymal fibroelastosis (PPFE) are known to have poor outcomes but detailed examinations of prognostic significance of an association between these morphologic processes are lacking. Methods Retrospective observational study of independent derivation and validation cohorts of IPF populations. Upper-lobe PPFE extent was scored visually (vPPFE) as categories of absent, moderate, marked. Computerised upper-zone PPFE extent (cPPFE) was examined continuously and using a threshold of 2·5% pleural surface area. vPPFE and cPPFE were evaluated against 1-year FVC decline (estimated using mixed-effects models) and mortality. Multivariable models were adjusted for age, gender, smoking history, antifibrotic treatment and diffusion capacity for carbon monoxide. Findings PPFE prevalence was 49% (derivation cohort, n = 142) and 72% (validation cohort, n = 145). vPPFE marginally contributed 3–14% to variance in interstitial lung disease (ILD) severity across both cohorts. In multivariable models, marked vPPFE was independently associated with 1-year FVC decline (derivation: regression coefficient 18·3, 95% CI 8·47–28·2%; validation: 7·51, 1·85–13·2%) and mortality (derivation: hazard ratio [HR] 7·70, 95% CI 3·50–16·9; validation: HR 3·01, 1·33–6·81). Similarly, continuous and dichotomised cPPFE were associated with 1-year FVC decline and mortality (cPPFE ≥ 2·5% derivation: HR 5·26, 3·00–9·22; validation: HR 2·06, 1·28–3·31). Individuals with cPPFE ≥ 2·5% or marked vPPFE had the lowest median survival; the cPPFE threshold demonstrated greater discrimination of poor outcomes at two and three years than marked vPPFE. Interpretation PPFE quantification supports distinction of IPF patients with a worse outcome independent of established ILD severity measures. This has the potential to improve prognostic management and elucidate separate pathways of disease progression.
Funding This research was funded in whole or in part by the Wellcome Trust [209553/Z/17/Z] and the NIHR UCLH Biomedical Research Centre, UK.

An Zhao, Eyjólfur Guðmundsson, Nesrin Mogulkoc, Mark G. Jones, Coline H. M. Van Moorsel, Tamera J. Corte, Chiara Romei, Recep Savas, Christopher J. Brereton, Hendrik W. Van Es, Helen Jo, Annalisa De Liperi, Omer Unat, Katarina Pontoppidan, Frouke Van Beek, Marcel Veltkamp, Peter Hopkins, Yuben Moodley, Alessandro Taliani, Laura Tavanti, Bahareh Gholipour, Arjun Nair, Sam Janes, Iain Stewart, David Barber, Daniel C. Alexander, Athol U. Wells, Joseph Jacob

January 2021 ERJ Open Research

Mortality in combined pulmonary fibrosis and emphysema patients is determined by the sum of pulmonary fibrosis and emphysema

Emphysema is one of the most common pulmonary comorbidities of idiopathic pulmonary fibrosis (IPF), presenting in about one-third of IPF patients [1]. The term combined pulmonary fibrosis and emphysema (CPFE) has been used to describe a potential phenotype characterised by the coexistence of upper lobe-predominant emphysema, lower lobe-predominant fibrosis and relative preservation of lung volumes (forced vital capacity; FVC) in the context of a disproportionately reduced gas transfer (diffusing capacity of the lung for carbon monoxide; DLCO) [1–3]. With regard to patient survival, it remains unclear whether mortality in patients with CPFE reflects the cumulative effects of two disease processes (emphysema and fibrosis), or whether CPFE represents a distinct disease phenotype where outcome is worse than the sum of disease parts (emphysema and fibrosis).

Le Zhang, Ryutaro Tanno, Mou-Cheng Xu, Jin Chen, Joseph Jacob, Olga Cicarrelli, Frederik Barkhof, Daniel C. Alexander

December 2020 Advances In Neural Information Processing Systems (NeurIPS)

Disentangling Human Error from Ground Truth in Segmentation of Medical Images

Recent years have seen increasing use of supervised learning methods for segmentation tasks. However, the predictive performance of these algorithms depends on the quality of labels. This problem is particularly pertinent in the medical image domain, where both the annotation cost and inter-observer variability are high. In a typical label acquisition process, different human experts provide their estimates of the “true” segmentation labels under the influence of their own biases and competence levels. Treating these noisy labels blindly as the ground truth limits the performance that automatic segmentation algorithms can achieve. In this work, we present a method for jointly learning, from purely noisy observations alone, the reliability of individual annotators and the true segmentation label distributions, using two coupled CNNs. The separation of the two is achieved by encouraging the estimated annotators to be maximally unreliable while achieving high fidelity with the noisy training data. We first define a toy segmentation dataset based on MNIST and study the properties of the proposed algorithm. We then demonstrate the utility of the method on three public medical imaging segmentation datasets with simulated (when necessary) and real diverse annotations: 1) MSLSC (multiple-sclerosis lesions); 2) BraTS (brain tumours); 3) LIDC-IDRI (lung abnormalities). In all cases, our method outperforms competing methods and relevant baselines particularly in cases where the number of annotations is small and the amount of disagreement is large. The experiments also show strong ability to capture the complex spatial characteristics of annotators’ mistakes. Our code is available at https://github.com/moucheng2017/Learn_Noisy_Labels_Medical_Images.

Mou-Cheng Xu, Neil P. Oxtoby, Daniel C. Alexander, Joseph Jacob

September 2020 British Machine Vision Conference (BMVC)

Learning to Pay Attention to Mistakes

In convolutional neural network based medical image segmentation, the periphery of foreground regions representing malignant tissues may be disproportionately assigned as belonging to the background class of healthy tissues [18][21][24][12][4]. Misclassification of foreground pixels as the background class can lead to high false negative detection rates. In this paper, we propose a novel attention mechanism to directly address such high false negative rates, called Paying Attention to Mistakes. Our attention mechanism steers the models towards false positive identification, which counters the existing bias towards false negatives. The proposed mechanism has two complementary implementations: (a) “explicit” steering of the model to attend to a larger Effective Receptive Field on the foreground areas; (b) “implicit” steering towards false positives, by attending to a smaller Effective Receptive Field on the background areas. We validated our methods on three tasks: 1) binary dense prediction between vehicles and the background using CityScapes; 2) Enhanced Tumour Core segmentation with multi-modal MRI scans in BRATS2018; 3) segmenting stroke lesions using ultrasound images in ISLES2018. We compared our methods with state-of-the-art attention mechanisms in medical imaging, including self-attention, spatial-attention and spatial-channel mixed attention. Across all of the three different tasks, our models consistently outperform the baseline models in Intersection over Union (IoU) and/or Hausdorff Distance (HD). For instance, in the second task, the “explicit” implementation of our mechanism reduces the HD of the best baseline by more than 26%, whilst improving the IoU by more than 3%. We believe our proposed attention mechanism can benefit a wide range of medical and computer vision tasks, which suffer from over-detection of background.