Models, code, and papers for "Yuankai Huo":

4D Multi-atlas Label Fusion using Longitudinal Images

Aug 29, 2017
Yuankai Huo, Susan M. Resnick, Bennett A. Landman

Longitudinal reproducibility is an essential concern in automated medical image segmentation, yet has proven to be an elusive objective as manual brain structure tracings have shown more than 10% variability. To improve reproducibility, longitudinal segmentation (4D) approaches have been investigated to reconcile temporal variations with traditional 3D approaches. In the past decade, multi-atlas label fusion has become a state-of-the-art segmentation technique for 3D images, and many efforts have been made to adapt it to a 4D longitudinal fashion. However, the previous methods were either limited by using application-specific energy functions (e.g., surface fusion and multi-model fusion) or only considered temporal smoothness on two consecutive time points (t and t+1) under a sparsity assumption. Therefore, a 4D multi-atlas label fusion theory for general label fusion purposes that simultaneously considers temporal consistency on all time points is appealing. Herein, we propose a novel longitudinal label fusion algorithm, called 4D joint label fusion (4DJLF), to incorporate temporal consistency modeling via non-local patch-intensity covariance models. The advantages of 4DJLF include: (1) 4DJLF falls under the general label fusion framework by simultaneously incorporating the spatial and temporal covariance on all longitudinal time points. (2) The proposed algorithm is a longitudinal generalization of a leading joint label fusion method (JLF) that has proven adaptable to a wide variety of applications. (3) The spatial-temporal consistency of atlases is modeled in a probabilistic model inspired by both voting-based and statistical fusion. The proposed approach improves the consistency of the longitudinal segmentation while retaining sensitivity compared with the original JLF approach using the same set of atlases. The method is available online as open source.

Improved Stability of Whole Brain Surface Parcellation with Multi-Atlas Segmentation

Dec 02, 2017
Yuankai Huo, Shunxing Bao, Prasanna Parvathaneni, Bennett A. Landman

Whole brain segmentation and cortical surface parcellation are essential in understanding the anatomical-functional relationships of the brain. Multi-atlas segmentation has been regarded as one of the leading methods for whole brain segmentation. In our recent work, the multi-atlas technique has been adapted to surface reconstruction using a method called Multi-atlas CRUISE (MaCRUISE). The MaCRUISE method not only performed consistent volume-surface analyses but also showed advantages in robustness compared with the FreeSurfer method. However, a detailed surface parcellation was not provided by MaCRUISE, which hindered region of interest (ROI) based analyses on surfaces. Herein, the MaCRUISE surface parcellation (MaCRUISEsp) method is proposed to perform the surface parcellation upon the inner, central and outer surfaces that are reconstructed from MaCRUISE. MaCRUISEsp parcellates each of the inner, central and outer surfaces with 98 cortical labels using a volume segmentation based surface parcellation (VSBSP), following a topological correction step. To validate the performance of MaCRUISEsp, 21 scan-rescan magnetic resonance imaging (MRI) T1 volume pairs from the Kirby21 dataset were used to perform a reproducibility analysis. MaCRUISEsp achieved a median Dice Similarity Coefficient (DSC) of 0.948 for central surfaces. Meanwhile, FreeSurfer achieved 0.905 DSC for inner surfaces and 0.881 DSC for outer surfaces, while the proposed method achieved 0.929 DSC for inner surfaces and 0.835 DSC for outer surfaces. Qualitatively, the results are encouraging, but are not directly comparable as the two approaches use different definitions of cortical labels.

* SPIE Medical Imaging 2018 
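
The reproducibility figures above are Dice Similarity Coefficients. As a minimal sketch of how such a score could be computed for one label between a scan and its rescan, assuming the two parcellations are already aligned and flattened into equal-length label sequences (the function name and toy data are hypothetical, not taken from the paper):

```python
def dice_coefficient(labels_a, labels_b, label):
    """Dice similarity coefficient for one label over two aligned label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    in_a = sum(1 for x in labels_a if x == label)
    in_b = sum(1 for x in labels_b if x == label)
    both = sum(1 for x, y in zip(labels_a, labels_b) if x == label and y == label)
    if in_a + in_b == 0:
        # Convention assumed here: a label absent from both inputs counts as perfect agreement.
        return 1.0
    return 2.0 * both / (in_a + in_b)


# Toy scan/rescan labelings (hypothetical data).
scan = [1, 1, 2, 2, 0, 0]
rescan = [1, 2, 2, 2, 0, 1]
print(dice_coefficient(scan, rescan, 2))
```

A per-subject summary such as the median over labels or over scan pairs would be computed on top of this per-label score.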

Data-driven Probabilistic Atlases Capture Whole-brain Individual Variation

Jun 06, 2018
Yuankai Huo, Katherine Swett, Susan M. Resnick, Laurie E. Cutting, Bennett A. Landman

Probabilistic atlases provide essential spatial contextual information for image interpretation, Bayesian modeling, and algorithmic processing. Such atlases are typically constructed by grouping subjects with similar demographic information. Importantly, use of the same scanner minimizes inter-group variability. However, the generalizability and spatial specificity of such approaches are more limited than one might like. Inspired by Commowick's "Frankenstein's creature paradigm", which builds a personal specific anatomical atlas, we propose a data-driven framework to build a personal specific probabilistic atlas under the large-scale data scheme. The data-driven framework clusters regions with similar features using a point distribution model to learn different anatomical phenotypes. Regional structural atlases and corresponding regional probabilistic atlases are used as indices and targets in the dictionary. By indexing the dictionary, the whole brain probabilistic atlases adapt to each new subject quickly and can be used as spatial priors for visualization and processing. The novelties of this approach are: (1) it provides a new perspective on generating personal specific whole brain probabilistic atlases (132 regions) under a data-driven scheme across sites. (2) The framework employs a large amount of heterogeneous data (2349 images). (3) The proposed framework achieves low computational cost since only one affine registration and a Pearson correlation operation are required for a new subject. Our method matches individual regions better, with higher Dice similarity values when testing the probabilistic atlases. Importantly, the advantage of the large-scale scheme is demonstrated by the better performance obtained with large-scale training data (1888 images) than with a smaller training set (720 images).

The Value of Nullspace Tuning Using Partial Label Information

Mar 17, 2020
Colin B. Hansen, Vishwesh Nath, Diego A. Mesa, Yuankai Huo, Bennett A. Landman, Thomas A. Lasko

In semi-supervised learning, information from unlabeled examples is used to improve the model learned from labeled examples. But in some learning problems, partial label information can be inferred from otherwise unlabeled examples and used to further improve the model. In particular, partial label information exists when subsets of training examples are known to have the same label, even though the label itself is missing. By encouraging a model to give the same label to all such examples, we can potentially improve its performance. We call this encouragement Nullspace Tuning because the difference vector between any pair of examples with the same label should lie in the nullspace of a linear model. In this paper, we investigate the benefit of using partial label information using a careful comparison framework over well-characterized public datasets. We show that the additional information provided by partial labels reduces test error over good semi-supervised methods usually by a factor of 2, up to a factor of 5.5 in the best case. We also show that adding Nullspace Tuning to the newer and state-of-the-art MixMatch method decreases its test error by up to a factor of 1.8.
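
The nullspace idea can be illustrated for a plain linear model: if two examples share a label, the model should map their difference vector to zero. The sketch below assumes a squared-norm penalty on that mapped difference; the abstract does not specify the loss actually used, so the penalty form, the function names, and the toy weights are all assumptions made for illustration.

```python
def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def nullspace_penalty(weights, same_label_pairs):
    """Mean squared norm of W(x1 - x2) over pairs known to share a label.

    The penalty is zero exactly when every difference vector lies in the
    nullspace of the linear model W (assumed penalty form, not the paper's).
    """
    total = 0.0
    for x1, x2 in same_label_pairs:
        diff = [a - b for a, b in zip(x1, x2)]
        total += sum(o * o for o in matvec(weights, diff))
    return total / len(same_label_pairs)


# Toy linear model that ignores the third feature (hypothetical weights).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

# This pair differs only in the ignored feature, so its difference is in the nullspace.
pair_in_nullspace = [([1.0, 2.0, 3.0], [1.0, 2.0, 7.0])]
# This pair differs in a feature the model uses, so it is penalized.
pair_outside = [([1.0, 2.0, 3.0], [2.0, 2.0, 3.0])]

print(nullspace_penalty(W, pair_in_nullspace))
print(nullspace_penalty(W, pair_outside))
```

In a training loop, a term like this would be added to the usual supervised or semi-supervised loss with a tunable weight.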

Reproducibility Evaluation of SLANT Whole Brain Segmentation Across Clinical Magnetic Resonance Imaging Protocols

Jan 07, 2019
Yunxi Xiong, Yuankai Huo, Jiachen Wang, L. Taylor Davis, Maureen McHugo, Bennett A. Landman

Whole brain segmentation on structural magnetic resonance imaging (MRI) is essential for understanding neuroanatomical-functional relationships. Traditionally, multi-atlas segmentation has been regarded as the standard method for whole brain segmentation. In the past few years, deep convolutional neural network (DCNN) segmentation methods have demonstrated their advantages in both accuracy and computational efficiency. Recently, we proposed the spatially localized atlas network tiles (SLANT) method, which is able to segment a 3D MRI brain scan into 132 anatomical regions. Commonly, DCNN segmentation methods yield inferior performance under external validation, especially when the testing patterns were not present in the training cohorts. Recently, we obtained a multi-sequence MRI brain cohort with 1480 clinically acquired, de-identified brain MRI scans on 395 patients using seven different MRI protocols. Moreover, each subject has at least two scans from different MRI protocols. Herein, we assess the SLANT method's intra- and inter-protocol reproducibility. SLANT achieved less than 0.05 coefficient of variation (CV) for intra-protocol experiments and less than 0.15 CV for inter-protocol experiments. The results show that the SLANT method achieved high intra- and inter-protocol reproducibility.

* To appear in SPIE Medical Imaging 2019 
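
The coefficient of variation reported above is the standard deviation of repeated measurements divided by their mean. A minimal sketch follows, assuming the population standard deviation is used (the abstract does not say which estimator the authors chose) and using made-up volume values for one region measured on three scans of the same subject:

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed estimator)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean


# Hypothetical volumes (arbitrary units) of one region across three scans of one subject.
region_volumes = [10.0, 10.5, 9.5]
cv = coefficient_of_variation(region_volumes)
print(round(cv, 4))
```

Values such as these would then be compared against thresholds like the 0.05 (intra-protocol) and 0.15 (inter-protocol) figures quoted in the abstract.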

Montage based 3D Medical Image Retrieval from Traumatic Brain Injury Cohort using Deep Convolutional Neural Network

Dec 10, 2018
Cailey I. Kerley, Yuankai Huo, Shikha Chaganti, Shunxing Bao, Mayur B. Patel, Bennett A. Landman

Brain imaging analysis on clinically acquired computed tomography (CT) is essential for the diagnosis, risk prediction of progression, and treatment of the structural phenotypes of traumatic brain injury (TBI). However, in real clinical imaging scenarios, entire body CT images (e.g., neck, abdomen, chest, pelvis) are typically captured along with whole brain CT scans. For instance, in a typical sample of a clinical TBI imaging cohort, only ~15% of CT scans actually contain whole brain CT images suitable for volumetric brain analyses; the remaining scans are partial brain or non-brain images. Therefore, a manual image retrieval process is typically required to isolate the whole brain CT scans from the entire cohort. However, manual image retrieval is time and resource consuming and even more difficult for larger cohorts. To alleviate the manual effort, in this paper we propose an automated 3D medical image retrieval pipeline, called deep montage-based image retrieval (dMIR), which performs classification on 2D montage images via a deep convolutional neural network. The novelty of the proposed method for image processing is to characterize the medical image retrieval task based on the montage images. In a cohort of 2000 clinically acquired TBI scans, 794 scans were used as training data, 206 scans were used as validation data, and the remaining 1000 scans were used as testing data. The proposed method achieved accuracy=1.0, recall=1.0, precision=1.0, f1=1.0 on the validation data, and accuracy=0.988, recall=0.962, precision=0.962, f1=0.962 on the testing data. Thus, the proposed dMIR is able to perform accurate CT whole brain image retrieval from large-scale clinical cohorts.

* Accepted for SPIE: Medical Imaging 2019 
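
For reference, the four reported metrics follow directly from the counts of a binary confusion matrix. The sketch below uses hypothetical counts for a 1000-scan test set; they were chosen only so that the rounded outputs agree with the testing figures quoted above and are not taken from the paper.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts summing to 1000 scans (not reported in the abstract).
metrics = retrieval_metrics(tp=152, fp=6, fn=6, tn=836)
for name, value in metrics.items():
    print(name, round(value, 3))
```

When precision and recall are equal, as in the reported testing results, F1 equals both of them, which is why the three figures coincide.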

Adversarial Synthesis Learning Enables Segmentation Without Target Modality Ground Truth

Dec 20, 2017
Yuankai Huo, Zhoubing Xu, Shunxing Bao, Albert Assad, Richard G. Abramson, Bennett A. Landman

A lack of generalizability is one key limitation of deep learning based segmentation. Typically, one manually labels new training images when segmenting organs in different imaging modalities or segmenting abnormal organs from distinct disease cohorts. The manual efforts can be alleviated if one is able to reuse manual labels from one modality (e.g., MRI) to train a segmentation network for a new modality (e.g., CT). Previously, two stage methods have been proposed to use cycle generative adversarial networks (CycleGAN) to synthesize training images for a target modality. Then, these efforts trained a segmentation network independently using synthetic images. However, these two independent stages did not use the complementary information between synthesis and segmentation. Herein, we proposed a novel end-to-end synthesis and segmentation network (EssNet) to achieve the unpaired MRI to CT image synthesis and CT splenomegaly segmentation simultaneously without using manual labels on CT. The end-to-end EssNet achieved significantly higher median Dice similarity coefficient (0.9188) than the two stages strategy (0.8801), and even higher than canonical multi-atlas segmentation (0.9125) and ResNet method (0.9107), which used the CT manual labels.

* IEEE International Symposium on Biomedical Imaging (ISBI) 2018 

Less is More: Simultaneous View Classification and Landmark Detection for Abdominal Ultrasound Images

Jun 04, 2018
Zhoubing Xu, Yuankai Huo, JinHyeong Park, Bennett Landman, Andy Milkowski, Sasa Grbic, Shaohua Zhou

An abdominal ultrasound examination, which is the most common ultrasound examination, requires substantial manual efforts to acquire standard abdominal organ views, annotate the views in texts, and record clinically relevant organ measurements. Hence, automatic view classification and landmark detection of the organs can be instrumental to streamline the examination workflow. However, this is a challenging problem given not only the inherent difficulties from the ultrasound modality, e.g., low contrast and large variations, but also the heterogeneity across tasks, i.e., one classification task for all views, and then one landmark detection task for each relevant view. While convolutional neural networks (CNN) have demonstrated more promising outcomes on ultrasound image analytics than traditional machine learning approaches, it becomes impractical to deploy multiple networks (one for each task) due to the limited computational and memory resources on most existing ultrasound scanners. To overcome such limits, we propose a multi-task learning framework to handle all the tasks by a single network. This network is integrated to perform view classification and landmark detection simultaneously; it is also equipped with global convolutional kernels, coordinate constraints, and a conditional adversarial module to leverage the performances. In an experimental study based on 187,219 ultrasound images, with the proposed simplified approach we achieve (1) view classification accuracy better than the agreement between two clinical experts and (2) landmark-based measurement errors on par with inter-user variability. The multi-task approach also benefits from sharing the feature extraction during the training process across all tasks and, as a result, outperforms the approaches that address each task individually.

* Accepted to MICCAI 2018 

Lesion Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale

Jan 28, 2020
Jinzheng Cai, Adam P. Harrison, Youjing Zheng, Ke Yan, Yuankai Huo, Jing Xiao, Lin Yang, Le Lu

Acquiring large-scale medical image data, necessary for training machine learning algorithms, is frequently intractable, due to prohibitive expert-driven annotation costs. Recent datasets extracted from hospital archives, e.g., DeepLesion, have begun to address this problem. However, these are often incompletely or noisily labeled, e.g., DeepLesion leaves over 50% of its lesions unlabeled. Thus, effective methods to harvest missing annotations are critical for continued progress in medical image analysis. This is the goal of our work, where we develop a powerful system to harvest missing lesions from the DeepLesion dataset at high precision. Accepting the need for some degree of expert labor to achieve high fidelity, we exploit a small fully-labeled subset of medical image volumes and use it to intelligently mine annotations from the remainder. To do this, we chain together a highly sensitive lesion proposal generator and a very selective lesion proposal classifier. While our framework is generic, we optimize our performance by proposing a 3D contextual lesion proposal generator and by using a multi-view multi-scale lesion proposal classifier. These produce harvested and hard-negative proposals, which we then re-use to finetune our proposal generator by using a novel hard negative suppression loss, continuing this process until no extra lesions are found. Extensive experimental analysis demonstrates that our method can harvest an additional 9,805 lesions while keeping precision above 90%. To demonstrate the benefits of our approach, we show that lesion detectors trained on our harvested lesions can significantly outperform the same variants only trained on the original annotations, with boost of average precision of 7% to 10%. We open source our annotations at

* This work has been submitted to the IEEE for possible publication 

Fully Automatic Liver Attenuation Estimation Combing CNN Segmentation and Morphological Operations

Jun 29, 2019
Yuankai Huo, James G. Terry, Jiachen Wang, Sangeeta Nair, Thomas A. Lasko, Barry I. Freedman, J. Jeffery Carr, Bennett A. Landman

Manually tracing regions of interest (ROIs) within the liver is the de facto standard method for measuring liver attenuation on computed tomography (CT) in diagnosing nonalcoholic fatty liver disease (NAFLD). However, manual tracing is resource intensive. To address these limitations and to expand the availability of a quantitative CT measure of hepatic steatosis, we propose the automatic liver attenuation ROI-based measurement (ALARM) method for automated liver attenuation estimation. The ALARM method consists of two major stages: (1) deep convolutional neural network (DCNN)-based liver segmentation and (2) automated ROI extraction. First, liver segmentation was achieved using our previously developed SS-Net. Then, a single central ROI (center-ROI) and three circles ROI (periphery-ROI) were computed based on liver segmentation and morphological operations. The ALARM method is available as an open source Docker container ( subjects with 738 abdomen CT scans from the African American-Diabetes Heart Study (AA-DHS) were used for external validation (testing), independent from the training and validation cohort (100 clinically acquired CT abdominal scans).

* Medical Physics 

Internal-transfer Weighting of Multi-task Learning for Lung Cancer Detection

Dec 16, 2019
Yiyuan Yang, Riqiang Gao, Yucheng Tang, Sanja L. Antic, Steve Deppen, Yuankai Huo, Kim L. Sandler, Pierre P. Massion, Bennett A. Landman

Recently, multi-task networks have been shown to offer both additional estimation capabilities and, perhaps more importantly, increased performance over single-task networks on a "main/primary" task. However, balancing the optimization criteria of multi-task networks across different tasks is an area of active exploration. Here, we extend a previously proposed 3D attention-based network with four additional multi-task subnetworks for the detection of lung cancer and four auxiliary tasks (diagnosis of asthma, chronic bronchitis, chronic obstructive pulmonary disease, and emphysema). We introduce and evaluate a learning policy, Periodic Focusing Learning Policy (PFLP), that alternates the dominance of tasks throughout the training. To improve performance on the primary task, we propose an Internal-Transfer Weighting (ITW) strategy to suppress the loss functions on auxiliary tasks for the final stages of training. To evaluate this approach, we examined 3386 patients (single scan per patient) from the National Lung Screening Trial (NLST) and de-identified data from the Vanderbilt Lung Screening Program, with a 2517/277/592 (scans) split for training, validation, and testing. Baseline networks include a single-task strategy and a multi-task strategy without adaptive weights (PFLP/ITW), while primary experiments are multi-task trials with either PFLP or ITW or both. On the test set for lung cancer prediction, the baseline single-task network achieved a prediction AUC of 0.8080 and the multi-task baseline failed to converge (AUC 0.6720). However, applying PFLP helped the multi-task network clarify and achieve a test set lung cancer prediction AUC of 0.8402. Furthermore, our ITW technique boosted the PFLP-enabled multi-task network and achieved an AUC of 0.8462 (McNemar test, p < 0.01).

* Accepted by Medical Imaging, SPIE2020 
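
The abstract describes ITW only as suppressing auxiliary-task losses in the final stages of training. One way such a schedule might look is sketched below; the hard switch at a fixed epoch, the parameter names, and the toy loss values are all assumptions for illustration and should not be read as the authors' formulation.

```python
def total_loss(primary_loss, auxiliary_losses, epoch, switch_epoch, late_aux_weight=0.0):
    """Combine a primary loss with auxiliary losses, down-weighting the latter late in training.

    Before `switch_epoch` the auxiliary tasks contribute fully; from
    `switch_epoch` onward they are scaled by `late_aux_weight`
    (hypothetical schedule, assumed for this sketch).
    """
    aux_weight = 1.0 if epoch < switch_epoch else late_aux_weight
    return primary_loss + aux_weight * sum(auxiliary_losses)


# Toy losses: one primary task (lung cancer) and four auxiliary diagnoses.
primary = 0.5
auxiliary = [0.2, 0.3, 0.1, 0.4]

print(total_loss(primary, auxiliary, epoch=3, switch_epoch=10))   # early: all tasks active
print(total_loss(primary, auxiliary, epoch=12, switch_epoch=10))  # late: auxiliary tasks suppressed
```

A smoother decay of the auxiliary weight would be an equally plausible reading of "suppress"; the step function is used here only because it is the simplest to state.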

Generalizing Deep Whole Brain Segmentation for Pediatric and Post-Contrast MRI with Augmented Transfer Learning

Aug 13, 2019
Camilo Bermudez, Justin Blaber, Samuel W. Remedios, Jess E. Reynolds, Catherine Lebel, Maureen McHugo, Stephan Heckers, Yuankai Huo, Bennett A. Landman

Generalizability is an important problem in deep neural networks, especially in the context of the variability of data acquisition in clinical magnetic resonance imaging (MRI). Recently, the Spatially Localized Atlas Network Tiles (SLANT) approach has been shown to effectively segment whole brain non-contrast T1w MRI with 132 volumetric labels. Enhancing generalizability of SLANT would enable broader application of volumetric assessment in multi-site studies. Transfer learning (TL) is commonly used to update the neural network weights for local factors; yet, it is commonly recognized to risk degradation of performance on the original validation/test cohorts. Here, we explore TL by data augmentation to address these concerns in the context of adapting SLANT to anatomical variation and scanning protocol. We consider two datasets: First, we optimize for age with 30 T1w MRI of young children with manually corrected volumetric labels, and accuracy of automated segmentation defined relative to the manually provided truth. Second, we optimize for acquisition with 36 paired datasets of pre- and post-contrast clinically acquired T1w MRI, and accuracy of the post-contrast segmentations assessed relative to the pre-contrast automated assessment. For both studies, we augment the original TL step of SLANT with either only the new data or with both original and new data. Over baseline SLANT, both approaches yielded significantly improved performance (signed rank tests; pediatric: 0.89 vs. 0.82 DSC, p<0.001; contrast: 0.80 vs 0.76, p<0.001). The performance on the original test set decreased with the new-data only transfer learning approach, so data augmentation was superior to strict transfer learning.

Coronary Calcium Detection using 3D Attention Identical Dual Deep Network Based on Weakly Supervised Learning

Nov 10, 2018
Yuankai Huo, James G. Terry, Jiachen Wang, Vishwesh Nath, Camilo Bermudez, Shunxing Bao, Prasanna Parvathaneni, J. Jeffery Carr, Bennett A. Landman

Coronary artery calcium (CAC) is a biomarker of advanced subclinical coronary artery disease and predicts myocardial infarction and death prior to age 60 years. Slice-wise manual delineation has been regarded as the gold standard of coronary calcium detection. However, manual efforts are time and resource consuming and even impractical to apply to large-scale cohorts. In this paper, we propose the attention identical dual network (AID-Net) to perform CAC detection using scan-rescan longitudinal non-contrast CT scans with weakly supervised attention by only using per scan level labels. To leverage the performance, 3D attention mechanisms were integrated into the AID-Net to provide complementary information for classification tasks. Moreover, 3D Gradient-weighted Class Activation Mapping (Grad-CAM) was also proposed at the testing stage to interpret the behaviors of the deep neural network. 5075 non-contrast chest CT scans were used as training, validation and testing datasets. Baseline performance was assessed on the same cohort. From the results, the proposed AID-Net achieved superior performance on classification accuracy (0.9272) and AUC (0.9627).

* Accepted by SPIE medical imaging 2019 

Spatially Localized Atlas Network Tiles Enables 3D Whole Brain Segmentation from Limited Data

Jun 05, 2018
Yuankai Huo, Zhoubing Xu, Katherine Aboud, Prasanna Parvathaneni, Shunxing Bao, Camilo Bermudez, Susan M. Resnick, Laurie E. Cutting, Bennett A. Landman

Whole brain segmentation on structural magnetic resonance imaging (MRI) is essential in the non-invasive investigation of neuroanatomy. Historically, multi-atlas segmentation (MAS) has been regarded as the de facto standard method for whole brain segmentation. Recently, deep neural network approaches have been applied to whole brain segmentation by learning random patches or 2D slices. Yet, few previous efforts have been made on detailed whole brain segmentation using 3D networks due to the following challenges: (1) fitting an entire whole brain volume into 3D networks is restricted by current GPU memory, and (2) there is a large number of target labels (e.g., > 100 labels) with a limited number of training 3D volumes (e.g., < 50 scans). In this paper, we propose the spatially localized atlas network tiles (SLANT) method to distribute multiple independent 3D fully convolutional networks to cover overlapped sub-spaces in a standard atlas space. This strategy simplifies the whole brain learning task to localized sub-tasks, which was enabled by combining canonical registration and label fusion techniques with deep learning. To address the second challenge, auxiliary labels on 5111 initially unlabeled scans were created by MAS for pre-training. In empirical validation, the state-of-the-art MAS method achieved mean Dice values of 0.76, 0.71, and 0.68, while the proposed method achieved 0.78, 0.73, and 0.71 on three validation cohorts. Moreover, the computational time was reduced from > 30 hours using MAS to ~15 minutes using the proposed method. The source code is available online

* To appear in MICCAI2018 
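
To make the idea of network tiles covering overlapped sub-spaces concrete, the sketch below computes evenly spaced, overlapping tile positions along a single axis of a volume. The axis length, tile size, and tile count are hypothetical, and the abstract does not state the layout actually used by SLANT; in 3D the same computation would be repeated per axis, with one network assigned to each resulting tile.

```python
def tile_starts(length, tile, n_tiles):
    """Start indices of n_tiles equally sized tiles that together cover [0, length)."""
    if tile > length or n_tiles < 1:
        raise ValueError("tile must fit in the axis and n_tiles must be positive")
    if n_tiles == 1:
        if tile != length:
            raise ValueError("a single tile must span the whole axis")
        return [0]
    span = length - tile
    if span > tile * (n_tiles - 1):
        raise ValueError("too few tiles: neighbouring tiles would leave gaps")
    # Integer spacing keeps the first tile at 0 and the last tile flush with the end.
    return [(i * span) // (n_tiles - 1) for i in range(n_tiles)]


def coverage(length, tile, starts):
    """Number of tiles covering each position along the axis."""
    counts = [0] * length
    for start in starts:
        for pos in range(start, start + tile):
            counts[pos] += 1
    return counts


# Hypothetical axis of 10 voxels covered by two tiles of 6 voxels each.
starts = tile_starts(length=10, tile=6, n_tiles=2)
print(starts)
print(coverage(10, 6, starts))
```

Positions covered by more than one tile are where the per-tile predictions would need to be reconciled, which is the role the abstract assigns to label fusion.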

Deep Multi-task Prediction of Lung Cancer and Cancer-free Progression from Censored Heterogenous Clinical Imaging

Nov 12, 2019
Riqiang Gao, Lingfeng Li, Yucheng Tang, Sanja L. Antic, Alexis B. Paulson, Yuankai Huo, Kim L. Sandler, Pierre P. Massion, Bennett A. Landman

Annual low dose computed tomography (CT) lung screening is currently advised for individuals at high risk of lung cancer (e.g., heavy smokers between 55 and 80 years old). The recommended screening practice significantly reduces all-cause mortality, but the vast majority of screening results are negative for cancer. If patients at very low risk could be identified based on individualized, image-based biomarkers, the health care resources could be more efficiently allocated to higher risk patients and reduce overall exposure to ionizing radiation. In this work, we propose a multi-task (diagnosis and prognosis) deep convolutional neural network to improve the diagnostic accuracy over a baseline model while simultaneously estimating a personalized cancer-free progression time (CFPT). A novel Censored Regression Loss (CRL) is proposed to perform weakly supervised regression so that even single negative screening scans can provide small incremental value. Herein, we study 2287 scans from 1433 de-identified patients from the Vanderbilt Lung Screening Program (VLSP) and Molecular Characterization Laboratories (MCL) cohorts. Using five-fold cross-validation, we train a 3D attention-based network under two scenarios: (1) single-task learning with only classification, and (2) multi-task learning with both classification and regression. The single-task learning leads to a higher AUC compared with the Kaggle challenge winner pre-trained model (0.878 v. 0.856), and multi-task learning significantly improves the single-task one (AUC 0.895, p<0.01, McNemar test). In summary, the image-based predicted CFPT can be used in follow-up year lung cancer prediction and data assessment.

* 8 pages, 5 figures, SPIE 2020 Medical Imaging, oral presentation 

Lung Cancer Detection using Co-learning from Chest CT Images and Clinical Demographics

Feb 21, 2019
Jiachen Wang, Riqiang Gao, Yuankai Huo, Shunxing Bao, Yunxi Xiong, Sanja L. Antic, Travis J. Osterman, Pierre P. Massion, Bennett A. Landman

Early detection of lung cancer is essential in reducing mortality. Recent studies have demonstrated the clinical utility of low-dose computed tomography (CT) to detect lung cancer among individuals selected based on very limited clinical information. However, this strategy yields high false positive rates, which can lead to unnecessary and potentially harmful procedures. To address such challenges, we established a pipeline that co-learns from detailed clinical demographics and 3D CT images. Toward this end, we leveraged data from the Consortium for Molecular and Cellular Characterization of Screen-Detected Lesions (MCL), which focuses on early detection of lung cancer. A 3D attention-based deep convolutional neural net (DCNN) is proposed to identify lung cancer from the chest CT scan without prior anatomical location of the suspicious nodule. To improve upon the non-invasive discrimination between benign and malignant, we applied a random forest classifier to a dataset integrating clinical information to imaging data. The results show that the AUC obtained from clinical demographics alone was 0.635 while the attention network alone reached an accuracy of 0.687. In contrast when applying our proposed pipeline integrating clinical and imaging variables, we reached an AUC of 0.787 on the testing dataset. The proposed network both efficiently captures anatomical information for classification and also generates attention maps that explain the features that drive performance.

* SPIE Medical Image, oral presentation 

SynSeg-Net: Synthetic Segmentation Without Target Modality Ground Truth

Oct 15, 2018
Yuankai Huo, Zhoubing Xu, Hyeonsoo Moon, Shunxing Bao, Albert Assad, Tamara K. Moyo, Michael R. Savona, Richard G. Abramson, Bennett A. Landman

A key limitation of deep convolutional neural networks (DCNN) based image segmentation methods is the lack of generalizability. Manually traced training images are typically required when segmenting organs in a new imaging modality or from distinct disease cohort. The manual efforts can be alleviated if the manually traced images in one imaging modality (e.g., MRI) are able to train a segmentation network for another imaging modality (e.g., CT). In this paper, we propose an end-to-end synthetic segmentation network (SynSeg-Net) to train a segmentation network for a target imaging modality without having manual labels. SynSeg-Net is trained by using (1) unpaired intensity images from source and target modalities, and (2) manual labels only from source modality. SynSeg-Net is enabled by the recent advances of cycle generative adversarial networks (CycleGAN) and DCNN. We evaluate the performance of the SynSeg-Net on two experiments: (1) MRI to CT splenomegaly synthetic segmentation for abdominal images, and (2) CT to MRI total intracranial volume synthetic segmentation (TICV) for brain images. The proposed end-to-end approach achieved superior performance to two stage methods. Moreover, the SynSeg-Net achieved comparable performance to the traditional segmentation network using target modality labels in certain scenarios. The source code of SynSeg-Net is publicly available (

* Accepted by IEEE Transactions on Medical Imaging (TMI) 

3D Whole Brain Segmentation using Spatially Localized Atlas Network Tiles

Mar 28, 2019
Yuankai Huo, Zhoubing Xu, Yunxi Xiong, Katherine Aboud, Prasanna Parvathaneni, Shunxing Bao, Camilo Bermudez, Susan M. Resnick, Laurie E. Cutting, Bennett A. Landman

Detailed whole brain segmentation is an essential quantitative technique, which provides a non-invasive way of measuring brain regions from structural magnetic resonance imaging (MRI). Recently, deep convolutional neural networks (CNN) have been applied to whole brain segmentation. However, restricted by current GPU memory, 2D based methods, downsampling based 3D CNN methods, and patch-based high-resolution 3D CNN methods have been the de facto standard solutions. 3D patch-based high-resolution methods typically yield superior performance among CNN approaches on detailed whole brain segmentation (>100 labels); however, their performance is still commonly inferior to that of multi-atlas segmentation (MAS) methods due to the following challenges: (1) a single network is typically used to learn both spatial and contextual information for the patches, and (2) limited manually traced whole brain volumes are available (typically less than 50) for training a network. In this work, we propose the spatially localized atlas network tiles (SLANT) method to distribute multiple independent 3D fully convolutional networks (FCN) for high-resolution whole brain segmentation. To address the first challenge, multiple spatially distributed networks were used in the SLANT method, in which each network learned contextual information for a fixed spatial location. To address the second challenge, auxiliary labels on 5111 initially unlabeled scans were created by multi-atlas segmentation for training. Since the method integrated multiple traditional medical image processing methods with deep learning, we developed a containerized pipeline to deploy the end-to-end solution. From the results, the proposed method achieved superior performance compared with multi-atlas segmentation methods, while reducing the computational time from >30 hours to 15 minutes (

Splenomegaly Segmentation using Global Convolutional Kernels and Conditional Generative Adversarial Networks

Dec 02, 2017
Yuankai Huo, Zhoubing Xu, Shunxing Bao, Camilo Bermudez, Andrew J. Plassard, Jiaqi Liu, Yuang Yao, Albert Assad, Richard G. Abramson, Bennett A. Landman

Spleen volume estimation using automated image segmentation techniques may be used to detect splenomegaly (abnormally enlarged spleen) on Magnetic Resonance Imaging (MRI) scans. In recent years, Deep Convolutional Neural Network (DCNN) segmentation methods have demonstrated advantages for abdominal organ segmentation. However, variations in both size and shape of the spleen on MRI images may result in large false positive and false negative labeling when deploying DCNN based methods. In this paper, we propose the Splenomegaly Segmentation Network (SSNet) to address spatial variations when segmenting extraordinarily large spleens. SSNet was designed based on the framework of image-to-image conditional generative adversarial networks (cGAN). Specifically, the Global Convolutional Network (GCN) was used as the generator to reduce false negatives, while the Markovian discriminator (PatchGAN) was used to alleviate false positives. A cohort of clinically acquired 3D MRI scans (both T1 weighted and T2 weighted) from patients with splenomegaly was used to train and test the networks. The experimental results demonstrated a mean Dice coefficient of 0.9260 and a median Dice coefficient of 0.9262 using SSNet on independently tested MRI volumes of patients with splenomegaly.

* SPIE Medical Imaging 2018 
