Models, code, and papers for "Lun Zhang":

An Empirical Study on Leveraging Scene Graphs for Visual Question Answering

Jul 28, 2019
Cheng Zhang, Wei-Lun Chao, Dong Xuan

Visual question answering (Visual QA) has attracted significant attention these years. While a variety of algorithms have been proposed, most of them are built upon different combinations of image and language features as well as multi-modal attention and fusion. In this paper, we investigate an alternative approach inspired by conventional QA systems that operate on knowledge graphs. Specifically, we investigate the use of scene graphs derived from images for Visual QA: an image is abstractly represented by a graph with nodes corresponding to object entities and edges to object relationships. We adapt the recently proposed graph network (GN) to encode the scene graph and perform structured reasoning according to the input question. Our empirical studies demonstrate that scene graphs can already capture essential information of images and graph networks have the potential to outperform state-of-the-art Visual QA algorithms but with a much cleaner architecture. By analyzing the features generated by GNs we can further interpret the reasoning process, suggesting a promising direction towards explainable Visual QA.

* Accepted as oral presentation at BMVC 2019 

  Access Model/Code and Paper
Auto-Context R-CNN

Jul 08, 2018
Bo Li, Tianfu Wu, Lun Zhang, Rufeng Chu

Region-based convolutional neural networks (R-CNN)~\cite{fast_rcnn,faster_rcnn,mask_rcnn} have largely dominated object detection. Operators defined on RoIs (Region of Interests) play an important role in R-CNNs such as RoIPooling~\cite{fast_rcnn} and RoIAlign~\cite{mask_rcnn}. They all only utilize information inside RoIs for RoI prediction, even with their recent deformable extensions~\cite{deformable_cnn}. Although surrounding context is well-known for its importance in object detection, it has yet been integrated in R-CNNs in a flexible and effective way. Inspired by the auto-context work~\cite{auto_context} and the multi-class object layout work~\cite{nms_context}, this paper presents a generic context-mining RoI operator (i.e., \textit{RoICtxMining}) seamlessly integrated in R-CNNs, and the resulting object detection system is termed \textbf{Auto-Context R-CNN} which is trained end-to-end. The proposed RoICtxMining operator is a simple yet effective two-layer extension of the RoIPooling or RoIAlign operator. Centered at an object-RoI, it creates a $3\times 3$ layout to mine contextual information adaptively in the $8$ surrounding context regions on-the-fly. Within each of the $8$ context regions, a context-RoI is mined in term of discriminative power and its RoIPooling / RoIAlign features are concatenated with the object-RoI for final prediction. \textit{The proposed Auto-Context R-CNN is robust to occlusion and small objects, and shows promising vulnerability for adversarial attacks without being adversarially-trained.} In experiments, it is evaluated using RoIPooling as the backbone and shows competitive results on Pascal VOC, Microsoft COCO, and KITTI datasets (including $6.9\%$ mAP improvements over the R-FCN~\cite{rfcn} method on COCO \textit{test-dev} dataset and the first place on both KITTI pedestrian and cyclist detection as of this submission).

* Rejected by ECCV18 

  Access Model/Code and Paper
Video Summarization with Long Short-term Memory

Jul 29, 2016
Ke Zhang, Wei-Lun Chao, Fei Sha, Kristen Grauman

We propose a novel supervised learning technique for summarizing videos by automatically selecting keyframes or key subshots. Casting the problem as a structured prediction problem on sequential data, our main idea is to use Long Short-Term Memory (LSTM), a special type of recurrent neural networks to model the variable-range dependencies entailed in the task of video summarization. Our learning models attain the state-of-the-art results on two benchmark video datasets. Detailed analysis justifies the design of the models. In particular, we show that it is crucial to take into consideration the sequential structures in videos and model them. Besides advances in modeling techniques, we introduce techniques to address the need of a large number of annotated data for training complex learning models. There, our main idea is to exploit the existence of auxiliary annotated video datasets, albeit heterogeneous in visual styles and contents. Specifically, we show domain adaptation techniques can improve summarization by reducing the discrepancies in statistical properties across those datasets.

* To appear in ECCV 2016 

  Access Model/Code and Paper
Summary Transfer: Exemplar-based Subset Selection for Video Summarization

Apr 29, 2016
Ke Zhang, Wei-Lun Chao, Fei Sha, Kristen Grauman

Video summarization has unprecedented importance to help us digest, browse, and search today's ever-growing video collections. We propose a novel subset selection technique that leverages supervision in the form of human-created summaries to perform automatic keyframe-based video summarization. The main idea is to nonparametrically transfer summary structures from annotated videos to unseen test videos. We show how to extend our method to exploit semantic side information about the video's category/genre to guide the transfer process by those training videos semantically consistent with the test input. We also show how to generalize our method to subshot-based summarization, which not only reduces computational costs but also provides more flexible ways of defining visual similarity across subshots spanning several frames. We conduct extensive evaluation on several benchmarks and demonstrate promising results, outperforming existing methods in several settings.

* CVPR 2016 camera ready 

  Access Model/Code and Paper
DANE: Domain Adaptive Network Embedding

Jun 03, 2019
Yizhou Zhang, Guojie Song, Lun Du, Shuwen Yang, Yilun Jin

Recent works reveal that network embedding techniques enable many machine learning models to handle diverse downstream tasks on graph structured data. However, as previous methods usually focus on learning embeddings for a single network, they can not learn representations transferable on multiple networks. Hence, it is important to design a network embedding algorithm that supports downstream model transferring on different networks, known as domain adaptation. In this paper, we propose a novel Domain Adaptive Network Embedding framework, which applies graph convolutional network to learn transferable embeddings. In DANE, nodes from multiple networks are encoded to vectors via a shared set of learnable parameters so that the vectors share an aligned embedding space. The distribution of embeddings on different networks are further aligned by adversarial learning regularization. In addition, DANE's advantage in learning transferable network embedding can be guaranteed theoretically. Extensive experiments reflect that the proposed framework outperforms other state-of-the-art network embedding baselines in cross-network domain adaptation tasks.

* 7 pages, 4 figures, accepted by IJCAI 2019 

  Access Model/Code and Paper
Object Detection via Aspect Ratio and Context Aware Region-based Convolutional Networks

Mar 22, 2017
Bo Li, Tianfu Wu, Shuai Shao, Lun Zhang, Rufeng Chu

Jointly integrating aspect ratio and context has been extensively studied and shown performance improvement in traditional object detection systems such as the DPMs. It, however, has been largely ignored in deep neural network based detection systems. This paper presents a method of integrating a mixture of object models and region-based convolutional networks for accurate object detection. Each mixture component accounts for both object aspect ratio and multi-scale contextual information explicitly: (i) it exploits a mixture of tiling configurations in the RoI pooling to remedy the warping artifacts caused by a single type RoI pooling (e.g., with equally-sized 7 x 7 cells), and to respect the underlying object shapes more; (ii) it "looks from both the inside and the outside of a RoI" by incorporating contextual information at two scales: global context pooled from the whole image and local context pooled from the surrounding of a RoI. To facilitate accurate detection, this paper proposes a multi-stage detection scheme for integrating the mixture of object models, which utilizes the detection results of the model at the previous stage as the proposals for the current in both training and testing. The proposed method is called the aspect ratio and context aware region-based convolutional network (ARC-R-CNN). In experiments, ARC-R-CNN shows very competitive results with Faster R-CNN [41] and R-FCN [10] on two datasets: the PASCAL VOC and the Microsoft COCO. It obtains significantly better mAP performance using high IoU thresholds on both datasets.

  Access Model/Code and Paper
Inference for Network Structure and Dynamics from Time Series Data via Graph Neural Network

Jan 18, 2020
Mengyuan Chen, Jiang Zhang, Zhang Zhang, Lun Du, Qiao Hu, Shuo Wang, Jiaqi Zhu

Network structures in various backgrounds play important roles in social, technological, and biological systems. However, the observable network structures in real cases are often incomplete or unavailable due to measurement errors or private protection issues. Therefore, inferring the complete network structure is useful for understanding complex systems. The existing studies have not fully solved the problem of inferring network structure with partial or no information about connections or nodes. In this paper, we tackle the problem by utilizing time series data generated by network dynamics. We regard the network inference problem based on dynamical time series data as a problem of minimizing errors for predicting future states and proposed a novel data-driven deep learning model called Gumbel Graph Network (GGN) to solve the two kinds of network inference problems: Network Reconstruction and Network Completion. For the network reconstruction problem, the GGN framework includes two modules: the dynamics learner and the network generator. For the network completion problem, GGN adds a new module called the States Learner to infer missing parts of the network. We carried out experiments on discrete and continuous time series data. The experiments show that our method can reconstruct up to 100% network structure on the network reconstruction task. While the model can also infer the unknown parts of the structure with up to 90% accuracy when some nodes are missing. And the accuracy decays with the increase of the fractions of missing nodes. Our framework may have wide application areas where the network structure is hard to obtained and the time series data is rich.

  Access Model/Code and Paper
Neural Abstract Style Transfer for Chinese Traditional Painting

Dec 13, 2018
Bo Li, Caiming Xiong, Tianfu Wu, Yu Zhou, Lun Zhang, Rufeng Chu

Chinese traditional painting is one of the most historical artworks in the world. It is very popular in Eastern and Southeast Asia due to being aesthetically appealing. Compared with western artistic painting, it is usually more visually abstract and textureless. Recently, neural network based style transfer methods have shown promising and appealing results which are mainly focused on western painting. It remains a challenging problem to preserve abstraction in neural style transfer. In this paper, we present a Neural Abstract Style Transfer method for Chinese traditional painting. It learns to preserve abstraction and other style jointly end-to-end via a novel MXDoG-guided filter (Modified version of the eXtended Difference-of-Gaussians) and three fully differentiable loss terms. To the best of our knowledge, there is little work study on neural style transfer of Chinese traditional painting. To promote research on this direction, we collect a new dataset with diverse photo-realistic images and Chinese traditional paintings. In experiments, the proposed method shows more appealing stylized results in transferring the style of Chinese traditional painting than state-of-the-art neural style transfer methods.

* Conference: ACCV 2018. Project Page: 

  Access Model/Code and Paper
Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba

May 24, 2018
Jizhe Wang, Pipei Huang, Huan Zhao, Zhibo Zhang, Binqiang Zhao, Dik Lun Lee

Recommender systems (RSs) have been the most important technology for increasing the business in Taobao, the largest online consumer-to-consumer (C2C) platform in China. The billion-scale data in Taobao creates three major challenges to Taobao's RS: scalability, sparsity and cold start. In this paper, we present our technical solutions to address these three challenges. The methods are based on the graph embedding framework. We first construct an item graph from users' behavior history. Each item is then represented as a vector using graph embedding. The item embeddings are employed to compute pairwise similarities between all items, which are then used in the recommendation process. To alleviate the sparsity and cold start problems, side information is incorporated into the embedding framework. We propose two aggregation methods to integrate the embeddings of items and the corresponding side information. Experimental results from offline experiments show that methods incorporating side information are superior to those that do not. Further, we describe the platform upon which the embedding methods are deployed and the workflow to process the billion-scale data in Taobao. Using online A/B test, we show that the online Click-Through-Rate (CTRs) are improved comparing to the previous recommendation methods widely used in Taobao, further demonstrating the effectiveness and feasibility of our proposed methods in Taobao's live production environment.

* 10 pages, 8 figures 

  Access Model/Code and Paper
An Efficient Approach to Informative Feature Extraction from Multimodal Data

Nov 22, 2018
Lichen Wang, Jiaxiang Wu, Shao-Lun Huang, Lizhong Zheng, Xiangxiang Xu, Lin Zhang, Junzhou Huang

One primary focus in multimodal feature extraction is to find the representations of individual modalities that are maximally correlated. As a well-known measure of dependence, the Hirschfeld-Gebelein-R\'{e}nyi (HGR) maximal correlation becomes an appealing objective because of its operational meaning and desirable properties. However, the strict whitening constraints formalized in the HGR maximal correlation limit its application. To address this problem, this paper proposes Soft-HGR, a novel framework to extract informative features from multiple data modalities. Specifically, our framework prevents the "hard" whitening constraints, while simultaneously preserving the same feature geometry as in the HGR maximal correlation. The objective of Soft-HGR is straightforward, only involving two inner products, which guarantees the efficiency and stability in optimization. We further generalize the framework to handle more than two modalities and missing modalities. When labels are partially available, we enhance the discriminative power of the feature representations by making a semi-supervised adaptation. Empirical evaluation implies that our approach learns more informative feature mappings and is more efficient to optimize.

* Accepted to AAAI 2019 

  Access Model/Code and Paper
Boundary-Preserved Deep Denoising of the Stochastic Resonance Enhanced Multiphoton Images

Apr 15, 2019
Sheng-Yong Niu, Lun-Zhang Guo, Yue Li, Tzung-Dau Wang, Yu Tsao, Tzu-Ming Liu

As the rapid growth of high-speed and deep-tissue imaging in biomedical research, it is urgent to find a robust and effective denoising method to retain morphological features for further texture analysis and segmentation. Conventional denoising filters and models can easily suppress perturbative noises in high contrast images. However, for low photon budget multi-photon images, high detector gain will not only boost signals, but also bring huge background noises. In such stochastic resonance regime of imaging, sub-threshold signals may be detectable with the help of noises. Therefore, a denoising filter that can smartly remove noises without sacrificing the important cellular features such as cell boundaries is highly desired. In this paper, we propose a convolutional neural network based autoencoder method, Fully Convolutional Deep Denoising Autoencoder (DDAE), to improve the quality of Three-Photon Fluorescence (3PF) and Third Harmonic Generation (THG) microscopy images. The average of the acquired 200 images of a given location served as the low-noise answer for DDAE training. Compared with other widely used denoising methods, our DDAE model shows better signal-to-noise ratio (26.6 and 29.9 for 3PF and THG, respectively), structure similarity (0.86 and 0.87 for 3PF and THG, respectively), and preservation of nuclear or cellular boundaries.

  Access Model/Code and Paper
Corticospinal Tract (CST) reconstruction based on fiber orientation distributions(FODs) tractography

Apr 23, 2019
Youshan Zhang

The Corticospinal Tract (CST) is a part of pyramidal tract (PT), and it can innervate the voluntary movement of skeletal muscle through spinal interneurons (the 4th layer of the Rexed gray board layers), and anterior horn motorneurons (which control trunk and proximal limb muscles). Spinal cord injury (SCI) is a highly disabling disease often caused by traffic accidents. The recovery of CST and the functional reconstruction of spinal anterior horn motor neurons play an essential role in the treatment of SCI. However, the localization and reconstruction of CST are still challenging issues; the accuracy of the geometric reconstruction can directly affect the results of the surgery. The main contribution of this paper is the reconstruction of the CST based on the fiber orientation distributions (FODs) tractography. Differing from tensor-based tractography in which the primary direction is a determined orientation, the direction of FODs tractography is determined by the probability. The spherical harmonics (SPHARM) can be used to approximate the efficiency of FODs tractography. We manually delineate the three ROIs (the posterior limb of the internal capsule, the cerebral peduncle, and the anterior pontine area) by the ITK-SNAP software, and use the pipeline software to reconstruct both the left and right sides of the CST fibers. Our results demonstrate that FOD-based tractography can show more and correct anatomical CST fiber bundles.

* 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering (BIBE), Taichung, 2018, pp. 305-310 

  Access Model/Code and Paper
Pairwise Neural Networks (PairNets) with Low Memory for Fast On-Device Applications

Feb 10, 2020
Luna M. Zhang

A traditional artificial neural network (ANN) is normally trained slowly by a gradient descent algorithm, such as the backpropagation algorithm, since a large number of hyperparameters of the ANN need to be fine-tuned with many training epochs. Since a large number of hyperparameters of a deep neural network, such as a convolutional neural network, occupy much memory, a memory-inefficient deep learning model is not ideal for real-time Internet of Things (IoT) applications on various devices, such as mobile phones. Thus, it is necessary to develop fast and memory-efficient Artificial Intelligence of Things (AIoT) systems for real-time on-device applications. We created a novel wide and shallow 4-layer ANN called "Pairwise Neural Network" ("PairNet") with high-speed non-gradient-descent hyperparameter optimization. The PairNet is trained quickly with only one epoch since its hyperparameters are directly optimized one-time via simply solving a system of linear equations by using the multivariate least squares fitting method. In addition, an n-input space is partitioned into many n-input data subspaces, and a local PairNet is built in a local n-input subspace. This divide-and-conquer approach can train the local PairNet using specific local features to improve model performance. Simulation results indicate that the three PairNets with incremental learning have smaller average prediction mean squared errors, and achieve much higher speeds than traditional ANNs. An important future work is to develop better and faster non-gradient-descent hyperparameter optimization algorithms to generate effective, fast, and memory-efficient PairNets with incremental learning on optimal subspaces for real-time AIoT on-device applications.

* arXiv admin note: text overlap with arXiv:2001.08886 

  Access Model/Code and Paper
PairNets: Novel Fast Shallow Artificial Neural Networks on Partitioned Subspaces

Jan 24, 2020
Luna M. Zhang

Traditionally, an artificial neural network (ANN) is trained slowly by a gradient descent algorithm such as the backpropagation algorithm since a large number of hyperparameters of the ANN need to be fine-tuned with many training epochs. To highly speed up training, we created a novel shallow 4-layer ANN called "Pairwise Neural Network" ("PairNet") with high-speed hyperparameter optimization. In addition, a value of each input is partitioned into multiple intervals, and then an n-dimensional space is partitioned into M n-dimensional subspaces. M local PairNets are built in M partitioned local n-dimensional subspaces. A local PairNet is trained very quickly with only one epoch since its hyperparameters are directly optimized one-time via simply solving a system of linear equations by using the multivariate least squares fitting method. Simulation results for three regression problems indicated that the PairNet achieved much higher speeds and lower average testing mean squared errors (MSEs) for the three cases, and lower average training MSEs for two cases than the traditional ANNs. A significant future work is to develop better and faster optimization algorithms based on intelligent methods and parallel computing methods to optimize both partitioned subspaces and hyperparameters to build the fast and effective PairNets for applications in big data mining and real-time machine learning.

  Access Model/Code and Paper
A New Compensatory Genetic Algorithm-Based Method for Effective Compressed Multi-function Convolutional Neural Network Model Selection with Multi-Objective Optimization

Jun 08, 2019
Luna M. Zhang

In recent years, there have been many popular Convolutional Neural Networks (CNNs), such as Google's Inception-V4, that have performed very well for various image classification problems. These commonly used CNN models usually use the same activation function, such as RELU, for all neurons in the convolutional layers; they are "Single-function CNNs." However, SCNNs may not always be optimal. Thus, a "Multi-function CNN" (MCNN), which uses different activation functions for different neurons, has been shown to outperform a SCNN. Also, CNNs typically have very large architectures that use a lot of memory and need a lot of data in order to be trained well. As a result, they tend to have very high training and prediction times too. An important research problem is how to automatically and efficiently find the best CNN with both high classification performance and compact architecture with high training and prediction speeds, small power usage, and small memory size for any image classification problem. It is very useful to intelligently find an effective, fast, energy-efficient, and memory-efficient "Compressed Multi-function CNN" (CMCNN) from a large number of candidate MCNNs. A new compensatory algorithm using a new genetic algorithm (GA) is created to find the best CMCNN with an ideal compensation between performance and architecture size. The optimal CMCNN has the best performance and the smallest architecture size. Simulations using the CIFAR10 dataset showed that the new compensatory algorithm could find CMCNNs that could outperform non-compressed MCNNs in terms of classification performance (F1-score), speed, power usage, and memory usage. Other effective, fast, power-efficient, and memory-efficient CMCNNs based on popular CNN architectures will be developed for image classification problems in important real-world applications, such as brain informatics and biomedical imaging.

  Access Model/Code and Paper
Effective, Fast, and Memory-Efficient Compressed Multi-function Convolutional Neural Networks for More Accurate Medical Image Classification

Nov 29, 2018
Luna M. Zhang

Convolutional Neural Networks (CNNs) usually use the same activation function, such as RELU, for all convolutional layers. There are performance limitations of just using RELU. In order to achieve better classification performance, reduce training and testing times, and reduce power consumption and memory usage, a new "Compressed Multi-function CNN" is developed. Google's Inception-V4, for example, is a very deep CNN that consists of 4 Inception-A blocks, 7 Inception-B blocks, and 3 Inception-C blocks. RELU is used for all convolutional layers. A new "Compressed Multi-function Inception-V4" (CMI) that can use different activation functions is created with k Inception-A blocks, m Inception-B blocks, and n Inception-C blocks where k in {1, 2, 3, 4}, m in {1, 2, 3, 4, 5, 6, 7}, n in {1, 2, 3}, and (k+m+n)<14. For performance analysis, a dataset for classifying brain MRI images into one of the four stages of Alzheimer's disease is used to compare three CMI architectures with Inception-V4 in terms of F1-score, training and testing times (related to power consumption), and memory usage (model size). Overall, simulations show that the new CMI models can outperform both the commonly used Inception-V4 and Inception-V4 using different activation functions. In the future, other "Compressed Multi-function CNNs", such as "Compressed Multi-function ResNets and DenseNets" that have a reduced number of convolutional blocks using different activation functions, will be developed to further increase classification accuracy, reduce training and testing times, reduce computational power, and reduce memory usage (model size) for building more effective healthcare systems, such as implementing accurate and convenient disease diagnosis systems on mobile devices that have limited battery power and memory.

  Access Model/Code and Paper
Multi-function Convolutional Neural Networks for Improving Image Classification Performance

May 30, 2018
Luna M. Zhang

Traditional Convolutional Neural Networks (CNNs) typically use the same activation function (usually ReLU) for all neurons with non-linear mapping operations. For example, the deep convolutional architecture Inception-v4 uses ReLU. To improve the classification performance of traditional CNNs, a new "Multi-function Convolutional Neural Network" (MCNN) is created by using different activation functions for different neurons. For $n$ neurons and $m$ different activation functions, there are a total of $m^n-m$ MCNNs and only $m$ traditional CNNs. Therefore, the best model is very likely to be chosen from MCNNs because there are $m^n-2m$ more MCNNs than traditional CNNs. For performance analysis, two different datasets for two applications (classifying handwritten digits from the MNIST database and classifying brain MRI images into one of the four stages of Alzheimer's disease (AD)) are used. For both applications, an activation function is randomly selected for each layer of a MCNN. For the AD diagnosis application, MCNNs using a newly created multi-function Inception-v4 architecture are constructed. Overall, simulations show that MCNNs can outperform traditional CNNs in terms of multi-class classification accuracy for both applications. An important future research work will be to efficiently select the best MCNN from $m^n-m$ candidate MCNNs. Current CNN software only provides users with partial functionality of MCNNs since different layers can use different activation functions but not individual neurons in the same layer. Thus, modifying current CNN software systems such as ResNets, DenseNets, and Dual Path Networks by using multiple activation functions and developing more effective and faster MCNN software systems and tools would be very useful to solve difficult practical image classification problems.

  Access Model/Code and Paper
Fast and Scalable Estimator for Sparse and Unit-Rank Higher-Order Regression Models

Nov 29, 2019
Jiaqi Zhang, Beilun Wang

Because tensor data appear more and more frequently in various scientific researches and real-world applications, analyzing the relationship between tensor features and the univariate outcome becomes an elementary task in many fields. To solve this task, we propose \underline{Fa}st \underline{S}parse \underline{T}ensor \underline{R}egression model (FasTR) based on so-called unit-rank CANDECOMP/PARAFAC decomposition. FasTR first decomposes the tensor coefficient into component vectors and then estimates each vector with $\ell_1$ regularized regression. Because of the independence of component vectors, FasTR is able to solve in a parallel way and the time complexity is proved to be superior to previous models. We evaluate the performance of FasTR on several simulated datasets and a real-world fMRI dataset. Experiment results show that, compared with four baseline models, in every case, FasTR can compute a better solution within less time.

* arXiv admin note: substantial text overlap with arXiv:1911.12965 

  Access Model/Code and Paper
Sparse and Low-Rank Tensor Regression via Parallel Proximal Method

Nov 29, 2019
Jiaqi Zhang, Beilun Wang

Motivated by applications in various scientific fields having demand of predicting relationship between higher-order (tensor) feature and univariate response, we propose a \underline{S}parse and \underline{L}ow-rank \underline{T}ensor \underline{R}egression model (SLTR). This model enforces sparsity and low-rankness of the tensor coefficient by directly applying $\ell_1$ norm and tensor nuclear norm on it respectively, such that (1) the structural information of tensor is preserved and (2) the data interpretation is convenient. To make the solving procedure scalable and efficient, SLTR makes use of the proximal gradient method to optimize two norm regularizers, which can be easily implemented parallelly. Additionally, a tighter convergence rate is proved over three-order tensor data. We evaluate SLTR on several simulated datasets and one fMRI dataset. Experiment results show that, compared with previous models, SLTR is able to obtain a solution no worse than others with much less time cost.

  Access Model/Code and Paper
Towards Robust Lung Segmentation in Chest Radiographs with Deep Learning

Nov 30, 2018
Jyoti Islam, Yanqing Zhang

Automated segmentation of Lungs plays a crucial role in the computer-aided diagnosis of chest X-Ray (CXR) images. Developing an efficient Lung segmentation model is challenging because of difficulties such as the presence of several edges at the rib cage and clavicle, inconsistent lung shape among different individuals, and the appearance of the lung apex. In this paper, we propose a robust model for Lung segmentation in Chest Radiographs. Our model learns to ignore the irrelevant regions in an input Chest Radiograph while highlighting regions useful for lung segmentation. The proposed model is evaluated on two public chest X-Ray datasets (Montgomery County, MD, USA, and Shenzhen No. 3 People's Hospital in China). The experimental result with a DICE score of 98.6% demonstrates the robustness of our proposed lung segmentation approach.

* Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:cs/0101200 

  Access Model/Code and Paper