Research papers and code for "Xiaolei Shen":
As a basic task in computer vision, semantic segmentation can provide fundamental information for object detection and instance segmentation to help the artificial intelligence better understand real world. Since the proposal of fully convolutional neural network (FCNN), it has been widely used in semantic segmentation because of its high accuracy of pixel-wise classification as well as high precision of localization. In this paper, we apply several famous FCNN to brain tumor segmentation, making comparisons and adjusting network architectures to achieve better performance measured by metrics such as precision, recall, mean of intersection of union (mIoU) and dice score coefficient (DSC). The adjustments to the classic FCNN include adding more connections between convolutional layers, enlarging decoders after up sample layers and changing the way shallower layers' information is reused. Besides the structure modification, we also propose a new classifier with a hierarchical dice loss. Inspired by the containing relationship between classes, the loss function converts multiple classification to multiple binary classification in order to counteract the negative effect caused by imbalance data set. Massive experiments have been done on the training set and testing set in order to assess our refined fully convolutional neural networks and new types of loss function. Competitive figures prove they are more effective than their predecessors.

* 14 pages, 7 figures, 6 tables
Click to Read Paper and Get Code
In this paper, we present a new automatic diagnosis method of facial acne vulgaris based on convolutional neural network. This method is proposed to overcome the shortcoming of classification types in previous methods. The core of our method is to extract features of images based on convolutional neural network and achieve classification by classifier. We design a binary classifier of skin-and-non-skin to detect skin area and a seven-classifier to achieve the classification of facial acne vulgaris and healthy skin. In the experiment, we compared the effectiveness of our convolutional neural network and the pre-trained VGG16 neural network on the ImageNet dataset. And we use the ROC curve and normal confusion matrix to evaluate the performance of the binary classifier and the seven-classifier. The results of our experiment show that the pre-trained VGG16 neural network is more effective in extracting image features. The classifiers based on the pre-trained VGG16 neural network achieve the skin detection and acne classification and have good robustness.

* 12 pages, 7 figures, 5 tables
Click to Read Paper and Get Code
This is the preprint version of our paper on ICONIP. Outdoor augmented reality geographic information system (ARGIS) is the hot application of augmented reality over recent years. This paper concludes the key solutions of ARGIS, designs the mobile augmented reality pipeline prospect system (ARPPS), and respectively realizes the machine vision based pipeline prospect system (MVBPPS) and the sensor based pipeline prospect system (SBPPS). With the MVBPPS's realization, this paper studies the neural network based 3D features matching method.

* This is the preprint version of our paper on ICONIP
Click to Read Paper and Get Code
Although Generative Adversarial Networks (GANs) have shown remarkable success in various tasks, they still face challenges in generating high quality images. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aiming at generating high-resolution photo-realistic images. First, we propose a two-stage generative adversarial network architecture, StackGAN-v1, for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and colors of the object based on given text description, yielding low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. Second, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is proposed for both conditional and unconditional generative tasks. Our StackGAN-v2 consists of multiple generators and discriminators in a tree-like structure; images at multiple scales corresponding to the same scene are generated from different branches of the tree. StackGAN-v2 shows more stable training behavior than StackGAN-v1 by jointly approximating multiple distributions. Extensive experiments demonstrate that the proposed stacked generative adversarial networks significantly outperform other state-of-the-art methods in generating photo-realistic images.

* In IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2018. (16 pages, 15 figures.)
Click to Read Paper and Get Code
Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications. Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they fail to contain necessary details and vivid object parts. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256x256 photo-realistic images conditioned on text descriptions. We decompose the hard problem into more manageable sub-problems through a sketch-refinement process. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding Stage-I low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. It is able to rectify defects in Stage-I results and add compelling details with the refinement process. To improve the diversity of the synthesized images and stabilize the training of the conditional-GAN, we introduce a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold. Extensive experiments and comparisons with state-of-the-arts on benchmark datasets demonstrate that the proposed method achieves significant improvements on generating photo-realistic images conditioned on text descriptions.

* ICCV 2017 Oral Presentation
Click to Read Paper and Get Code