Models, code, and papers for "Jiewei Lu":
Automated pavement crack detection is a challenging task that has been researched for decades due to the complicated pavement conditions in real world. In this paper, a supervised method based on deep learning is proposed, which has the capability of dealing with different pavement conditions. Specifically, a convolutional neural network (CNN) is used to learn the structure of the cracks from raw images, without any preprocessing. Small patches are extracted from crack images as inputs to generate a large training database, a CNN is trained and crack detection is modeled as a multi-label classification problem. Typically, crack pixels are much fewer than non-crack pixels. To deal with the problem with severely imbalanced data, a strategy with modifying the ratio of positive to negative samples is proposed. The method is tested on two public databases and compared with five existing methods. Experimental results show that it outperforms the other methods.
Automated steel bar counting and center localization plays an important role in the factory automation of steel bars. Traditional methods only focus on steel bar counting and their performances are often limited by complex industrial environments. Convolutional neural network (CNN), which has great capability to deal with complex tasks in challenging environments, is applied in this work. A framework called CNN-DC is proposed to achieve automated steel bar counting and center localization simultaneously. The proposed framework CNN-DC first detects the candidate center points with a deep CNN. Then an effective clustering algorithm named as Distance Clustering(DC) is proposed to cluster the candidate center points and locate the true centers of steel bars. The proposed CNN-DC can achieve 99.26% accuracy for steel bar counting and 4.1% center offset for center localization on the established steel bar dataset, which demonstrates that the proposed CNN-DC can perform well on automated steel bar counting and center localization. Code is made publicly available at: https://github.com/BenzhangQiu/Steel-bar-Detection.
In this paper, a hierarchical image matting model is proposed to extract blood vessels from fundus images. More specifically, a hierarchical strategy utilizing the continuity and extendibility of retinal blood vessels is integrated into the image matting model for blood vessel segmentation. Normally the matting models require the user specified trimap, which separates the input image into three regions manually: the foreground, background and unknown regions. However, since creating a user specified trimap is a tedious and time-consuming task, region features of blood vessels are used to generate the trimap automatically in this paper. The proposed model has low computational complexity and outperforms many other state-ofart supervised and unsupervised methods in terms of accuracy, which achieves a vessel segmentation accuracy of 96:0%, 95:7% and 95:1% in an average time of 10:72s, 15:74s and 50:71s on images from three publicly available fundus image datasets DRIVE, STARE, and CHASE DB1, respectively.
Strabismus is one of the most influential ophthalmologic diseases in humans life. Timely detection of strabismus contributes to its prognosis and treatment. Telemedicine, which has great potential to alleviate the growing demand of the diagnosis of ophthalmologic diseases, is an effective method to achieve timely strabismus detection. In addition, deep neural networks are beneficial to achieve fully automated strabismus detection. In this paper, a tele strabismus dataset is founded by the ophthalmologists. Then a new algorithm based on deep neural networks is proposed to achieve automated strabismus detection on the founded tele strabismus dataset. The proposed algorithm consists of two stages. In the first stage, R-FCN is applied to perform eye region segmentation. In the second stage, a deep convolutional neural networks is built and trained in order to classify the segmented eye regions as strabismus or normal. The experimental results on the founded tele strabismus dataset shows that the proposed method can have a good performance on automated strabismus detection for telemedicine application. Code is made publicly available at: https://github.com/jieWeiLu/Strabismus-Detection-for-Telemedicine-Application
Instance retrieval requires one to search for images that contain a particular object within a large corpus. Recent studies show that using image features generated by pooling convolutional layer feature maps (CFMs) of a pretrained convolutional neural network (CNN) leads to promising performance for this task. However, due to the global pooling strategy adopted in those works, the generated image feature is less robust to image clutter and tends to be contaminated by the irrelevant image patterns. In this article, we alleviate this drawback by proposing a novel reranking algorithm using CFMs to refine the retrieval result obtained by existing methods. Our key idea, called query adaptive matching (QAM), is to first represent the CFMs of each image by a set of base regions which can be freely combined into larger regions-of-interest. Then the similarity between the query and a candidate image is measured by the best similarity score that can be attained by comparing the query feature and the feature pooled from a combined region. We show that the above procedure can be cast as an optimization problem and it can be solved efficiently with an off-the-shelf solver. Besides this general framework, we also propose two practical ways to create the base regions. One is based on the property of the CFM and the other one is based on a multi-scale spatial pyramid scheme. Through extensive experiments, we show that our reranking approaches bring substantial performance improvement and by applying them we can outperform the state of the art on several instance retrieval benchmarks.
One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced. This is a critical concern because generalisation enables robust reasoning over unseen data, whereas leveraging superficial statistics is fragile to even small changes in data distribution. To illuminate the issue and drive progress towards a solution, we propose a test that explicitly evaluates abstract reasoning over visual data. We introduce a large-scale benchmark of visual questions that involve operations fundamental to many high-level vision tasks, such as comparisons of counts and logical operations on complex visual properties. The benchmark directly measures a method's ability to infer high-level relationships and to generalise them over image-based concepts. It includes multiple training/test splits that require controlled levels of generalization. We evaluate a range of deep learning architectures, and find that existing models, including those popular for vision-and-language tasks, are unable to solve seemingly-simple instances. Models using relational networks fare better but leave substantial room for improvement.