Deep Multiple Description Coding by Learning Scalar Quantization

Nov 05, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

In this paper, we propose a deep multiple description coding framework, whose quantizers are adaptively learned via the minimization of multiple description compressive loss. Firstly, our framework is built upon auto-encoder networks, which have multiple description multi-scale dilated encoder network and multiple description decoder networks. Secondly, two entropy estimation networks are learned to estimate the informative amounts of the quantized tensors, which can further supervise the learning of multiple description encoder network to represent the input image delicately. Thirdly, a pair of scalar quantizers accompanied by two importance-indicator maps is automatically learned in an end-to-end self-supervised way. Finally, multiple description structural dis-similarity distance loss is imposed on multiple description decoded images in pixel domain for diversified multiple description generations rather than on feature tensors in feature domain, in addition to multiple description reconstruction loss. Through testing on two commonly used datasets, it is verified that our method is beyond several state-of-the-art multiple description coding approaches in terms of coding efficiency.
Nov 05, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

* 8 pages, 4 figures

**Click to Read Paper and Get Code**

Virtual Codec Supervised Re-Sampling Network for Image Compression

Jul 10, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

In this paper, we propose an image re-sampling compression method by learning virtual codec network (VCN) to resolve the non-differentiable problem of quantization function for image compression. Here, the image re-sampling not only refers to image full-resolution re-sampling but also low-resolution re-sampling. We generalize this method for standard-compliant image compression (SCIC) framework and deep neural networks based compression (DNNC) framework. Specifically, an input image is measured by re-sampling network (RSN) network to get re-sampled vectors. Then, these vectors are directly quantized in the feature space in SCIC, or discrete cosine transform coefficients of these vectors are quantized to further improve coding efficiency in DNNC. At the encoder, the quantized vectors or coefficients are losslessly compressed by arithmetic coding. At the receiver, the decoded vectors are utilized to restore input image by image decoder network (IDN). In order to train RSN network and IDN network together in an end-to-end fashion, our VCN network intimates projection from the re-sampled vectors to the IDN-decoded image. As a result, gradients from IDN network to RSN network can be approximated by VCN network's gradient. Because dimension reduction can be further achieved by quantization in some dimensional space after image re-sampling within auto-encoder architecture, we can well initialize our networks from pre-trained auto-encoder networks. Through extensive experiments and analysis, it is verified that the proposed method has more effectiveness and versatility than many state-of-the-art approaches.
Jul 10, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

* 13 pages, 11 figures Our project can be found in the website: https://github.com/VirtualCodecNetwork

**Click to Read Paper and Get Code**

Multiple Description Convolutional Neural Networks for Image Compression

Jan 20, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Multiple description coding (MDC) is able to stably transmit the signal in the un-reliable and non-prioritized networks, which has been broadly studied for several decades. However, the traditional MDC doesn't well leverage image's context features to generate multiple descriptions. In this paper, we propose a novel standard-compliant convolutional neural network-based MDC framework in term of image's context features. Firstly, multiple description generator network (MDGN) is designed to produce appearance-similar yet feature-different multiple descriptions automatically according to image's content, which are compressed by standard codec. Secondly, we present multiple description reconstruction network (MDRN) including side reconstruction network (SRN) and central reconstruction network (CRN). When any one of two lossy descriptions is received at the decoder, SRN network is used to improve the quality of this decoded lossy description by removing the compression artifact and up-sampling simultaneously. Meanwhile, we utilize CRN network with two decoded descriptions as inputs for better reconstruction, if both of lossy descriptions are available. Thirdly, multiple description virtual codec network (MDVCN) is proposed to bridge the gap between MDGN network and MDRN network in order to train an end-to-end MDC framework. Here, two learning algorithms are provided to train our whole framework. In addition to structural similarity loss function, the produced descriptions are used as opposing labels with multiple description distance loss function to regularize the training of MDGN network. These losses guarantee that the generated description images are structurally similar yet finely diverse. Experimental results show a great deal of objective and subjective quality measurements to validate the efficiency of the proposed method.
Jan 20, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

* 13 pages, 3 tables, and 6 figures

**Click to Read Paper and Get Code**

Learning a Virtual Codec Based on Deep Convolutional Neural Network to Compress Image

Jan 16, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

Although deep convolutional neural network has been proved to efficiently eliminate coding artifacts caused by the coarse quantization of traditional codec, it's difficult to train any neural network in front of the encoder for gradient's back-propagation. In this paper, we propose an end-to-end image compression framework based on convolutional neural network to resolve the problem of non-differentiability of the quantization function in the standard codec. First, the feature description neural network is used to get a valid description in the low-dimension space with respect to the ground-truth image so that the amount of image data is greatly reduced for storage or transmission. After image's valid description, standard image codec such as JPEG is leveraged to further compress image, which leads to image's great distortion and compression artifacts, especially blocking artifacts, detail missing, blurring, and ringing artifacts. Then, we use a post-processing neural network to remove these artifacts. Due to the challenge of directly learning a non-linear function for a standard codec based on convolutional neural network, we propose to learn a virtual codec neural network to approximate the projection from the valid description image to the post-processed compressed image, so that the gradient could be efficiently back-propagated from the post-processing neural network to the feature description neural network during training. Meanwhile, an advanced learning algorithm is proposed to train our deep neural networks for compression. Obviously, the priority of the proposed method is compatible with standard existing codecs and our learning strategy can be easily extended into these codecs based on convolutional neural network. Experimental results have demonstrated the advances of the proposed method as compared to several state-of-the-art approaches, especially at very low bit-rate.
Jan 16, 2018

Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

* 11 pages, 7 figures

**Click to Read Paper and Get Code**

Mixed-Resolution Image Representation and Compression with Convolutional Neural Networks

Aug 01, 2018

Lijun Zhao, Huihui Bai, Feng Li, Anhong Wang, Yao Zhao

In this paper, we propose an end-to-end mixed-resolution image compression framework with convolutional neural networks. Firstly, given one input image, feature description neural network (FDNN) is used to generate a new representation of this image, so that this image representation can be more efficiently compressed by standard codec, as compared to the input image. Furthermore, we use post-processing neural network (PPNN) to remove the coding artifacts caused by quantization of codec. Secondly, low-resolution image representation is adopted for high efficiency compression in terms of most of bit spent by image's structures under low bit-rate. However, more bits should be assigned to image details in the high-resolution, when most of structures have been kept after compression at the high bit-rate. This comes from a fact that the low-resolution image representation can't burden more information than high-resolution representation beyond a certain bit-rate. Finally, to resolve the problem of error back-propagation from the PPNN network to the FDNN network, we introduce to learn a virtual codec neural network to imitate two continuous procedures of standard compression and post-processing. The objective experimental results have demonstrated the proposed method has a large margin improvement, when comparing with several state-of-the-art approaches.
Aug 01, 2018

Lijun Zhao, Huihui Bai, Feng Li, Anhong Wang, Yao Zhao

* 5 pages, and 2 figures. arXiv admin note: substantial text overlap with arXiv:1712.05969

**Click to Read Paper and Get Code**

Local Activity-tuned Image Filtering for Noise Removal and Image Smoothing

Nov 18, 2017

Lijun Zhao, Jie Liang, Huihui Bai, Lili Meng, Anhong Wang, Yao Zhao

In this paper, two local activity-tuned filtering frameworks are proposed for noise removal and image smoothing, where the local activity measurement is given by the clipped and normalized local variance or standard deviation. The first framework is a modified anisotropic diffusion for noise removal of piece-wise smooth image. The second framework is a local activity-tuned Relative Total Variation (LAT-RTV) method for image smoothing. Both frameworks employ the division of gradient and the local activity measurement to achieve noise removal. In addition, to better capture local information, the proposed LAT-RTV uses the product of gradient and local activity measurement to boost the performance of image smoothing. Experimental results are presented to demonstrate the efficiency of the proposed methods on various applications, including depth image filtering, clip-art compression artifact removal, image smoothing, and image denoising.
Nov 18, 2017

Lijun Zhao, Jie Liang, Huihui Bai, Lili Meng, Anhong Wang, Yao Zhao

* 13 papers, 9 figures

**Click to Read Paper and Get Code**

Simultaneously Color-Depth Super-Resolution with Conditional Generative Adversarial Network

Nov 08, 2017

Lijun Zhao, Huihui Bai, Jie Liang, Bing Zeng, Anhong Wang, Yao Zhao

Recently, Generative Adversarial Network (GAN) has been found wide applications in style transfer, image-to-image translation and image super-resolution. In this paper, a color-depth conditional GAN is proposed to concurrently resolve the problems of depth super-resolution and color super-resolution in 3D videos. Firstly, given the low-resolution depth image and low-resolution color image, a generative network is proposed to leverage mutual information of color image and depth image to enhance each other in consideration of the geometry structural dependency of color-depth image in the same scene. Secondly, three loss functions, including data loss, total variation loss, and 8-connected gradient difference loss are introduced to train this generative network in order to keep generated images close to the real ones, in addition to the adversarial loss. Experimental results demonstrate that the proposed approach produces high-quality color image and depth image from low-quality image pair, and it is superior to several other leading methods. Besides, we use the same neural network framework to resolve the problem of image smoothing and edge detection at the same time.
Nov 08, 2017

Lijun Zhao, Huihui Bai, Jie Liang, Bing Zeng, Anhong Wang, Yao Zhao

* 11 pages, 10 figures

**Click to Read Paper and Get Code**

On the ERM Principle with Networked Data

Nov 22, 2017

Yuanhong Wang, Yuyi Wang, Xingwu Liu, Juhua Pu

Networked data, in which every training example involves two objects and may share some common objects with others, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d.\ assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results of this problem only deal with the classical empirical risk minimization (ERM) principle that always weights every example equally, but this strategy leads to unsatisfactory bounds. We consider general weighted ERM and show new universal risk bounds for this problem. These new bounds naturally define an optimization problem which leads to appropriate weights for networked examples. Though this optimization problem is not convex in general, we devise a new fully polynomial-time approximation scheme (FPTAS) to solve it.
Nov 22, 2017

Yuanhong Wang, Yuyi Wang, Xingwu Liu, Juhua Pu

* accepted by AAAI. arXiv admin note: substantial text overlap with arXiv:math/0702683 by other authors

**Click to Read Paper and Get Code**

End-to-End Learning of Multi-scale Convolutional Neural Network for Stereo Matching

Jun 25, 2019

Li Zhang, Quanhong Wang, Haihua Lu, Yong Zhao

Deep neural networks have shown excellent performance in stereo matching task. Recently CNN-based methods have shown that stereo matching can be formulated as a supervised learning task. However, less attention is paid on the fusion of contextual semantic information and details. To tackle this problem, we propose a network for disparity estimation based on abundant contextual details and semantic information, called Multi-scale Features Network (MSFNet). First, we design a new structure to encode rich semantic information and fine-grained details by fusing multi-scale features. And we combine the advantages of element-wise addition and concatenation, which is conducive to merge semantic information with details. Second, a guidance mechanism is introduced to guide the network to automatically focus more on the unreliable regions. Third, we formulate the consistency check as an error map, obtained by the low stage features with fine-grained details. Finally, we adopt the consistency checking between the left feature and the synthetic left feature to refine the initial disparity. Experiments on Scene Flow and KITTI 2015 benchmark demonstrated that the proposed method can achieve the state-of-the-art performance.
Jun 25, 2019

Li Zhang, Quanhong Wang, Haihua Lu, Yong Zhao

* 16 pages, 4 figures,Asian Conference on Machine Learning(ACML)2018

**Click to Read Paper and Get Code**

Quadratic Upper Bound for Recursive Teaching Dimension of Finite VC Classes

Feb 18, 2017

Lunjia Hu, Ruihan Wu, Tianhong Li, Liwei Wang

In this work we study the quantitative relation between the recursive teaching dimension (RTD) and the VC dimension (VCD) of concept classes of finite sizes. The RTD of a concept class $\mathcal C \subseteq \{0, 1\}^n$, introduced by Zilles et al. (2011), is a combinatorial complexity measure characterized by the worst-case number of examples necessary to identify a concept in $\mathcal C$ according to the recursive teaching model. For any finite concept class $\mathcal C \subseteq \{0,1\}^n$ with $\mathrm{VCD}(\mathcal C)=d$, Simon & Zilles (2015) posed an open problem $\mathrm{RTD}(\mathcal C) = O(d)$, i.e., is RTD linearly upper bounded by VCD? Previously, the best known result is an exponential upper bound $\mathrm{RTD}(\mathcal C) = O(d \cdot 2^d)$, due to Chen et al. (2016). In this paper, we show a quadratic upper bound: $\mathrm{RTD}(\mathcal C) = O(d^2)$, much closer to an answer to the open problem. We also discuss the challenges in fully solving the problem.
Feb 18, 2017

Lunjia Hu, Ruihan Wu, Tianhong Li, Liwei Wang

**Click to Read Paper and Get Code**

Spatio-temporal Aware Non-negative Component Representation for Action Recognition

Aug 27, 2016

Jianhong Wang, Tian Lan, Xu Zhang, Limin Luo

This paper presents a novel mid-level representation for action recognition, named spatio-temporal aware non-negative component representation (STANNCR). The proposed STANNCR is based on action component and incorporates the spatial-temporal information. We first introduce a spatial-temporal distribution vector (STDV) to model the distributions of local feature locations in a compact and discriminative manner. Then we employ non-negative matrix factorization (NMF) to learn the action components and encode the video samples. The action component considers the correlations of visual words, which effectively bridge the sematic gap in action recognition. To incorporate the spatial-temporal cues for final representation, the STDV is used as the part of graph regularization for NMF. The fusion of spatial-temporal information makes the STANNCR more discriminative, and our fusion manner is more compact than traditional method of concatenating vectors. The proposed approach is extensively evaluated on three public datasets. The experimental results demonstrate the effectiveness of STANNCR for action recognition.
Aug 27, 2016

Jianhong Wang, Tian Lan, Xu Zhang, Limin Luo

* 11 pages, 5 figures, 6 tables

**Click to Read Paper and Get Code**

Rethink Global Reward Game and Credit Assignment in Multi-agent Reinforcement Learning

Jul 11, 2019

Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu

Cooperative game is a critical research area in multi-agent reinforcement learning (MARL). Global reward game is a subclass of cooperative games, where all agents aim to maximize cumulative global rewards. Credit assignment is an important problem studied in the global reward game. Most works stand by the view of non-cooperative-game theoretical framework with the shared reward approach, i.e., each agent is assigned a shared global reward directly. This, however, may give each agent an inaccurate feedback on his contribution to the group. In this paper, we introduce a cooperative-game theoretical framework and extend it to the finite-horizon case. We show that our proposed framework is a superset of the global reward game. Based on this framework, we propose an algorithm called Shapley Q-value policy gradient (SQPG) to learn a local reward approach that can distribute the cumulative global reward fairly, reflecting each agent's own contribution in contrast to the shared reward approach. We evaluate our method on the Cooperative Navigation, Prey-and-Predator and Traffic Junction, compared with MADDPG, COMA, Independent actor-critic and Independent DDPG. In the experiments, our algorithm shows better convergence than the baselines.
Jul 11, 2019

Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu

**Click to Read Paper and Get Code**

Quantum Speedup in Adaptive Boosting of Binary Classification

Feb 03, 2019

Ximing Wang, Yuechi Ma, Min-Hsiu Hsieh, Manhong Yung

In classical machine learning, a set of weak classifiers can be adaptively combined to form a strong classifier for improving the overall performance, a technique called adaptive boosting (or AdaBoost). However, constructing the strong classifier for a large data set is typically resource consuming. Here we propose a quantum extension of AdaBoost, demonstrating a quantum algorithm that can output the optimal strong classifier with a quadratic speedup in the number of queries of the weak classifiers. Our results also include a generalization of the standard AdaBoost to the cases where the output of each classifier may be probabilistic even for the same input. We prove that the update rules and the query complexity of the non-deterministic classifiers are the same as those of deterministic classifiers, which may be of independent interest to the classical machine-learning community. Furthermore, the AdaBoost algorithm can also be applied to data encoded in the form of quantum states; we show how the training set can be simplified by using the tools of t-design. Our approach describes a model of quantum machine learning where quantum speedup is achieved in finding the optimal classifier, which can then be applied for classical machine-learning applications.
Feb 03, 2019

Ximing Wang, Yuechi Ma, Min-Hsiu Hsieh, Manhong Yung

**Click to Read Paper and Get Code**

Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning

May 28, 2018

Rui Luo, Yaodong Yang, Jianhong Wang, Zhanxing Zhu, Jun Wang

In this paper, we propose a novel sampling method, the thermostat-assisted continuously-tempered Hamiltonian Monte Carlo, for the purpose of multimodal Bayesian learning. It simulates a noisy dynamical system by incorporating both a continuously-varying tempering variable and the Nos\'e-Hoover thermostats. A significant benefit is that it is not only able to efficiently generate i.i.d. samples when the underlying posterior distributions are multimodal, but also capable of adaptively neutralising the noise arising from the use of mini-batches. While the properties of the approach have been studied using synthetic datasets, our experiments on three real datasets have also shown its performance gains over several strong baselines for Bayesian learning with various types of neural networks plunged in.
May 28, 2018

Rui Luo, Yaodong Yang, Jianhong Wang, Zhanxing Zhu, Jun Wang

**Click to Read Paper and Get Code**

Individual Recognition in Schizophrenia using Deep Learning Methods with Random Forest and Voting Classifiers: Insights from Resting State EEG Streams

Jan 17, 2018

Lei Chu, Robert Qiu, Haichun Liu, Zenan Ling, Tianhong Zhang, Jijun Wang

Recently, there has been a growing interest in monitoring brain activity for individual recognition system. So far these works are mainly focussing on single channel data or fragment data collected by some advanced brain monitoring modalities. In this study we propose new individual recognition schemes based on spatio-temporal resting state Electroencephalography (EEG) data. Besides, instead of using features derived from artificially-designed procedures, modified deep learning architectures which aim to automatically extract an individual's unique features are developed to conduct classification. Our designed deep learning frameworks are proved of a small but consistent advantage of replacing the $softmax$ layer with Random Forest. Additionally, a voting layer is added at the top of designed neural networks in order to tackle the classification problem arisen from EEG streams. Lastly, various experiments are implemented to evaluate the performance of the designed deep learning architectures; Results indicate that the proposed EEG-based individual recognition scheme yields a high degree of classification accuracy: $81.6\%$ for characteristics in high risk (CHR) individuals, $96.7\%$ for clinically stable first episode patients with schizophrenia (FES) and $99.2\%$ for healthy controls (HC).
Jan 17, 2018

Lei Chu, Robert Qiu, Haichun Liu, Zenan Ling, Tianhong Zhang, Jijun Wang

**Click to Read Paper and Get Code**