Dynamic spectrum access (DSA) is regarded as an effective and efficient technology to share radio spectrum among different networks. As a secondary user (SU), a DSA device will face two critical problems: avoiding causing harmful interference to primary users (PUs), and conducting effective interference coordination with other secondary users. These two problems become even more challenging for a distributed DSA network where there is no centralized controllers for SUs. In this paper, we investigate communication strategies of a distributive DSA network under the presence of spectrum sensing errors. To be specific, we apply the powerful machine learning tool, deep reinforcement learning (DRL), for SUs to learn "appropriate" spectrum access strategies in a distributed fashion assuming NO knowledge of the underlying system statistics. Furthermore, a special type of recurrent neural network (RNN), called the reservoir computing (RC), is utilized to realize DRL by taking advantage of the underlying temporal correlation of the DSA network. Using the introduced machine learning-based strategy, SUs could make spectrum access decisions distributedly relying only on their own current and past spectrum sensing outcomes. Through extensive experiments, our results suggest that the RC-based spectrum access strategy can help the SU to significantly reduce the chances of collision with PUs and other SUs. We also show that our scheme outperforms the myopic method which assumes the knowledge of system statistics, and converges faster than the Q-learning method when the number of channels is large.

* This work is accepted in IEEE IoT Journal 2018
Click to Read Paper
Current studies about motor imagery based rehabilitation training systems for stroke subjects lack an appropriate analytic method, which can achieve a considerable classification accuracy, at the same time detects gradual changes of imagery patterns during rehabilitation process and disinters potential mechanisms about motor function recovery. In this study, we propose an adaptive boosting algorithm based on the cortex plasticity and spectral band shifts. This approach models the usually predetermined spatial-spectral configurations in EEG study into variable preconditions, and introduces a new heuristic of stochastic gradient boost for training base learners under these preconditions. We compare our proposed algorithm with commonly used methods on datasets collected from 2 months' clinical experiments. The simulation results demonstrate the effectiveness of the method in detecting the variations of stroke patients' EEG patterns. By chronologically reorganizing the weight parameters of the learned additive model, we verify the spatial compensatory mechanism on impaired cortex and detect the changes of accentuation bands in spectral domain, which may contribute important prior knowledge for rehabilitation practice.

* 10 pages,3 figures
Click to Read Paper
We advocate the use of implicit fields for learning generative models of shapes and introduce an implicit field decoder for shape generation, aimed at improving the visual quality of the generated shapes. An implicit field assigns a value to each point in 3D space, so that a shape can be extracted as an iso-surface. Our implicit field decoder is trained to perform this assignment by means of a binary classifier. Specifically, it takes a point coordinate, along with a feature vector encoding a shape, and outputs a value which indicates whether the point is outside the shape or not. By replacing conventional decoders by our decoder for representation learning and generative modeling of shapes, we demonstrate superior results for tasks such as shape autoencoding, generation, interpolation, and single-view 3D reconstruction, particularly in terms of visual quality. Code and supplementary material are available at https://github.com/czq142857/implicit-decoder.

* Code: https://github.com/czq142857/implicit-decoder Project page: https://www.sfu.ca/~zhiqinc/imgan/Readme.html
Click to Read Paper
The developments of deep neural networks (DNN) in recent years have ushered a brand new era of artificial intelligence. DNNs are proved to be excellent in solving very complex problems, e.g., visual recognition and text understanding, to the extent of competing with or even surpassing people. Despite inspiring and encouraging success of DNNs, thorough theoretical analyses still lack to unravel the mystery of their magics. The design of DNN structure is dominated by empirical results in terms of network depth, number of neurons and activations. A few of remarkable works published recently in an attempt to interpret DNNs have established the first glimpses of their internal mechanisms. Nevertheless, research on exploring how DNNs operate is still at the initial stage with plenty of room for refinement. In this paper, we extend precedent research on neural networks with piecewise linear activations (PLNN) concerning linear regions bounds. We present (i) the exact maximal number of linear regions for single layer PLNNs; (ii) a upper bound for multi-layer PLNNs; and (iii) a tighter upper bound for the maximal number of liner regions on rectifier networks. The derived bounds also indirectly explain why deep models are more powerful than shallow counterparts, and how non-linearity of activation functions impacts on expressiveness of networks.

* Counting linear regions of neural networks
Click to Read Paper
Semantic role theory is a widely used approach for event representation. Yet, there are multiple indications that semantic role paradigm is necessary but not sufficient to cover all elements of event structure. We conducted an analysis of semantic role representation for events to provide an empirical evidence of insufficiency. The consequence of that is a hybrid role-scalar approach. The results are considered as preliminary in investigation of semantic roles coverage for event representation.

* 5 pages, 1 table, 1 figure
Click to Read Paper
In most convolution neural networks (CNNs), downsampling hidden layers is adopted for increasing computation efficiency and the receptive field size. Such operation is commonly so-called pooling. Maximation and averaging over sliding windows (max/average pooling), and plain downsampling in the form of strided convolution are popular pooling methods. Since the pooling is a lossy procedure, a motivation of our work is to design a new pooling approach for less lossy in the dimensionality reduction. Inspired by the Fourier spectral pooling(FSP) proposed by Rippel et. al. [1], we present the Hartley transform based spectral pooling method in CNNs. Compared with FSP, the proposed spectral pooling avoids the use of complex arithmetic for frequency representation and reduces the computation. Spectral pooling preserves more structure features for network's discriminability than max and average pooling. We empirically show that Hartley spectral pooling gives rise to the convergence of training CNNs on MNIST and CIFAR-10 datasets.

* 5 pages, 6 figures, letter
Click to Read Paper
We present Semantic WordRank (SWR), an unsupervised method for generating an extractive summary of a single document. Built on a weighted word graph with semantic and co-occurrence edges, SWR scores sentences using an article-structure-biased PageRank algorithm with a Softplus function adjustment, and promotes topic diversity using spectral subtopic clustering under the Word-Movers-Distance metric. We evaluate SWR on the DUC-02 and SummBank datasets and show that SWR produces better summaries than the state-of-the-art algorithms over DUC-02 under common ROUGE measures. We then show that, under the same measures over SummBank, SWR outperforms each of the three human annotators (aka. judges) and compares favorably with the combined performance of all judges.

* 12 pages, accepted by IDEAL2018
Click to Read Paper
In this paper, we derive a temporal arbitrage policy for storage via reinforcement learning. Real-time price arbitrage is an important source of revenue for storage units, but designing good strategies have proven to be difficult because of the highly uncertain nature of the prices. Instead of current model predictive or dynamic programming approaches, we use reinforcement learning to design an optimal arbitrage policy. This policy is learned through repeated charge and discharge actions performed by the storage unit through updating a value matrix. We design a reward function that does not only reflect the instant profit of charge/discharge decisions but also incorporate the history information. Simulation results demonstrate that our designed reward function leads to significant performance improvement compared with existing algorithms.

* 2018 IEEE PES General Meeting (GM)
Click to Read Paper
Task selection (picking an appropriate labeling task) and worker selection (assigning the labeling task to a suitable worker) are two major challenges in task assignment for crowdsourcing. Recently, worker selection has been successfully addressed by the bandit-based task assignment (BBTA) method, while task selection has not been thoroughly investigated yet. In this paper, we experimentally compare several task selection strategies borrowed from active learning literature, and show that the least confidence strategy significantly improves the performance of task assignment in crowdsourcing.

* arXiv admin note: substantial text overlap with arXiv:1507.05800
Click to Read Paper
Visual and audio modalities are two symbiotic modalities underlying videos, which contain both common and complementary information. If they can be mined and fused sufficiently, performances of related video tasks can be significantly enhanced. However, due to the environmental interference or sensor fault, sometimes, only one modality exists while the other is abandoned or missing. By recovering the missing modality from the existing one based on the common information shared between them and the prior information of the specific modality, great bonus will be gained for various vision tasks. In this paper, we propose a Cross-Modal Cycle Generative Adversarial Network (CMCGAN) to handle cross-modal visual-audio mutual generation. Specifically, CMCGAN is composed of four kinds of subnetworks: audio-to-visual, visual-to-audio, audio-to-audio and visual-to-visual subnetworks respectively, which are organized in a cycle architecture. CMCGAN has several remarkable advantages. Firstly, CMCGAN unifies visual-audio mutual generation into a common framework by a joint corresponding adversarial loss. Secondly, through introducing a latent vector with Gaussian distribution, CMCGAN can handle dimension and structure asymmetry over visual and audio modalities effectively. Thirdly, CMCGAN can be trained end-to-end to achieve better convenience. Benefiting from CMCGAN, we develop a dynamic multimodal classification network to handle the modality missing problem. Abundant experiments have been conducted and validate that CMCGAN obtains the state-of-the-art cross-modal visual-audio generation results. Furthermore, it is shown that the generated modality achieves comparable effects with those of original modality, which demonstrates the effectiveness and advantages of our proposed method.

* Have some problems need to be handled
Click to Read Paper
Traditionally, the P3P problem is solved by firstly transforming its 3 quadratic equations into a quartic one, then by locating the roots of the resulting quartic equation and verifying whether a root does really correspond to a true solution of the P3P problem itself. However, a root of the quartic equation does not always correspond to a solution of the P3P problem. In this work, we show that when the optical center is outside of all the 6 toroids defined by the control point triangle, each positive root of the Grunert's quartic equation must correspond to a true solution of the P3P problem, and the corresponding P3P problem cannot have a unique solution, it must have either 2 positive solutions or 4 positive solutions. In addition, we show that when the optical center passes through any one of the 3 toroids among these 6 toroids ( except possibly for two concentric circles) , the number of the solutions of the corresponding P3P problem always changes by 1, either increased by 1 or decreased by 1.Furthermore we show that such changed solutions always locate in a small neighborhood of control points, hence the 3 toroids are critical surfaces of the P3P problem and the 3 control points are 3 singular points of solutions. A notable example is that when the optical center passes through the outer surface of the union of the 6 toroids from the outside to inside, the number of the solutions must always decrease by 1. Our results are the first to give an explicit and geometrically intuitive relationship between the P3P solutions and the roots of its quartic equation. It could act as some theoretical guidance for P3P practitioners to properly arrange their control points to avoid undesirable solutions.

Click to Read Paper
It is well known that the P3P problem could have 1, 2, 3 and at most 4 positive solutions under different configurations among its 3 control points and the position of the optical center. Since in any real applications, the knowledge on the exact number of possible solutions is a prerequisite for selecting the right one among all the possible solutions, the study on the phenomenon of multiple solutions in the P3P problem has been an active topic . In this work, we provide some new geometric interpretations on the multi-solution phenomenon in the P3P problem, our main results include: (1): The necessary and sufficient condition for the P3P problem to have a pair of side-sharing solutions is the two optical centers of the solutions both lie on one of the 3 vertical planes to the base plane of control points; (2): The necessary and sufficient condition for the P3P problem to have a pair of point-sharing solutions is the two optical centers of the solutions both lie on one of the 3 so-called skewed danger cylinders;(3): If the P3P problem has other solutions in addition to a pair of side-sharing ( point-sharing) solutions, these remaining solutions must be a point-sharing ( side-sharing ) pair. In a sense, the side-sharing pair and the point-sharing pair are companion pairs. In sum, our results provide some new insights into the nature of the multi-solution phenomenon in the P3P problem, in addition to their academic value, they could also be used as some theoretical guidance for practitioners in real applications to avoid occurrence of multiple solutions by properly arranging the control points.

Click to Read Paper
In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning. In ACE, we use actor ensemble (i.e., multiple actors) to search the global maxima of the critic. Besides the ensemble perspective, we also formulate ACE in the option framework by extending the option-critic architecture with deterministic intra-option policies, revealing a relationship between ensemble and options. Furthermore, we perform a look-ahead tree search with those actors and a learned value prediction model, resulting in a refined value estimation. We demonstrate a significant performance boost of ACE over DDPG and its variants in challenging physical robot simulators.

* AAAI 2019
Click to Read Paper
Semantic role theory considers roles as a small universal set of unanalyzed entities. It means that formally there are no restrictions on role combinations. We argue that the semantic roles co-occur in verb representations. It means that there are hidden restrictions on role combinations. To demonstrate that a practical and evidence-based approach has been built on in-depth analysis of the largest verb database VerbNet. The consequences of this approach are considered.

* 5 pages, 1 figure, 3 tables
Click to Read Paper
Recently, multiagent deep reinforcement learning (DRL) has received increasingly wide attention. Existing multiagent DRL algorithms are inefficient when facing with the non-stationarity due to agents update their policies simultaneously in stochastic cooperative environments. This paper extends the recently proposed weighted double estimator to the multiagent domain and propose a multiagent DRL framework, named weighted double deep Q-network (WDDQN). By utilizing the weighted double estimator and the deep neural network, WDDQN can not only reduce the bias effectively but also be extended to scenarios with raw visual inputs. To achieve efficient cooperation in the multiagent domain, we introduce the lenient reward network and the scheduled replay strategy. Experiments show that the WDDQN outperforms the existing DRL and multiaent DRL algorithms, i.e., double DQN and lenient Q-learning, in terms of the average reward and the convergence rate in stochastic cooperative environments.

* 8 pages, 7 figures
Click to Read Paper
In this paper, we propose a general framework for sparse and low-rank tensor estimation from cubic sketchings. A two-stage non-convex implementation is developed based on sparse tensor decomposition and thresholded gradient descent, which ensures exact recovery in the noiseless case and stable recovery in the noisy case with high probability. The non-asymptotic analysis sheds light on an interplay between optimization error and statistical error. The proposed procedure is shown to be rate-optimal under certain conditions. As a technical by-product, novel high-order concentration inequalities are derived for studying high-moment sub-Gaussian tensors. An interesting tensor formulation illustrates the potential application to high-order interaction pursuit in high-dimensional linear regression.

Click to Read Paper
Person re-identification (ReID) is an important task in computer vision. Recently, deep learning with a metric learning loss has become a common framework for ReID. In this paper, we also propose a new metric learning loss with hard sample mining called margin smaple mining loss (MSML) which can achieve better accuracy compared with other metric learning losses, such as triplet loss. In experi- ments, our proposed methods outperforms most of the state-of-the-art algorithms on Market1501, MARS, CUHK03 and CUHK-SYSU.

* 6 pages
Click to Read Paper
Supervised speech separation uses supervised learning algorithms to learn a mapping from an input noisy signal to an output target. With the fast development of deep learning, supervised separation has become the most important direction in speech separation area in recent years. For the supervised algorithm, training target has a significant impact on the performance. Ideal ratio mask is a commonly used training target, which can improve the speech intelligibility and quality of the separated speech. However, it does not take into account the correlation between noise and clean speech. In this paper, we use the optimal ratio mask as the training target of the deep neural network (DNN) for speech separation. The experiments are carried out under various noise environments and signal to noise ratio (SNR) conditions. The results show that the optimal ratio mask outperforms other training targets in general.

Click to Read Paper
Causation discovery without manipulation is considered a crucial problem to a variety of applications. The state-of-the-art solutions are applicable only when large numbers of samples are available or the problem domain is sufficiently small. Motivated by the observations of the local sparsity properties on causal structures, we propose a general Split-and-Merge framework, named SADA, to enhance the scalability of a wide class of causation discovery algorithms. In SADA, the variables are partitioned into subsets, by finding causal cut on the sparse causal structure over the variables. By running mainstream causation discovery algorithms as basic causal solvers on the subproblems, complete causal structure can be reconstructed by combining the partial results. SADA benefits from the recursive division technique, since each small subproblem generates more accurate result under the same number of samples. We theoretically prove that SADA always reduces the scales of problems without sacrifice on accuracy, under the condition of local causal sparsity and reliable conditional independence tests. We also present sufficient condition to accuracy enhancement by SADA, even when the conditional independence tests are vulnerable. Extensive experiments on both simulated and real-world datasets verify the improvements on scalability and accuracy by applying SADA together with existing causation discovery algorithms.

Click to Read Paper