Models, code, and papers for "Shuai Li":

Measure, Manifold, Learning, and Optimization: A Theory Of Neural Networks

Nov 30, 2018
Shuai Li

We present a formal measure-theoretical theory of neural networks (NN) built on probability coupling theory. Our main contributions are summarized as follows. * Built on the formalism of probability coupling theory, we derive an algorithm framework, named Hierarchical Measure Group and Approximate System (HMGAS), nicknamed S-System, that is designed to learn the complex hierarchical, statistical dependency in the physical world. * We show that NNs are special cases of S-System when the probability kernels assume certain exponential family distributions. Activation Functions are derived formally. We further endow geometry on NNs through information geometry, show that intermediate feature spaces of NNs are stochastic manifolds, and prove that "distance" between samples is contracted as layers stack up. * S-System shows NNs are inherently stochastic, and under a set of realistic boundedness and diversity conditions, it enables us to prove that for large size nonlinear deep NNs with a class of losses, including the hinge loss, all local minima are global minima with zero loss errors, and regions around the minima are flat basins where all eigenvalues of Hessians are concentrated around zero, using tools and ideas from mean field theory, random matrix theory, and nonlinear operator equations. * S-System, the information-geometry structure and the optimization behaviors combined completes the analog between Renormalization Group (RG) and NNs. It shows that a NN is a complex adaptive system that estimates the statistic dependency of microscopic object, e.g., pixels, in multiple scales. Unlike clear-cut physical quantity produced by RG in physics, e.g., temperature, NNs renormalize/recompose manifolds emerging through learning/optimization that divide the sample space into highly semantically meaningful groups that are dictated by supervised labels (in supervised NNs).


  Click for Model/Code and Paper
Akid: A Library for Neural Network Research and Production from a Dataism Approach

Jan 03, 2017
Shuai Li

Neural networks are a revolutionary but immature technique that is fast evolving and heavily relies on data. To benefit from the newest development and newly available data, we want the gap between research and production as small as possibly. On the other hand, differing from traditional machine learning models, neural network is not just yet another statistic model, but a model for the natural processing engine --- the brain. In this work, we describe a neural network library named {\texttt akid}. It provides higher level of abstraction for entities (abstracted as blocks) in nature upon the abstraction done on signals (abstracted as tensors) by Tensorflow, characterizing the dataism observation that all entities in nature processes input and emit out in some ways. It includes a full stack of software that provides abstraction to let researchers focus on research instead of implementation, while at the same time the developed program can also be put into production seamlessly in a distributed environment, and be production ready. At the top application stack, it provides out-of-box tools for neural network applications. Lower down, akid provides a programming paradigm that lets user easily build customized models. The distributed computing stack handles the concurrency and communication, thus letting models be trained or deployed to a single GPU, multiple GPUs, or a distributed environment without affecting how a model is specified in the programming paradigm stack. Lastly, the distributed deployment stack handles how the distributed computing is deployed, thus decoupling the research prototype environment with the actual production environment, and is able to dynamically allocate computing resources, so development (Devs) and operations (Ops) could be separated. Please refer to http://akid.readthedocs.io/en/latest/ for documentation.


  Click for Model/Code and Paper
Hyperbolic Recommender Systems

Sep 05, 2018
Tran Dang Quang Vinh, Yi Tay, Shuai Zhang, Gao Cong, Xiao-Li Li

Many well-established recommender systems are based on representation learning in Euclidean space. In these models, matching functions such as the Euclidean distance or inner product are typically used for computing similarity scores between user and item embeddings. This paper investigates the notion of learning user and item representations in Hyperbolic space. In this paper, we argue that Hyperbolic space is more suitable for learning user-item embeddings in the recommendation domain. Unlike Euclidean spaces, Hyperbolic spaces are intrinsically equipped to handle hierarchical structure, encouraged by its property of exponentially increasing distances away from origin. We propose HyperBPR (Hyperbolic Bayesian Personalized Ranking), a conceptually simple but highly effective model for the task at hand. Our proposed HyperBPR not only outperforms their Euclidean counterparts, but also achieves state-of-the-art performance on multiple benchmark datasets, demonstrating the effectiveness of personalized recommendation in Hyperbolic space.


  Click for Model/Code and Paper
Communications and Networking Technologies for Intelligent Drone Cruisers

Sep 25, 2019
Li-Chun Wang, Chuan-Chi Lai, Hong-Han Shuai, Hsin-Piao Lin, Chi-Yu Li, Teng-Hu Cheng, Chiun-Hsun Chen

Future mobile communication networks require an Aerial Base Station (ABS) with fast mobility and long-term hovering capabilities. At present, unmanned aerial vehicles (UAV) or drones do not have long flight times and are mainly used for monitoring, surveillance, and image post-processing. On the other hand, the traditional airship is too large and not easy to take off and land. Therefore, we propose to develop an "Artificial Intelligence (AI) Drone-Cruiser" base station that can help 5G mobile communication systems and beyond quickly recover the network after a disaster and handle the instant communications by the flash crowd. The drone-cruiser base station can overcome the communications problem for three types of flash crowds, such as in stadiums, parades, and large plaza so that an appropriate number of aerial base stations can be accurately deployed to meet large and dynamic traffic demands. Artificial intelligence can solve these problems by analyzing the collected data, and then adjust the system parameters in the framework of Self-Organizing Network (SON) to achieve the goals of self-configuration, self-optimization, and self-healing. With the help of AI technologies, 5G networks can become more intelligent. This paper aims to provide a new type of service, On-Demand Aerial Base Station as a Service. This work needs to overcome the following five technical challenges: innovative design of drone-cruisers for the long-time hovering, crowd estimation and prediction, rapid 3D wireless channel learning and modeling, 3D placement of aerial base stations and the integration of WiFi front-haul and millimeter wave/WiGig back-haul networks.

* 6 pages, 11 figures, accepted by 2019 IEEE Globecom Workshops (GC Wkshps): IEEE GLOBECOM 2019 Workshop on Space-Ground Integrated Networks 

  Click for Model/Code and Paper
Learning to Synthesize: Robust Phase Retrieval at Low Photon counts

Jul 26, 2019
Mo Deng, Shuai Li, Alexandre Goy, Iksung Kang, George Barbastathis

The quality of inverse problem solutions obtained through deep learning [Barbastathis et al, 2019] is limited by the nature of the priors learned from examples presented during the training phase. In the case of quantitative phase retrieval [Sinha et al, 2017, Goy et al, 2019], in particular, spatial frequencies that are underrepresented in the training database, most often at the high band, tend to be suppressed in the reconstruction. Ad hoc solutions have been proposed, such as pre-amplifying the high spatial frequencies in the examples [Li et al, 2018]; however, while that strategy improves resolution, it also leads to high-frequency artifacts as well as low-frequency distortions in the reconstructions. Here, we present a new approach that learns separately how to handle the two frequency bands, low and high; and also learns how to synthesize these two bands into the full-band reconstructions. We show that this "learning to synthesize" (LS) method yields phase reconstructions of high spatial resolution and artifact-free; and it is also resilient to high-noise conditions, e.g. in the case of very low photon flux. In addition to the problem of quantitative phase retrieval, the LS method is applicable, in principle, to any inverse problem where the forward operator treats different frequency bands unevenly, i.e. is ill-posed.


  Click for Model/Code and Paper
Online Clustering of Contextual Cascading Bandits

Nov 23, 2017
Shuai Li, Shengyu Zhang

We consider a new setting of online clustering of contextual cascading bandits, an online learning problem where the underlying cluster structure over users is unknown and needs to be learned from a random prefix feedback. More precisely, a learning agent recommends an ordered list of items to a user, who checks the list and stops at the first satisfactory item, if any. We propose an algorithm of CLUB-cascade for this setting and prove an $n$-step regret bound of order $\tilde{O}(\sqrt{n})$. Previous work corresponds to the degenerate case of only one cluster, and our general regret bound in this special case also significantly improves theirs. We conduct experiments on both synthetic and real data, and demonstrate the effectiveness of our algorithm and the advantage of incorporating online clustering method.


  Click for Model/Code and Paper
Beetle Antennae Search without Parameter Tuning (BAS-WPT) for Multi-objective Optimization

Nov 07, 2017
Xiangyuan Jiang, Shuai Li

Beetle antennae search (BAS) is an efficient meta-heuristic algorithm inspired by foraging behaviors of beetles. This algorithm includes several parameters for tuning and the existing results are limited to solve single objective optimization. This work pushes forward the research on BAS by providing one variant that releases the tuning parameters and is able to handle multi-objective optimization. This new approach applies normalization to simplify the original algorithm and uses a penalty function to exploit infeasible solutions with low constraint violation to solve the constraint optimization problem. Extensive experimental studies are carried out and the results reveal efficacy of the proposed approach to constraint handling.


  Click for Model/Code and Paper
BAS: Beetle Antennae Search Algorithm for Optimization Problems

Oct 30, 2017
Xiangyuan Jiang, Shuai Li

Meta-heuristic algorithms have become very popular because of powerful performance on the optimization problem. A new algorithm called beetle antennae search algorithm (BAS) is proposed in the paper inspired by the searching behavior of longhorn beetles. The BAS algorithm imitates the function of antennae and the random walking mechanism of beetles in nature, and then two main steps of detecting and searching are implemented. Finally, the algorithm is benchmarked on 2 well-known test functions, in which the numerical results validate the efficacy of the proposed BAS algorithm.

* 3 pages, 3 figures 

  Click for Model/Code and Paper
PSA: A novel optimization algorithm based on survival rules of porcellio scaber

Sep 28, 2017
Yinyan Zhang, Shuai Li

Bio-inspired algorithms have received a significant amount of attention in both academic and engineering societies. In this paper, based on the observation of two major survival rules of a species of woodlice, i.e., porcellio scaber, we design and propose an algorithm called the porcellio scaber algorithm (PSA) for solving optimization problems, including differentiable and non-differential ones as well as the case with local optimums. Numerical results based on benchmark problems are presented to validate the efficacy of PSA.

* 3 pages, 4 figures 

  Click for Model/Code and Paper
Context-Aware Bandits

Feb 26, 2017
Shuai Li, Purushottam Kar

We propose an efficient Context-Aware clustering of Bandits (CAB) algorithm, which can capture collaborative effects. CAB can be easily deployed in a real-world recommendation system, where multi-armed bandits have been shown to perform well in particular with respect to the cold-start problem. CAB utilizes a context-aware clustering augmented by exploration-exploitation strategies. CAB dynamically clusters the users based on the content universe under consideration. We give a theoretical analysis in the standard stochastic multi-armed bandits setting. We show the efficiency of our approach on production and real-world datasets, demonstrate the scalability, and, more importantly, the significant increased prediction performance against several state-of-the-art methods.

* The paper has been withdrawn as the work has been superseded 

  Click for Model/Code and Paper
Adaptive fractional order graph neural network

Jan 05, 2020
Zijian Liu, Chunbo Luo, Shuai Li

This paper proposes adaptive fractional order graph neural network (AFGNN), optimized by a time-varying fractional order gradient descent method to address the challenges of local optimum of classic and fractional GNNs which are specialised at aggregating information from the feature and adjacent matrices of connected nodes and their neighbours to solve learning tasks on non-Euclidean data such as graphs. To overcome the high computational complexity of fractional order derivations, the proposed model approximately calculates the fractional order gradients. We further prove such approximation is feasible and the AFGNN is unbiased. Extensive experiments on benchmark citation networks and object recognition challenges confirm the performance of AFGNN. The first group of experiments show that the results of AFGNN outperform the steepest gradient based method and conventional GNNs on the citation networks. The second group of experiments demonstrate that AFGNN excels at image recognition tasks where the images have a significant amount of missing pixels and expresses improved accuracy than GNNs.


  Click for Model/Code and Paper
Contextual Combinatorial Conservative Bandits

Nov 26, 2019
Xiaojin Zhang, Shuai Li, Weiwen Liu

The problem of multi-armed bandits (MAB) asks to make sequential decisions while balancing between exploitation and exploration, and have been successfully applied to a wide range of practical scenarios. Various algorithms have been designed to achieve a high reward in a long term. However, its short-term performance might be rather low, which is injurious in risk sensitive applications. Building on previous work of conservative bandits, we bring up a framework of contextual combinatorial conservative bandits. An algorithm is presented and a regret bound of $\tilde O(d^2+d\sqrt{T})$ is proven, where $d$ is the dimension of the feature vectors, and $T$ is the total number of time steps. We further provide an algorithm as well as regret analysis for the case when the conservative reward is unknown. Experiments are conducted, and the results validate the effectiveness of our algorithm.


  Click for Model/Code and Paper
Convergence analysis of beetle antennae search algorithm and its applications

Apr 04, 2019
Yinyan Zhang, Shuai Li, Bin Xu

The beetle antennae search algorithm was recently proposed and investigated for solving global optimization problems. Although the performance of the algorithm and its variants were shown to be better than some existing meta-heuristic algorithms, there is still a lack of convergence analysis. In this paper, we provide theoretical analysis on the convergence of the beetle antennae search algorithm. We test the performance of the BAS algorithm via some representative benchmark functions. Meanwhile, some applications of the BAS algorithm are also presented.

* no 

  Click for Model/Code and Paper
Learning to synthesize: splitting and recombining low and high spatial frequencies for image recovery

Nov 19, 2018
Mo Deng, Shuai Li, George Barbastathis

Deep Neural Network (DNN)-based image reconstruction, despite many successes, often exhibits uneven fidelity between high and low spatial frequency bands. In this paper we propose the Learning Synthesis by DNN (LS-DNN) approach where two DNNs process the low and high spatial frequencies, respectively, and, improving over [30], the two DNNs are trained separately and a third DNN combines them into an image with high fidelity at all bands. We demonstrate LS-DNN in two canonical inverse problems: super-resolution (SR) in diffraction-limited imaging (DLI), and quantitative phase retrieval (QPR). Our results also show comparable or improved performance over perceptual-loss based SR [21], and can be generalized to a wider range of image recovery problems.

* 10 pages, 10 figures. Supplement file can be provided upon reasonable request 

  Click for Model/Code and Paper
Porcellio scaber algorithm (PSA) for solving constrained optimization problems

Oct 11, 2017
Yinyan Zhang, Shuai Li, Hongliang Guo

In this paper, we extend a bio-inspired algorithm called the porcellio scaber algorithm (PSA) to solve constrained optimization problems, including a constrained mixed discrete-continuous nonlinear optimization problem. Our extensive experiment results based on benchmark optimization problems show that the PSA has a better performance than many existing methods or algorithms. The results indicate that the PSA is a promising algorithm for constrained optimization.

* 6 pages, 1 figure 

  Click for Model/Code and Paper
Automatic Discoveries of Physical and Semantic Concepts via Association Priors of Neuron Groups

Jan 12, 2017
Shuai Li, Kui Jia, Xiaogang Wang

The recent successful deep neural networks are largely trained in a supervised manner. It {\it associates} complex patterns of input samples with neurons in the last layer, which form representations of {\it concepts}. In spite of their successes, the properties of complex patterns associated a learned concept remain elusive. In this work, by analyzing how neurons are associated with concepts in supervised networks, we hypothesize that with proper priors to regulate learning, neural networks can automatically associate neurons in the intermediate layers with concepts that are aligned with real world concepts, when trained only with labels that associate concepts with top level neurons, which is a plausible way for unsupervised learning. We develop a prior to verify the hypothesis and experimentally find the proposed prior help neural networks automatically learn both basic physical concepts at the lower layers, e.g., rotation of filters, and highly semantic concepts at the higher layers, e.g., fine-grained categories of an entry-level category.


  Click for Model/Code and Paper
Distributed Clustering of Linear Bandits in Peer to Peer Networks

Jun 07, 2016
Nathan Korda, Balazs Szorenyi, Shuai Li

We provide two distributed confidence ball algorithms for solving linear bandit problems in peer to peer networks with limited communication capabilities. For the first, we assume that all the peers are solving the same linear bandit problem, and prove that our algorithm achieves the optimal asymptotic regret rate of any centralised algorithm that can instantly communicate information between the peers. For the second, we assume that there are clusters of peers solving the same bandit problem within each cluster, and we prove that our algorithm discovers these clusters, while achieving the optimal asymptotic regret rate within each one. Through experiments on several real-world datasets, we demonstrate the performance of proposed algorithms compared to the state-of-the-art.

* The 33rd ICML, 2016 

  Click for Model/Code and Paper
Collaborative Filtering Bandits

May 31, 2016
Shuai Li, Alexandros Karatzoglou, Claudio Gentile

Classical collaborative filtering, and content-based filtering methods try to learn a static recommendation model given training data. These approaches are far from ideal in highly dynamic recommendation domains such as news recommendation and computational advertisement, where the set of items and users is very fluid. In this work, we investigate an adaptive clustering technique for content recommendation based on exploration-exploitation strategies in contextual multi-armed bandit settings. Our algorithm takes into account the collaborative effects that arise due to the interaction of the users with the items, by dynamically grouping users based on the items under consideration and, at the same time, grouping items based on the similarity of the clusterings induced over the users. The resulting algorithm thus takes advantage of preference patterns in the data in a way akin to collaborative filtering methods. We provide an empirical analysis on medium-size real-world datasets, showing scalability and increased prediction performance (as measured by click-through rate) over state-of-the-art methods for clustering bandits. We also provide a regret analysis within a standard linear stochastic noise setting.

* The 39th SIGIR (SIGIR 2016) 

  Click for Model/Code and Paper
Graph Clustering Bandits for Recommendation

May 02, 2016
Shuai Li, Claudio Gentile, Alexandros Karatzoglou

We investigate an efficient context-dependent clustering technique for recommender systems based on exploration-exploitation strategies through multi-armed bandits over multiple users. Our algorithm dynamically groups users based on their observed behavioral similarity during a sequence of logged activities. In doing so, the algorithm reacts to the currently served user by shaping clusters around him/her but, at the same time, it explores the generation of clusters over users which are not currently engaged. We motivate the effectiveness of this clustering policy, and provide an extensive empirical analysis on real-world datasets, showing scalability and improved prediction performance over state-of-the-art methods for sequential clustering of users in multi-armed bandit scenarios.


  Click for Model/Code and Paper
Online Clustering of Bandits

Jun 06, 2014
Claudio Gentile, Shuai Li, Giovanni Zappella

We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation ("bandit") strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effectiveness on a number of artificial and real-world datasets. Our experiments show a significant increase in prediction performance over state-of-the-art methods for bandit problems.

* In E. Xing and T. Jebara (Eds.), Proceedings of 31st International Conference on Machine Learning, Journal of Machine Learning Research Workshop and Conference Proceedings, Vol.32 (JMLR W&CP-32), Beijing, China, Jun. 21-26, 2014 (ICML 2014), Submitted by Shuai Li (https://sites.google.com/site/shuailidotsli) 

  Click for Model/Code and Paper