Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples in an easy and effective way.

* Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)
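As a rough illustration of the difference between the empirical value and the posterior mean discussed in this abstract, the sketch below computes both for a small contingency table. It assumes the closed form usually quoted for the posterior mean under a Dirichlet prior, E[I] = (1/n) Σ_ij n_ij [ψ(n_ij+1) − ψ(n_i+ +1) − ψ(n_+j +1) + ψ(n+1)], where n_ij are counts including the prior and ψ is the digamma function. The function names, the uniform prior count of 1 per cell, and the hand-written `digamma` are illustrative choices, not taken from the paper.

```python
import math

def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:              # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def empirical_mi(counts):
    """Plug-in (descriptive) mutual information in nats."""
    n = sum(sum(row) for row in counts)
    rows = [sum(row) for row in counts]
    cols = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, nij in enumerate(row):
            if nij > 0:
                mi += (nij / n) * math.log(nij * n / (rows[i] * cols[j]))
    return mi

def posterior_mean_mi(counts, prior=1.0):
    """Posterior mean of mutual information under a Dirichlet prior.

    `prior` is a hypothetical per-cell prior count (assumption: uniform).
    """
    post = [[nij + prior for nij in row] for row in counts]
    n = sum(sum(row) for row in post)
    rows = [sum(row) for row in post]
    cols = [sum(col) for col in zip(*post)]
    total = 0.0
    for i, row in enumerate(post):
        for j, nij in enumerate(row):
            total += nij * (digamma(nij + 1) - digamma(rows[i] + 1)
                            - digamma(cols[j] + 1) + digamma(n + 1))
    return total / n

table = [[5, 0], [0, 5]]        # made-up sample of size 10
print(empirical_mi(table), posterior_mean_mi(table))
```

On this tiny table the empirical value equals ln 2, while the posterior mean is noticeably smaller, which is the kind of reliability gap the abstract refers to.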
This paper is concerned with the reliable inference of optimal tree-approximations to the dependency structure of an unknown distribution generating data. The traditional approach to the problem measures the dependency strength between random variables by the index called mutual information. In this paper reliability is achieved by Walley's imprecise Dirichlet model, which generalizes Bayesian learning with Dirichlet priors. Adopting the imprecise Dirichlet model results in posterior interval expectation for mutual information, and in a set of plausible trees consistent with the data. Reliable inference about the actual tree is achieved by focusing on the substructure common to all the plausible trees. We develop an exact algorithm that infers the substructure in time O(m^4), m being the number of random variables. The new algorithm is applied to a set of data sampled from a known distribution. The method is shown to reliably infer edges of the actual tree even when the data are very scarce, unlike the traditional approach. Finally, we provide lower and upper credibility limits for mutual information under the imprecise Dirichlet model. These enable the previous developments to be extended to a full inferential method for trees.

* Annals of Mathematics and Artificial Intelligence, 45 (2005) 215-239
* 26 pages, 7 figures
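The traditional approach mentioned in this abstract is commonly implemented as a maximum-weight spanning tree over pairwise mutual information values (the Chow-Liu construction). The sketch below shows that baseline only, using Kruskal's algorithm with a union-find structure; it does not implement the imprecise Dirichlet model or the O(m^4) algorithm of the paper. The function name and the example weights are made up for illustration.

```python
def max_spanning_tree(num_vars, weights):
    """Return the edges of a maximum-weight spanning tree.

    weights: dict mapping (a, b) variable-index pairs to a dependency
    strength such as mutual information (hypothetical input format).
    """
    parent = list(range(num_vars))

    def find(v):
        # Find the component root, compressing the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    # Consider the strongest dependencies first.
    for (a, b), _w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ra, rb = find(a), find(b)
        if ra != rb:            # adding the edge does not create a cycle
            parent[ra] = rb
            tree.append((a, b))
    return tree

# Made-up pairwise mutual information values for four variables.
mi = {(0, 1): 0.9, (0, 2): 0.1, (1, 2): 0.8,
      (2, 3): 0.5, (0, 3): 0.2, (1, 3): 0.3}
print(max_spanning_tree(4, mi))
```

With point estimates of mutual information this yields a single tree; with interval-valued estimates, as in the abstract, several trees can be consistent with the data, which is why the paper focuses on the edges common to all of them.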
Mutual information is widely used, in a descriptive way, to measure the stochastic dependence of categorical random variables. In order to address questions such as the reliability of the descriptive value, one must consider sample-to-population inferential approaches. This paper deals with the posterior distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean, and analytical approximations for the variance, skewness and kurtosis are derived. These approximations have a guaranteed accuracy level of the order O(1/n^3), where n is the sample size. Leading order approximations for the mean and the variance are derived in the case of incomplete samples. The derived analytical expressions allow the distribution of mutual information to be approximated reliably and quickly. In fact, the derived expressions can be computed with the same order of complexity needed for descriptive mutual information. This makes the distribution of mutual information become a concrete alternative to descriptive mutual information in many applications which would benefit from moving to the inductive side. Some of these prospective applications are discussed, and one of them, namely feature selection, is shown to perform significantly better when inductive mutual information is used.

* Computational Statistics & Data Analysis, Vol.48, No.3, March 2005, pages 633--657
* 26 pages, LaTeX, 5 figures, 4 tables
Given the joint chances of a pair of random variables one can compute quantities of interest, like the mutual information. The Bayesian treatment of unknown chances involves computing, from a second order prior distribution and the data likelihood, a posterior distribution of the chances. A common treatment of incomplete data is to assume ignorability and determine the chances by the expectation maximization (EM) algorithm. The two different methods above are well established but typically separated. This paper joins the two approaches in the case of Dirichlet priors, and derives efficient approximations for the mean, mode and the (co)variance of the chances and the mutual information. Furthermore, we prove the unimodality of the posterior distribution, whence the important property of convergence of EM to the global maximum in the chosen framework. These results are applied to the problem of selecting features for incremental learning and naive Bayes classification. A fast filter based on the distribution of mutual information is shown to outperform the traditional filter based on empirical mutual information on a number of incomplete real data sets.

* Proceedings of the 26th German Conference on Artificial Intelligence (KI-2003) 396-406
* 11 pages, 1 figure
The ability to recover from a fall is an essential feature for a legged robot to navigate in challenging environments robustly. Until today, there has been very little progress on this topic. Current solutions mostly build upon (heuristically) predefined trajectories, resulting in unnatural behaviors and requiring considerable effort in engineering system-specific components. In this paper, we present an approach based on model-free Deep Reinforcement Learning (RL) to control recovery maneuvers of quadrupedal robots using a hierarchical behavior-based controller. The controller consists of four neural network policies including three behaviors and one behavior selector to coordinate them. Each of them is trained individually in simulation and deployed directly on a real system. We experimentally validate our approach on the quadrupedal robot ANYmal, which is a dog-sized quadrupedal system with 12 degrees of freedom. With our method, ANYmal manifests dynamic and reactive recovery behaviors to recover from an arbitrary fall configuration within less than 5 seconds. We tested the recovery maneuver more than 100 times, and the success rate was higher than 97 %.

This paper presents the design and experimental evaluation of an articulated robotic limb called Capler-Leg. The key element of Capler-Leg is its single-stage cable-pulley transmission combined with a high-gap radius motor. Our cable-pulley system is designed to be as lightweight as possible and to additionally serve as the primary cooling element, thus significantly increasing the power density and efficiency of the overall system. The total weight of the active elements on the leg, i.e. the stators and the rotors, contributes more than 60% of the total leg weight, which is an order of magnitude higher than in most existing robots. The resulting robotic leg has low inertia, high torque transparency, low manufacturing cost, no backlash, and a low number of parts. The Capler-Leg system itself serves as an experimental setup for evaluating the proposed cable-pulley design in terms of robustness and efficiency. A continuous jump experiment shows a remarkable 96.5 % recuperation rate, measured at the battery output. This means that almost all the mechanical energy output used during push-off is returned to the battery during touch-down.

In this paper, we present a method to control a quadrotor with a neural network trained using reinforcement learning techniques. With reinforcement learning, a common network can be trained to directly map state to actuator command making any predefined control structure obsolete for training. Moreover, we present a new learning algorithm which differs from the existing ones in certain aspects. Our algorithm is conservative but stable for complicated tasks. We found that it is more applicable to controlling a quadrotor than existing algorithms. We demonstrate the performance of the trained policy both in simulation and with a real quadrotor. Experiments show that our policy network can react to step response relatively accurately. With the same policy, we also demonstrate that we can stabilize the quadrotor in the air even under very harsh initialization (manually throwing it upside-down in the air with an initial velocity of 5 m/s). Computation time of evaluating the policy is only 7 µs per time step, which is two orders of magnitude less than common trajectory optimization algorithms with an approximated model.

In this paper, we consider the coherent theory of (epistemic) uncertainty of Walley, in which beliefs are represented through sets of probability distributions, and we focus on the problem of modeling prior ignorance about a categorical random variable. In this setting, it is a known result that a state of prior ignorance is not compatible with learning. To overcome this problem, another state of beliefs, called near-ignorance, has been proposed. Near-ignorance resembles ignorance very closely, by satisfying some principles that can arguably be regarded as necessary in a state of ignorance, and allows learning to take place. This paper provides new and substantial evidence that near-ignorance, too, cannot really be regarded as a way out of the problem of starting statistical inference in conditions of very weak beliefs. The key to this result is focusing on a setting characterized by a variable of interest that is latent. We argue that such a setting is by far the most common case in practice, and we provide, for the case of categorical latent variables (and general manifest variables), a condition that, if satisfied, prevents learning from taking place under prior near-ignorance. This condition is shown to be easily satisfied even in the most common statistical problems. We regard these results as a strong form of evidence against the possibility of adopting a condition of prior near-ignorance in real statistical problems.

* International Journal of Approximate Reasoning, 50:4 (2009) pages 597-611
* 27 LaTeX pages
Autonomous mobile manipulation is at the cutting edge of modern robotic technology, offering the dual advantage of mobility provided by a mobile platform and dexterity afforded by the manipulator. A common approach for controlling these systems is based on task space control. In a nutshell, a task space controller defines a map from user-defined end-effector references to the actuation commands based on an optimization problem over the distance between the reference trajectories and the physically consistent motions. The optimization, however, ignores the effect of the current decision on the future error, which limits the applicability of the approach for dynamically stable platforms. In contrast, the Model Predictive Control (MPC) approach offers the capability of foreseeing the future and making a trade-off between the current and future tracking errors. Here, we transcribe the task at the end-effector space, which makes the task description more natural for the user. Furthermore, we show how the MPC-based controller skillfully incorporates the reference forces at the end-effector in the control problem. To this end, we showcase here the advantages of using this MPC approach for controlling a ball-balancing mobile manipulator, Rezero. We validate our controller on the hardware for tasks such as end-effector pose tracking and door opening.

Locomotion planning for legged systems requires reasoning about suitable contact schedules. The contact sequence and timings constitute a hybrid dynamical system and prescribe a subset of achievable motions. State-of-the-art approaches cast motion planning as an optimal control problem. In order to decrease computational complexity, one common strategy separates footstep planning from motion optimization and plans contacts using heuristics. In this paper, we propose to learn contact schedule selection from high-level task descriptors using Bayesian optimization. A bi-level optimization is defined in which a Gaussian process model predicts the performance of trajectories generated by a motion planning nonlinear program. The agent, therefore, retains the ability to reason about suitable contact schedules, while explicit computation of the corresponding gradients is avoided. We delineate the algorithm in its general form and provide results for planning single-legged hopping. Our method is capable of learning contact schedule transitions that align with human intuition. It performs competitively against a heuristic baseline in predicting task appropriate contact schedules.

* Accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA) 2019
Transferring solutions found by trajectory optimization to robotic hardware remains a challenging task. When the optimization fully exploits the provided model to perform dynamic tasks, the presence of unmodeled dynamics renders the motion infeasible on the real system. Model errors can be a result of model simplifications, but also naturally arise when deploying the robot in unstructured and nondeterministic environments. Predominantly, compliant contacts and actuator dynamics lead to bandwidth limitations. While classical control methods provide tools to synthesize controllers that are robust to a class of model errors, such a notion is missing in modern trajectory optimization, which is solved in the time domain. We propose frequency-shaped cost functions to achieve robust solutions in the context of optimal control for legged robots. Through simulation and hardware experiments we show that motion plans can be made compatible with bandwidth limits set by actuators and contact dynamics. The smoothness of the model predictive solutions can be continuously tuned without compromising the feasibility of the problem. Experiments with the quadrupedal robot ANYmal, which is driven by highly-compliant series elastic actuators, showed significantly improved tracking performance of the planned motion, torque, and force trajectories and enabled the machine to walk robustly on terrain with unmodeled compliance.

* IEEE Robotics and Automation Letters 2019
Legged robots have the ability to adapt their walking posture to navigate confined spaces due to their high degrees of freedom. However, this has not been exploited in most common multilegged platforms. This paper presents a deformable bounding box abstraction of the robot model, with accompanying mapping and planning strategies, that enable a legged robot to autonomously change its body shape to navigate confined spaces. The mapping is achieved using robot-centric multi-elevation maps generated with distance sensors carried by the robot. The path planning is based on the trajectory optimisation algorithm CHOMP which creates smooth trajectories while avoiding obstacles. The proposed method has been tested in simulation and implemented on the hexapod robot Weaver, which is 33cm tall and 82cm wide when walking normally. We demonstrate navigating under 25cm overhanging obstacles, through 70cm wide gaps and over 22cm high obstacles in both artificial testing spaces and realistic environments, including a subterranean mining tunnel.

* IEEE RA-L/ICRA2019
Legged robots pose one of the greatest challenges in robotics. Dynamic and agile maneuvers of animals cannot be imitated by existing methods that are crafted by humans. A compelling alternative is reinforcement learning, which requires minimal craftsmanship and promotes the natural evolution of a control policy. However, so far, reinforcement learning research for legged robots is mainly limited to simulation, and only few and comparably simple examples have been deployed on real systems. The primary reason is that training with real robots, particularly with dynamically balancing systems, is complicated and expensive. In the present work, we introduce a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes. The approach is applied to the ANYmal robot, a sophisticated medium-dog-sized quadrupedal system. Using policies trained in simulation, the quadrupedal machine achieves locomotion skills that go beyond what had been achieved with prior methods: ANYmal is capable of precisely and energy-efficiently following high-level body velocity commands, running faster than before, and recovering from falling even in complex configurations.

* Science Robotics 4.26 (2019): eaau5872
In this work we present a whole-body Nonlinear Model Predictive Control approach for Rigid Body Systems subject to contacts. We use a full dynamic system model which also includes explicit contact dynamics. Therefore, contact locations, sequences and timings are not prespecified but optimized by the solver. Yet, thorough numerical and software engineering allows for running the nonlinear Optimal Control solver at rates up to 190 Hz on a quadruped for a time horizon of half a second. This outperforms the state of the art by at least one order of magnitude. Hardware experiments in the form of periodic and non-periodic tasks are carried out on two quadrupeds with different actuation systems. The obtained results underline the performance, transferability and robustness of the approach.

* Submitted to "Robotics and Automation: Letters" / "International Conference on Robotics and Automation 2018"
We show dynamic locomotion strategies for wheeled quadrupedal robots, which combine the advantages of both walking and driving. The developed optimization framework tightly integrates the additional degrees of freedom introduced by the wheels. Our approach relies on a zero-moment point based motion optimization which continuously updates reference trajectories. The reference motions are tracked by a hierarchical whole-body controller which computes optimal generalized accelerations and contact forces by solving a sequence of prioritized tasks including the nonholonomic rolling constraints. Our approach has been tested on ANYmal, a quadrupedal robot that is fully torque-controlled including the non-steerable wheels attached to its legs. We conducted experiments on flat and inclined terrains as well as over steps, whereby we show that integrating the wheels into the motion control and planning framework results in intuitive motion trajectories, which enable more robust and dynamic locomotion compared to other wheeled-legged robots. Moreover, with a speed of 4 m/s and a reduction of the cost of transport by 83 % we prove the superiority of wheeled-legged robots compared to their legged counterparts.

* IEEE Robotics and Automation Letters 2019
The proper handling of 3D orientations is a central element in many optimization problems in engineering. Unfortunately many researchers and engineers struggle with the formulation of such problems and often fall back to suboptimal solutions. The existence of many different conventions further complicates this issue, especially when interfacing multiple differing implementations. This document discusses an alternative approach which makes use of a more abstract notion of 3D orientations. The relative orientation between two coordinate systems is primarily identified by the coordinate mapping it induces. This is combined with the standard exponential map in order to introduce representation-independent and minimal differentials, which are very convenient in optimization based methods.
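To make the role of the exponential map more concrete, the sketch below maps a minimal 3-vector (a rotation vector) to a rotation matrix using Rodrigues' formula and applies it to a point. It is only one possible concrete representation; the abstract itself argues for a representation-independent view, and the function names and the small-angle threshold here are illustrative assumptions.

```python
import math

def exp_map(phi):
    """Map a rotation vector phi = (x, y, z) to a 3x3 rotation matrix."""
    x, y, z = phi
    theta2 = x * x + y * y + z * z
    theta = math.sqrt(theta2)
    if theta < 1e-6:
        # Series expansions avoid division by a near-zero angle.
        a = 1.0 - theta2 / 6.0
        b = 0.5 - theta2 / 24.0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta2
    # Skew-symmetric matrix K with K v = phi x v.
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    # Rodrigues' formula: R = I + a K + b K^2.
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

def rotate(R, v):
    """Apply the rotation matrix R to the vector v."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

# A quarter turn about the z-axis maps the x-axis onto the y-axis.
R = exp_map((0.0, 0.0, math.pi / 2))
print(rotate(R, [1.0, 0.0, 0.0]))
```

In an optimization loop, an update step would be expressed as a small 3-vector and composed with the current orientation through `exp_map`, which keeps the differential minimal (three parameters) regardless of how the orientation itself is stored.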
