Research papers and code for "Ram Vasudevan":
In applications such as autonomous driving, it is important to understand, infer, and anticipate the intention and future behavior of pedestrians. This ability allows vehicles to avoid collisions and improve ride safety and quality. This paper proposes a biomechanically inspired recurrent neural network (Bio-LSTM) that can predict the location and 3D articulated body pose of pedestrians in a global coordinate frame, given inaccurate 3D poses and locations estimated in prior frames. The proposed network is able to predict poses and global locations for multiple pedestrians simultaneously, for pedestrians up to 45 meters from the cameras (urban intersection scale). The outputs of the proposed network are full-body 3D meshes represented in Skinned Multi-Person Linear (SMPL) model parameters. The proposed approach relies on a novel objective function that incorporates the periodicity of human walking (gait), the mirror symmetry of the human body, and the change of ground reaction forces in a human gait cycle. This paper presents prediction results on the PedX dataset, a large-scale, in-the-wild dataset collected at real urban intersections with heavy pedestrian traffic. Results show that the proposed network can successfully learn the characteristics of pedestrian gait and produce accurate and consistent 3D pose predictions.

* IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1501-1508, April 2019
Soft robots are challenging to model due to their nonlinear behavior. However, their soft bodies make it possible to safely observe their behavior under random control inputs, making them amenable to large-scale data collection and system identification. This paper implements and evaluates a system identification method based on Koopman operator theory. This theory offers a way to represent a nonlinear system as a linear system in the infinite-dimensional space of real-valued functions called observables, enabling models of nonlinear systems to be constructed via linear regression of observed data. The approach does not suffer from some of the shortcomings of other nonlinear system identification methods, which typically require the manual tuning of training parameters and have limited convergence guarantees. A dynamic model of a pneumatic soft robot arm is constructed via this method, and used to predict the behavior of the real system. The total normalized-root-mean-square error (NRMSE) of its predictions over twelve validation trials is lower than that of several other identified models including a neural network, NLARX, nonlinear Hammerstein-Wiener, and linear state space model.
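
The idea in the abstract above, building a linear model in a space of observables by linear regression on data, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy two-state system, its parameters `A`, `B`, `C`, and the hand-picked observables in `lift` are hypothetical, there are no control inputs, and none of it is taken from the paper's soft robot model.

```python
import numpy as np

A, B, C = 0.9, 0.5, 0.3  # hypothetical parameters of a toy nonlinear system


def step(x):
    """Toy nonlinear map: x1 <- A*x1, x2 <- B*x2 + C*x1**2."""
    return np.array([A * x[0], B * x[1] + C * x[0] ** 2])


def lift(x):
    """Hand-picked observables [x1, x2, x1**2] (an assumption of this sketch)."""
    return np.array([x[0], x[1], x[0] ** 2])


def fit_koopman(states, next_states):
    """Least-squares matrix K with lift(x_next) ~= K @ lift(x) over all pairs."""
    psi = np.array([lift(x) for x in states])            # shape (N, 3)
    psi_next = np.array([lift(x) for x in next_states])  # shape (N, 3)
    # Solve psi @ K.T ~= psi_next in the least-squares sense.
    k_transposed, *_ = np.linalg.lstsq(psi, psi_next, rcond=None)
    return k_transposed.T


# Snapshot pairs gathered from random states, mimicking random excitation.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 2))
next_states = np.array([step(x) for x in states])
K = fit_koopman(states, next_states)


def predict(x0, n_steps):
    """Roll the linear model forward in lifted space, then read off the state."""
    z = lift(x0)
    for _ in range(n_steps):
        z = K @ z
    return z[:2]
```

For this particular toy system the chosen observables happen to be closed under the dynamics, so the fitted linear model reproduces the nonlinear map exactly. For a real system such as a soft robot arm, a finite set of observables would generally give only an approximation.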

* Submitted to ICRA 2019, under review
This paper presents an interconnected control-planning strategy for redundant manipulators, subject to system and environmental constraints. The method incorporates low-level control characteristics and high-level planning components into a robust strategy for manipulators acting in complex environments, subject to joint limits. This strategy is formulated using an adaptive control rule, the estimated dynamic model of the robotic system and the nullspace of the linearized constraints. A path is generated that takes into account the capabilities of the platform. The proposed method is computationally efficient, enabling its implementation on a real multi-body robotic system. Through experimental results with a 7 DOF manipulator, we demonstrate the performance of the method in real-world scenarios.

Safety guarantees are valuable in the control of walking robots, as falling can be both dangerous and costly. Unfortunately, set-based tools for generating safety guarantees (such as sums-of-squares optimization) are typically restricted to simplified, low-dimensional models of walking robots. For more complex models, methods based on hybrid zero dynamics can ensure the local stability of a pre-specified limit cycle, but provide limited guarantees. This paper combines the benefits of both approaches by using sums-of-squares optimization on a hybrid zero dynamics manifold to generate a guaranteed safe set for a 10-dimensional walking robot model. Along with this set, this paper describes how to generate a controller that maintains safety by modifying the manifold parameters when on the edge of the safe set. The proposed approach, which is applied to a bipedal Rabbit model, provides a roadmap for applying sums-of-squares verification techniques to high dimensional systems. This opens the door for a broad set of tools that can generate safety guarantees and regulating controllers for complex walking robot models.

* Submitted to RA-Letters/IROS 2019
Navigating safely in urban environments remains a challenging problem for autonomous vehicles. Occlusion and limited sensor range can pose significant challenges to navigating safely among pedestrians and other vehicles in the environment. Enabling vehicles to quantify the risk posed by unseen regions allows them to anticipate future possibilities, resulting in increased safety and ride comfort. This paper proposes an algorithm that takes advantage of the known road layouts to forecast, quantify, and aggregate risk associated with occlusions and limited sensor range. This allows us to make predictions of risk induced by unobserved vehicles even in heavily occluded urban environments. The risk can then be used either by a low-level planning algorithm to generate better trajectories, or by a high-level one to plan a better route. The proposed algorithm is evaluated on intersection layouts from real-world map data with up to five other vehicles in the scene, and verified to reduce collision rates by 4.8x compared to a baseline method while improving driving comfort.

Locomotion in the real world involves unexpected perturbations, and therefore requires strategies to maintain stability to successfully execute desired behaviours. Ensuring the safety of locomoting systems therefore necessitates a quantitative metric for stability. Due to the difficulty of determining the set of perturbations that induce failure, researchers have used a variety of features as a proxy to describe stability. This paper utilises recent advances in dynamical systems theory to develop a personalised, automated framework to compute the set of perturbations from which a system can avoid failure, which is known as the basin of stability. The approach tracks human motion to synthesise a control input that is analysed to measure the basin of stability. The utility of this analysis is verified on a Sit-to-Stand task performed by 15 individuals. The experiment illustrates that the computed basin of stability for each individual can successfully differentiate between less and more stable Sit-to-Stand strategies.

* 11 pages, 9 figures
Urban environments pose a significant challenge for autonomous vehicles (AVs) as they must safely navigate while in close proximity to many pedestrians. It is crucial for the AV to correctly understand and predict the future trajectories of pedestrians to avoid collision and plan a safe path. Deep neural networks (DNNs) have shown promising results in accurately predicting pedestrian trajectories, relying on large amounts of annotated real-world data to learn pedestrian behavior. However, collecting and annotating these large real-world pedestrian datasets is costly in both time and labor. This paper describes a novel method using a stochastic sampling-based simulation to train DNNs for pedestrian trajectory prediction with social interaction. Our novel simulation method can generate vast amounts of automatically-annotated, realistic, and naturalistic synthetic pedestrian trajectories based on small amounts of real annotation. We then use such synthetic trajectories to train an off-the-shelf state-of-the-art deep learning approach, Social GAN (Generative Adversarial Network), to perform pedestrian trajectory prediction. Our proposed architecture, trained only using synthetic trajectories, achieves better prediction results than the same network trained on human-annotated real-world data. Our work demonstrates the effectiveness and potential of using simulation as a substitute for human annotation efforts to train high-performing prediction algorithms such as DNNs.

* 8 pages, 6 figures and 2 tables
Controlling soft robots with precision is a challenge due in large part to the difficulty of constructing models that are amenable to model-based control design techniques. Koopman Operator Theory offers a way to construct explicit linear dynamical models of soft robots and to control them using established model-based linear control methods. This method is data-driven, yet unlike other data-driven models such as neural networks, it yields an explicit control-oriented linear model rather than just a "black-box" input-output mapping. This work describes this Koopman-based system identification method and its application to model predictive controller design. A model and MPC controller of a pneumatic soft robot arm were constructed via the method, and their performance was evaluated over several trajectory following tasks in the real world. On all of the tasks, the Koopman-based MPC controller outperformed a benchmark MPC controller based on a linear state-space model of the same system.

The compliant structure of soft robotic systems enables a variety of novel capabilities in comparison to traditional rigid-bodied robots. A subclass of soft fluid-driven actuators known as fiber reinforced elastomeric enclosures (FREEs) is particularly well suited as actuators for these types of systems. FREEs are inherently soft and can impart spatial forces without imposing a rigid structure. Furthermore, they can be configured to produce a large variety of force and moment combinations. In this paper we explore the potential of combining multiple differently configured FREEs in parallel to achieve fully controllable multi-dimensional soft actuation. To this end, we propose a novel methodology to represent and calculate the generalized forces generated by soft actuators as a function of their internal pressure. This methodology relies on the notion of a state-dependent fluid Jacobian that yields a linear expression for force. We employ this concept to construct the set of all possible forces that can be generated by a soft system in a given state. This force zonotope can be used to inform the design and control of parallel combinations of soft actuators. The approach is verified experimentally with the parallel combination of three carefully designed actuators constrained to a 2DOF testing rig. The force predictions matched measured values with a root-mean-square error of less than 1.5 N in force and 8 x 10^(-3) Nm in moment, demonstrating the utility of the presented methodology.
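
As a rough illustration of the force zonotope described above: if the generalized force is linear in the actuator pressures, f = J p, and each pressure lies in an interval, then the set of achievable forces is a zonotope with one generator per actuator. The sketch below assumes a planar (two-dimensional) force space and at least two non-parallel generators. The function names and any Jacobian or pressure values passed in are hypothetical and not taken from the paper.

```python
import numpy as np


def force_zonotope(jacobian, p_min, p_max):
    """Center and generators of the set {J @ p : p_min <= p <= p_max}."""
    jacobian = np.asarray(jacobian, dtype=float)
    p_min = np.asarray(p_min, dtype=float)
    p_max = np.asarray(p_max, dtype=float)
    center = jacobian @ ((p_min + p_max) / 2.0)
    # Column i of the Jacobian scaled by the half-range of pressure i.
    generators = jacobian * ((p_max - p_min) / 2.0)
    return center, generators


def contains_2d(center, generators, force, tol=1e-9):
    """Membership test for a planar zonotope using its edge normals.

    Every edge of a planar zonotope is parallel to one generator, so it is
    enough to compare the query point against the zonotope's extent along the
    normal of each generator. Assumes at least two non-parallel generators.
    """
    offset = np.asarray(force, dtype=float) - center
    for g in generators.T:
        normal = np.array([-g[1], g[0]])
        extent = np.sum(np.abs(normal @ generators))
        if abs(normal @ offset) > extent + tol:
            return False
    return True
```

With three actuators and two force dimensions, `jacobian` would be a 2 x 3 matrix and the zonotope a polygon with up to six edges. A check such as `contains_2d(center, generators, desired_force)` then indicates whether some admissible pressure combination can produce `desired_force` in the current state.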

* IEEE Robotics and Automation Letters, vol. 3, no. 4, Oct. 2018
Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting generalizability of network performance on visual tasks trained on synthetic data and tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects (chromatic aberration, blur, exposure, noise, and color cast) for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes.

Performance on benchmark datasets has drastically improved with advances in deep learning. Still, cross-dataset generalization performance remains relatively low due to the domain shift that can occur between two different datasets. This domain shift is especially exaggerated between synthetic and real datasets. Significant research has been done to reduce this gap, specifically via modeling variation in the spatial layout of a scene, such as occlusions, and scene environmental factors, such as time of day and weather effects. However, few works have addressed modeling the variation in the sensor domain as a means of reducing the synthetic-to-real domain gap. The camera or sensor used to capture a dataset introduces artifacts into the image data that are unique to the sensor model, suggesting that sensor effects may also contribute to domain shift. To address this, we propose a learned augmentation network composed of physically-based augmentation functions. Our proposed augmentation pipeline transfers specific effects of the sensor model (chromatic aberration, blur, exposure, noise, and color temperature) from a real dataset to a synthetic dataset. We provide experiments that demonstrate that augmenting synthetic training datasets with the proposed learned augmentation framework reduces the domain gap between synthetic and real domains for object detection in urban driving scenes.

Recent work has shown that convolutional neural networks (CNNs) can be applied successfully in disparity estimation, but these methods still suffer from errors in low-texture regions, occlusions, and reflections. Concurrently, deep learning for semantic segmentation has shown great progress in recent years. In this paper, we design a CNN architecture that combines these two tasks to improve the quality and accuracy of disparity estimation with the help of semantic segmentation. Specifically, we propose a network structure in which these two tasks are highly coupled. One key novelty of this approach is the two-stage refinement process. Initial disparity estimates are refined with an embedding learned from the semantic segmentation branch of the network. The proposed model is trained using an unsupervised approach, in which images from one half of the stereo pair are warped and compared against images from the other camera. Another key advantage of the proposed approach is that a single network is capable of outputting disparity estimates and semantic labels. These outputs are of great use in autonomous vehicle operation; with real-time constraints being key, such performance improvements increase the viability of driving applications. Experiments on the KITTI and Cityscapes datasets show that our model can achieve state-of-the-art results and that leveraging the embedding learned from semantic segmentation improves the performance of disparity estimation.
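
The unsupervised training signal mentioned above, warping one image of the stereo pair and comparing it against the other, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's loss: it assumes rectified grayscale images, the convention that left pixel (x, y) matches right pixel (x - d, y), linear interpolation along each row, and a plain mean absolute difference. The function names are hypothetical.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d."""
    height, width = right.shape
    xs = np.arange(width, dtype=float)
    warped = np.empty((height, width), dtype=float)
    for row in range(height):
        # np.interp clamps samples that fall outside the image to the border.
        warped[row] = np.interp(xs - disparity[row], xs, right[row])
    return warped


def photometric_loss(left, right, disparity):
    """Mean absolute difference between the left image and the warped right image."""
    xs = np.arange(left.shape[1], dtype=float)
    valid = (xs - disparity) >= 0.0  # ignore pixels that sample off the image
    warped = warp_right_to_left(right, disparity)
    return float(np.abs(left - warped)[valid].mean())
```

A correct disparity map makes the reconstruction match the left image, so the loss is small. An incorrect one generally increases it, which is what lets a network learn disparity without ground-truth labels.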

* 8 pages, 4 figures, 4 tables
In this paper, we present an approach for designing feedback controllers for polynomial systems that maximize the size of the time-limited backwards reachable set (BRS). We rely on the notion of occupation measures to pose the synthesis problem as an infinite dimensional linear program (LP) and provide finite dimensional approximations of this LP in terms of semidefinite programs (SDPs). The solution to each SDP yields a polynomial control policy and an outer approximation of the largest achievable BRS. In contrast to traditional Lyapunov based approaches which are non-convex and require feasible initialization, our approach is convex and does not require any form of initialization. The resulting time-varying controllers and approximated reachable sets are well-suited for use in a trajectory library or feedback motion planning algorithm. We demonstrate the efficacy and scalability of our approach on five nonlinear systems.

An accurate characterization of pose uncertainty is essential for safe autonomous navigation. Early pose uncertainty characterization methods proposed by Smith, Self, and Cheeseman (SSC) used coordinate-based first-order methods to propagate uncertainty through non-linear functions such as pose composition (head-to-tail), pose inversion, and relative pose extraction (tail-to-tail). Characterizing uncertainty in the Lie Algebra of the special Euclidean group results in better uncertainty estimates. However, existing approaches assume that individual poses are independent. Since factors in a pose graph induce correlation, this independence assumption is usually not reflected in reality. In addition, prior work has focused primarily on the pose composition operation. This paper develops a framework for modeling the uncertainty of jointly distributed poses and describes how to perform the equivalent of the SSC pose operations while characterizing uncertainty in the Lie Algebra. Evaluation on simulated and open-source datasets shows that the proposed methods result in more accurate uncertainty estimates. An accompanying C++ library implementation is also released. This is a pre-print of a paper submitted to IEEE TRO in 2019.
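
For context, the coordinate-based first-order approach that the abstract contrasts against can be sketched for planar poses (x, y, theta). The sketch below shows head-to-tail composition with first-order covariance propagation through a joint 6 x 6 covariance, so that correlation between the two poses is not ignored. It is not the Lie algebra method the paper develops and is unrelated to the accompanying C++ library. The function names are hypothetical, and angle wrapping is omitted for brevity.

```python
import numpy as np


def compose(p1, p2):
    """Head-to-tail composition of planar poses (x, y, theta)."""
    x1, y1, t1 = p1
    x2, y2, t2 = p2
    c, s = np.cos(t1), np.sin(t1)
    return np.array([x1 + c * x2 - s * y2, y1 + s * x2 + c * y2, t1 + t2])


def compose_jacobian(p1, p2):
    """3 x 6 Jacobian of compose with respect to the stacked vector [p1, p2]."""
    _, _, t1 = p1
    x2, y2, _ = p2
    c, s = np.cos(t1), np.sin(t1)
    j1 = np.array([[1.0, 0.0, -s * x2 - c * y2],
                   [0.0, 1.0, c * x2 - s * y2],
                   [0.0, 0.0, 1.0]])
    j2 = np.array([[c, -s, 0.0],
                   [s, c, 0.0],
                   [0.0, 0.0, 1.0]])
    return np.hstack([j1, j2])


def compose_covariance(p1, p2, joint_cov):
    """First-order covariance of the composed pose from the joint 6 x 6 covariance."""
    jac = compose_jacobian(p1, p2)
    return jac @ joint_cov @ jac.T


# Example with correlated poses: the off-diagonal blocks are nonzero.
p1 = np.array([1.0, 2.0, 0.3])
p2 = np.array([0.5, -0.4, -0.1])
joint_cov = 0.01 * np.eye(6)
joint_cov[:3, 3:] = joint_cov[3:, :3] = 0.005 * np.eye(3)
cov = compose_covariance(p1, p2, joint_cov)
```

Setting the off-diagonal blocks of `joint_cov` to zero recovers the independent-pose case, which makes it easy to see how much the independence assumption changes the propagated covariance.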

An accurate depth map of the environment is critical to the safe operation of autonomous robots and vehicles. Currently, either light detection and ranging (LIDAR) or stereo matching algorithms are used to acquire such depth information. However, a high-resolution LIDAR is expensive and produces sparse depth maps at long range; stereo matching algorithms are able to generate denser depth maps but are typically less accurate than LIDAR at long range. This paper combines these approaches to generate high-quality dense depth maps. Unlike previous approaches that are trained using ground-truth labels, the proposed model adopts a self-supervised training process. Experiments show that the proposed method is able to generate high-quality dense depth maps and performs robustly even with low-resolution inputs. This shows the potential to reduce the cost by using LIDARs with lower resolution in concert with stereo systems while maintaining high resolution.

* 14 pages, 3 figures, 5 tables
Autonomous navigation requires an accurate model or map of the environment. While dramatic progress in the prior two decades has enabled large-scale simultaneous localization and mapping (SLAM), the majority of existing methods rely on non-linear optimization techniques to find the maximum likelihood estimate (MLE) of the robot trajectory and surrounding environment. These methods are prone to local minima and are thus sensitive to initialization. Several recent papers have developed optimization algorithms for the Pose-Graph SLAM problem that can certify the optimality of a computed solution. Though this does not guarantee a priori that this approach generates an optimal solution, a recent extension has shown that when the noise lies within a critical threshold, the solution to the optimization algorithm is guaranteed to be optimal. To address the limitations of existing approaches, this paper illustrates that the Pose-Graph SLAM and Landmark SLAM problems can be formulated as polynomial optimization programs that are sum-of-squares (SOS) convex. This paper then describes how the Pose-Graph and Landmark SLAM problems can be solved to a global minimum without initialization regardless of noise level using the sparse bounded degree sum-of-squares (Sparse-BSOS) optimization method. Finally, the superior performance of the proposed approach when compared to existing SLAM methods is illustrated on graphs with several hundred nodes.

* 7 pages, 5 figures
One of the major open challenges in self-driving cars is the ability to detect cars and pedestrians to safely navigate in the world. Deep learning-based object detector approaches have enabled great advances in using camera imagery to detect and classify objects. But for a safety critical application, such as autonomous driving, the error rates of the current state of the art are still too high to enable safe operation. Moreover, the characterization of object detector performance is primarily limited to testing on prerecorded datasets. Errors that occur on novel data go undetected without additional human labels. In this letter, we propose an automated method to identify mistakes made by object detectors without ground truth labels. We show that inconsistencies in the object detector output between a pair of similar images can be used as hypotheses for false negatives (i.e., missed detections) and that, using a novel set of features for each hypothesis, an off-the-shelf binary classifier can be used to find valid errors. In particular, we study two distinct cues, temporal and stereo inconsistencies, using data that are readily available on most autonomous vehicles. Our method can be used with any camera-based object detector and we illustrate the technique on several sets of real world data. We show that a state-of-the-art detector, tracker, and our classifier trained only on synthetic data can identify valid errors on the KITTI tracking dataset with an average precision of 0.94. We also release a new tracking dataset with 104 sequences totaling 80,655 labeled pairs of stereo images along with ground truth disparity from a game engine to facilitate further research. The dataset and code are available at https://fcav.engin.umich.edu/research/failing-to-learn
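
The core idea above, treating disagreements between detections on two similar images as hypotheses for missed detections, can be sketched with a simple overlap test. This is a heavily simplified illustration and not the released code. It assumes the boxes from both images have already been brought into a common image frame (the paper relies on tracking or stereo for this), it uses an illustrative IoU threshold of 0.5, and it omits the feature extraction and binary classifier that filter the hypotheses. All names are hypothetical.

```python
def box_area(box):
    """Area of an axis-aligned box (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0.0 else 0.0


def missed_detection_hypotheses(boxes_a, boxes_b, threshold=0.5):
    """Boxes detected in image A that have no overlapping detection in image B.

    Each returned box is only a hypothesis that the detector missed an object
    in image B; it may equally be a false positive in image A.
    """
    return [a for a in boxes_a
            if all(iou(a, b) < threshold for b in boxes_b)]
```

Running the function in both directions (A against B, then B against A) yields hypotheses for either image of the pair.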

* 8 pages, 4 figures and 4 tables. Accepted for publication in RA-L and will be presented in IROS 2018 in Madrid, Spain
Autonomous mobile robots must operate with limited sensor horizons in unpredictable environments. To do so, they use a receding-horizon strategy to plan trajectories, by executing a short plan while creating the next plan. However, creating safe, dynamically-feasible trajectories in real time is challenging, and planners must ensure that they are persistently feasible, meaning that a new trajectory is always available before the previous one has finished executing. Existing approaches make a tradeoff between model complexity and planning speed, which can require sacrificing guarantees of safety and dynamic feasibility. This work presents the Reachability-based Trajectory Design (RTD) method for trajectory planning. RTD begins with an offline Forward Reachable Set (FRS) computation of a robot's motion while it tracks parameterized trajectories; the FRS also provably bounds tracking error. At runtime, the FRS is used to map obstacles to the space of parameterized trajectories, which allows RTD to select a safe trajectory at every planning iteration. RTD prescribes a method of representing obstacles to ensure that these constraints can be created and evaluated in real time while maintaining provable safety. Persistent feasibility is achieved by prescribing a minimum duration of planned trajectories, and a minimum sensor horizon. A system decomposition approach is used to increase the dimension of the parameterized trajectories in the FRS, allowing RTD to create more complex plans at runtime. RTD is compared in simulation with Rapidly-exploring Random Trees (RRT) and Nonlinear Model-Predictive Control (NMPC). RTD is also demonstrated on two hardware platforms in randomly-crafted environments: a differential-drive Segway, and a car-like Rover. The proposed method is shown to be safe and persistently feasible across thousands of simulations and dozens of hardware demos.

* The first two authors contributed equally to this work. 58 Pages, 20 Figures
The success of autonomous systems will depend upon their ability to safely navigate human-centric environments. This motivates the need for a real-time, probabilistic forecasting algorithm for pedestrians, cyclists, and other agents since these predictions will form a necessary step in assessing the risk of any action. This paper presents a novel approach to probabilistic forecasting for pedestrians based on weighted sums of ordinary differential equations that are learned from historical trajectory information within a fixed scene. The resulting algorithm is embarrassingly parallel and is able to work at real-time speeds using a naive Python implementation. The quality of predicted locations of agents generated by the proposed algorithm is validated on a variety of examples and is considerably higher than that of existing state-of-the-art approaches over long time horizons.

* This is an augmented version of our paper published in RA-L containing additional material that was cut from the paper