Vishnu Dutt Sharma

Pre-Trained Masked Image Model for Mobile Robot Navigation

Oct 10, 2023
Vishnu Dutt Sharma, Anukriti Singh, Pratap Tokekar

2D top-down maps are commonly used for the navigation and exploration of mobile robots through unknown areas. Typically, the robot builds the navigation maps incrementally from local observations using onboard sensors. Recent works have shown that predicting the structural patterns in the environment through learning-based approaches can greatly enhance task efficiency. While many such works build task-specific networks using limited datasets, we show that the existing foundational vision networks can accomplish the same without any fine-tuning. Specifically, we use Masked Autoencoders, pre-trained on street images, to present novel applications for field-of-view expansion, single-agent topological exploration, and multi-agent exploration for indoor mapping, across different input modalities. Our work motivates the use of foundational vision models for generalized structure prediction-driven applications, especially when training data is scarce. For more qualitative results see https://raaslab.org/projects/MIM4Robots.

ProxMaP: Proximal Occupancy Map Prediction for Efficient Indoor Robot Navigation

May 10, 2023
Vishnu Dutt Sharma, Jingxi Chen, Pratap Tokekar

In a typical path planning pipeline for a ground robot, we build a map (e.g., an occupancy grid) of the environment as the robot moves around. While navigating indoors, a ground robot's knowledge about the environment may be limited due to occlusions. Therefore, the map will have many as-yet-unknown regions that may need to be avoided by a conservative planner. Instead, if a robot is able to correctly predict what its surroundings and occluded regions look like, the robot may be more efficient in navigation. In this work, we focus on predicting occupancy within the reachable distance of the robot to enable faster navigation and present a self-supervised proximity occupancy map prediction method, named ProxMaP. We show that ProxMaP generalizes well across realistic and real domains, and improves the robot navigation efficiency in simulation by 12.40% against the traditional navigation method. We share our findings on our project webpage (see https://raaslab.org/projects/ProxMaP).

* This is an incremental work over an existing arXiv submission of the author. It will be re-uploaded as a version of that work.

Interpretable Deep Reinforcement Learning for Green Security Games with Real-Time Information

Nov 09, 2022
Vishnu Dutt Sharma, John P. Dickerson, Pratap Tokekar

Green Security Games with real-time information (GSG-I) add real-time information about the agents' movement to the typical GSG formulation. Prior works on GSG-I have used deep reinforcement learning (DRL) to learn the best policy for the agent in such an environment without any need to store the huge number of state representations for GSG-I. However, the decision-making process of DRL methods is largely opaque, which results in a lack of trust in their predictions. To tackle this issue, we present an interpretable DRL method for GSG-I that generates visualizations to explain the decisions taken by the DRL algorithm. We also show that this approach performs better and works well with a simpler training regimen compared to the existing method.

D2CoPlan: A Differentiable Decentralized Planner for Multi-Robot Coverage

Sep 19, 2022
Vishnu Dutt Sharma, Lifeng Zhou, Pratap Tokekar

Centralized approaches for multi-robot coverage planning problems suffer from a lack of scalability. Learning-based distributed algorithms provide a scalable avenue in addition to bringing data-oriented feature generation capabilities to the table, allowing integration with other learning-based approaches. To this end, we present a learning-based, differentiable distributed coverage planner (D2CoPlan) which scales efficiently in runtime and number of agents compared to the expert algorithm, and performs on par with the classical distributed algorithm. In addition, we show that D2CoPlan can be seamlessly combined with other learning methods to learn end-to-end, resulting in a better solution than the individually trained modules, opening doors to further research for tasks that remain elusive with classical methods.

Occupancy Map Prediction for Improved Indoor Robot Navigation

Mar 08, 2022
Vishnu Dutt Sharma, Jingxi Chen, Abhinav Shrivastava, Pratap Tokekar

In the typical path planning pipeline for a ground robot, we build a map (e.g., an occupancy grid) of the environment as the robot moves around. While navigating indoors, a ground robot's knowledge about the environment may be limited by the occlusions in its surroundings. Therefore, the map will have many as-yet-unknown regions that may need to be avoided by a conservative planner. Instead, if a robot is able to correctly infer what its surroundings and occluded regions look like, the navigation can be further optimized. In this work, we propose an approach using pix2pix and UNet to infer the occupancy grid in unseen areas near the robot as an image-to-image translation task. Our approach simplifies the task of occupancy map prediction for the deep learning network and reduces the amount of data required compared to similar existing methods. We show that the predicted map improves the navigation time in simulations over the existing approaches.
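The effect of unknown cells on a conservative planner can be illustrated with a toy example. The sketch below is not the authors' pipeline: the grid encoding (`FREE`, `OCCUPIED`, `UNKNOWN`), the helper `shortest_path_length`, and the "predicted" map (which simply assumes every unknown cell turns out to be free) are all hypothetical choices made for illustration, and breadth-first search stands in for whatever planner is actually used.

```python
from collections import deque

# Hypothetical cell encoding for a small occupancy grid.
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def shortest_path_length(grid, start, goal, traversable):
    """Breadth-first search on a 4-connected grid.

    Returns the number of steps from start to goal, or None if unreachable.
    Only cells whose value is in `traversable` may be entered.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and grid[nr][nc] in traversable:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


# Observed map: the middle column is partly occluded (unknown).
observed = [
    [FREE, UNKNOWN, FREE],
    [FREE, UNKNOWN, FREE],
    [FREE, FREE, FREE],
]

# Stand-in for a learned prediction: assume the unknown cells are free.
predicted = [[FREE if cell == UNKNOWN else cell for cell in row] for row in observed]

# A conservative planner avoids unknown cells and must detour around them.
conservative = shortest_path_length(observed, (0, 0), (0, 2), {FREE})
# With the predicted map, the direct route becomes available.
with_prediction = shortest_path_length(predicted, (0, 0), (0, 2), {FREE})

print(conservative, with_prediction)
```

In this toy grid the conservative plan detours around the unknown column, while the plan on the predicted map goes straight across. The gain holds only if the prediction is correct.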

Risk-Aware Path Planning for Ground Vehicles using Occluded Aerial Images

Apr 23, 2021
Vishnu Dutt Sharma, Pratap Tokekar

We consider scenarios where a ground vehicle plans its path using data gathered by an aerial vehicle. In the aerial images, navigable areas of the scene may be occluded due to obstacles. Naively planning paths using aerial images may result in longer paths as a conservative planner may try to avoid regions that are occluded. We propose a modular, deep learning-based framework that allows the robot to predict the existence of navigable areas in the occluded regions. Specifically, we use image inpainting methods to fill in parts of the areas that are potentially occluded, which can then be semantically segmented to determine navigability. We use supervised neural networks for both modules. However, these predictions may be incorrect. Therefore, we extract uncertainty in these predictions and use a risk-aware approach that takes these uncertainties into account for path planning. We compare modules in our approach with non-learning-based approaches to show the efficacy of the proposed framework through photo-realistic simulations. The modular pipeline allows further improvement in path planning and deployment in different settings.
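One simple way to let prediction uncertainty influence path choice is a mean-plus-deviation score. The sketch below is an illustrative assumption, not the risk measure used in the paper: each cell on a candidate path carries a hypothetical (expected cost, variance) pair, cells are treated as independent, and `risk_weight` is a made-up tuning parameter.

```python
import math


def risk_aware_cost(cells, risk_weight):
    """Expected path cost plus risk_weight times the standard deviation.

    `cells` is a list of (expected_cost, variance) pairs, assumed independent,
    so the variances add along the path.
    """
    mean = sum(m for m, _ in cells)
    variance = sum(v for _, v in cells)
    return mean + risk_weight * math.sqrt(variance)


# Two hypothetical candidate paths.
short_uncertain = [(1.0, 4.0), (1.0, 4.0)]  # crosses predicted (occluded) cells
long_certain = [(1.0, 0.0)] * 4             # stays in directly observed cells


def pick(risk_weight):
    """Return the name of the candidate path with the lowest risk-aware cost."""
    paths = {"short_uncertain": short_uncertain, "long_certain": long_certain}
    return min(paths, key=lambda name: risk_aware_cost(paths[name], risk_weight))


print(pick(0.0))  # risk-neutral choice
print(pick(1.0))  # risk-averse choice
```

With `risk_weight` set to zero the shorter path through uncertain cells wins. Raising the weight shifts the choice to the longer path whose cells were observed directly.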

Free as in Free Word Order: An Energy Based Model for Word Segmentation and Morphological Tagging in Sanskrit

Oct 25, 2018
Amrith Krishna, Bishal Santra, Sasi Prasanth Bandaru, Gaurav Sahu, Vishnu Dutt Sharma, Pavankumar Satuluri, Pawan Goyal

The configurational information in sentences of a free word order language such as Sanskrit is of limited use. Thus, the context of the entire sentence will be desirable even for basic processing tasks such as word segmentation. We propose a structured prediction framework that jointly solves the word segmentation and morphological tagging tasks in Sanskrit. We build an energy based model where we adopt approaches generally employed in graph based parsing techniques (McDonald et al., 2005a; Carreras, 2007). Our model outperforms the state of the art with an F-Score of 96.92 (percentage improvement of 7.06%) while using less than one-tenth of the task-specific training data. We find that the use of a graph based approach instead of a traditional lattice-based sequential labelling approach leads to a percentage gain of 12.6% in F-Score for the segmentation task.

* Version 2: Corrected typo in Table 1, page 7 | Accepted in EMNLP 2018. Supplementary material can be found at http://cse.iitkgp.ac.in/~amrithk/1080_supp.pdf

Building a Word Segmenter for Sanskrit Overnight

Feb 17, 2018
Vikas Reddy, Amrith Krishna, Vishnu Dutt Sharma, Prateek Gupta, Vineeth M R, Pawan Goyal

There is an abundance of digitised texts available in Sanskrit. However, the word segmentation task in such texts is challenging due to the issue of 'Sandhi'. In Sandhi, words in a sentence often fuse together to form a single chunk of text, where the word delimiter vanishes and sounds at the word boundaries undergo transformations, which is also reflected in the written text. Here, we propose an approach that uses a deep sequence to sequence (seq2seq) model that takes only the sandhied string as the input and predicts the unsandhied string. The state of the art models are linguistically involved and have external dependencies for the lexical and morphological analysis of the input. Our model can be trained "overnight" and be used for production. In spite of the knowledge lean approach, our system performs better than the current state of the art, with a percentage increase of 16.79%.

* The work is accepted at LREC 2018, Miyazaki, Japan

DeepVO: A Deep Learning approach for Monocular Visual Odometry

Nov 18, 2016
Vikram Mohanty, Shubh Agrawal, Shaswat Datta, Arna Ghosh, Vishnu Dutt Sharma, Debashish Chakravarty

Deep Learning based techniques have been adopted with precision to solve many standard computer vision problems, such as image classification, object detection and segmentation. Despite the widespread success of these approaches, they have not yet been widely exploited for solving the standard perception related problems encountered in autonomous navigation such as Visual Odometry (VO), Structure from Motion (SfM) and Simultaneous Localization and Mapping (SLAM). This paper analyzes the problem of Monocular Visual Odometry using a Deep Learning-based framework, instead of the regular 'feature detection and tracking' pipeline approaches. Several experiments were performed to understand the influence of a known/unknown environment, a conventional trackable feature and pre-trained activations tuned for object classification on the network's ability to accurately estimate the motion trajectory of the camera (or the vehicle). Based on these observations, we propose a Convolutional Neural Network architecture that is best suited for estimating the object's pose under known environment conditions and that displays promising results when it comes to inferring the actual scale using just a single camera in real-time.
