Mudit Verma

Hindsight PRIORs for Reward Learning from Human Preferences

Apr 12, 2024
Mudit Verma, Katherine Metcalf

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Feb 06, 2024
Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Kaya Stechly, Mudit Verma, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

Theory of Mind abilities of Large Language Models in Human-Robot Interaction: An Illusion?

Jan 17, 2024
Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

Benchmarking Multi-Agent Preference-based Reinforcement Learning for Human-AI Teaming

Dec 21, 2023
Siddhant Bhambri, Mudit Verma, Anil Murthy, Subbarao Kambhampati

Methods and Mechanisms for Interactive Novelty Handling in Adversarial Environments

Mar 06, 2023
Tung Thai, Ming Shen, Mayank Garg, Ayush Kalani, Nakul Vaidya, Utkarsh Soni, Mudit Verma, Sriram Gopalakrishnan, Neeraj Varshney, Chitta Baral, Subbarao Kambhampati, Jivko Sinapov, Matthias Scheutz

Exploiting Unlabeled Data for Feedback Efficient Human Preference based Reinforcement Learning

Feb 17, 2023
Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati

A State Augmentation based approach to Reinforcement Learning from Human Preferences

Feb 17, 2023
Mudit Verma, Subbarao Kambhampati

Data Driven Reward Initialization for Preference based Reinforcement Learning

Feb 17, 2023
Mudit Verma, Subbarao Kambhampati

Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion

Oct 27, 2022
Utkarsh Soni, Sarath Sreedharan, Mudit Verma, Lin Guan, Matthew Marquez, Subbarao Kambhampati

Symbol Guided Hindsight Priors for Reward Learning from Human Preferences

Oct 19, 2022
Mudit Verma, Katherine Metcalf
