Muning Wen

Reinforcing Language Agents via Policy Optimization with Action Decomposition

May 23, 2024

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

Mar 10, 2024

Entropy-Regularized Token-Level Policy Optimization for Large Language Models

Feb 09, 2024

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

Sep 29, 2023

Large Sequence Models for Sequential Decision-Making: A Survey

Jun 24, 2023

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

May 30, 2022

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

Dec 20, 2021

Settling the Variance of Multi-Agent Policy Gradients

Aug 20, 2021

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

Jun 05, 2021