Tom Zahavy

Diversifying AI: Towards Creative Chess with AlphaZero

Aug 29, 2023
Tom Zahavy, Vivek Veeriah, Shaobo Hou, Kevin Waugh, Matthew Lai, Edouard Leurent, Nenad Tomasev, Lisa Schut, Demis Hassabis, Satinder Singh

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT

Aug 24, 2023
Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen

Optimism and Adaptivity in Policy Optimization

Jun 18, 2023
Veronica Chelu, Tom Zahavy, Arthur Guez, Doina Precup, Sebastian Flennerhag

Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

Apr 08, 2023
Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, Sebastian Flennerhag

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs

Feb 02, 2023
Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy

Optimistic Meta-Gradients

Jan 09, 2023
Sebastian Flennerhag, Tom Zahavy, Brendan O'Donoghue, Hado van Hasselt, András György, Satinder Singh

POMRL: No-Regret Learning-to-Plan with Increasing Horizons

Dec 30, 2022
Khimya Khetarpal, Claire Vernade, Brendan O'Donoghue, Satinder Singh, Tom Zahavy

Discovering Evolution Strategies via Meta-Black-Box Optimization

Nov 25, 2022
Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag

Meta-Gradients in Non-Stationary Environments

Sep 13, 2022
Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh

Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality

May 26, 2022
Tom Zahavy, Yannick Schroecker, Feryal Behbahani, Kate Baumli, Sebastian Flennerhag, Shaobo Hou, Satinder Singh
