Anc-VI Sets a New Standard for Reinforcement Learning Optimization

This content originally appeared on HackerNoon and was authored by Anchoring

:::info Authors:

(1) Jongmin Lee, Department of Mathematical Science, Seoul National University;

(2) Ernest K. Ryu, Department of Mathematical Science, Seoul National University and Interdisciplinary Program in Artificial Intelligence, Seoul National University.

:::

Abstract and 1 Introduction

1.1 Notations and preliminaries

1.2 Prior works

2 Anchored Value Iteration

2.1 Accelerated rate for Bellman consistency operator

2.2 Accelerated rate for Bellman optimality operator

3 Convergence when γ=1

4 Complexity lower bound

5 Approximate Anchored Value Iteration

6 Gauss–Seidel Anchored Value Iteration

7 Conclusion, Acknowledgments and Disclosure of Funding and References

A Preliminaries

B Omitted proofs in Section 2

C Omitted proofs in Section 3

D Omitted proofs in Section 4

E Omitted proofs in Section 5

F Omitted proofs in Section 6

G Broader Impacts

H Limitations

4 Complexity lower bound

We now present a complexity lower bound establishing optimality of Anc-VI.

The so-called “span condition” of Theorem 5 is arguably very natural and is satisfied by both standard VI and Anc-VI. The span condition is commonly used in the construction of complexity lower bounds on first-order optimization methods [13, 14, 23, 25, 59, 65] and has been used in the prior state-of-the-art lower bound for standard VI [37, Theorem 3]. However, designing an algorithm that breaks the lower bound of Theorem 5 by violating the span condition remains a possibility. In optimization theory, there is precedent for lower bounds being broken by violating seemingly natural and minute conditions [35, 40, 98].
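The claim that standard VI and Anc-VI both satisfy the span condition can be checked numerically on a toy problem. The sketch below is illustrative only and rests on assumptions that this excerpt does not state: it takes the span condition to have the usual form U^i ∈ U^0 + span{TU^0 − U^0, …, TU^{i−1} − U^{i−1}}, and it takes the Anc-VI update to be U^k = β_k U^0 + (1 − β_k) TU^{k−1} with β_k = 1/Σ_{i=0}^{k} γ^{−2i}, as recalled from Section 2 of the paper. The random single-action MDP and the helper names (`bellman`, `distance_to_span`, `max_span_violation`) are hypothetical.

```python
import random

def bellman(U, P, r, gamma):
    """Policy-evaluation Bellman operator: (TU)(s) = r(s) + gamma * sum_t P[s][t] * U(t)."""
    n = len(U)
    return [r[s] + gamma * sum(P[s][t] * U[t] for t in range(n)) for s in range(n)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def distance_to_span(v, vectors):
    """Euclidean norm of the component of v orthogonal to span(vectors),
    computed with modified Gram-Schmidt."""
    basis = []
    for w in vectors:
        w = list(w)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:  # skip directions that are numerically dependent
            basis.append([wi / norm for wi in w])
    v = list(v)
    for q in basis:
        c = dot(v, q)
        v = [vi - c * qi for vi, qi in zip(v, q)]
    return dot(v, v) ** 0.5

def max_span_violation(n=6, iters=4, gamma=0.9, anchored=False, seed=0):
    """Run VI (or Anc-VI, under the assumed update) on a random single-action MDP
    and return the largest distance of U^k - U^0 from the span of past residuals."""
    rng = random.Random(seed)
    P = []
    for _ in range(n):
        row = [rng.random() for _ in range(n)]
        total = sum(row)
        P.append([p / total for p in row])  # row-stochastic transition matrix
    r = [rng.random() for _ in range(n)]
    U0 = [rng.random() for _ in range(n)]

    U, residuals, worst = U0, [], 0.0
    for k in range(1, iters + 1):
        TU = bellman(U, P, r, gamma)
        residuals.append(sub(TU, U))  # TU^{k-1} - U^{k-1}
        if anchored:
            # Assumed anchor weight: beta_k = 1 / sum_{i=0}^{k} gamma^(-2i)
            beta = 1.0 / sum(gamma ** (-2 * i) for i in range(k + 1))
            U = [beta * u0 + (1.0 - beta) * tu for u0, tu in zip(U0, TU)]
        else:
            U = TU
        worst = max(worst, distance_to_span(sub(U, U0), residuals))
    return worst

print("standard VI:", max_span_violation(anchored=False))
print("Anc-VI     :", max_span_violation(anchored=True))
```

With six states and only four iterations, the residuals span at most a four-dimensional subspace, so the check is not vacuous. Both printed distances should be at the level of floating-point rounding. The membership argument does not depend on the particular value of β_k, so the check would pass for any anchor weight in [0, 1].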

:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
