Title (deu)
Error bounds for mean-payoff Markov decision processes
Speaker / Lecturer
Roberto Cominetti
Universidad Adolfo Ibáñez, Santiago de Chile
Description (deu)

We discuss the use of optimal transport techniques to derive finite-time error bounds for reinforcement learning in mean-payoff Markov decision processes. The results are obtained as a special case of stochastic Krasnoselskii-Mann fixed-point iterations for nonexpansive maps. We present sufficient conditions on the stochastic noise and the stepsizes that guarantee almost sure convergence of the iterates to a fixed point, together with non-asymptotic error bounds and convergence rates. Our main results concern martingale difference noise whose variances may grow without bound. We also analyze the case of uniformly bounded variances and show how the results apply to stochastic gradient descent in convex optimization.
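
The abstract does not write out the iteration itself. As a minimal sketch, assuming the usual form of a stochastic Krasnoselskii-Mann step, x_{n+1} = (1 - a_n) x_n + a_n (T(x_n) + U_n) with zero-mean noise U_n, the Python below runs it on a toy nonexpansive map. The map `rotate90`, the stepsize rule `(n + 1) ** -0.75`, the Gaussian noise level `sigma`, and all function names are illustrative assumptions and are not taken from the talk.

```python
import math
import random


def rotate90(x):
    """Toy nonexpansive map T: rotate the plane by 90 degrees (fixed point: the origin)."""
    return (-x[1], x[0])


def residual(T, x):
    """Fixed-point residual ||x - T(x)||, a common measure of progress for such iterations."""
    tx = T(x)
    return math.hypot(x[0] - tx[0], x[1] - tx[1])


def stochastic_km(T, x0, steps, sigma, seed=0):
    """Run x_{n+1} = (1 - a_n) x_n + a_n (T(x_n) + U_n) with i.i.d. Gaussian noise U_n."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    x = x0
    for n in range(1, steps + 1):
        alpha = (n + 1) ** -0.75  # assumed stepsize: not summable, but square-summable
        tx = T(x)
        # Noisy evaluation of T: the exact value plus zero-mean noise in each coordinate.
        noisy = (tx[0] + rng.gauss(0.0, sigma), tx[1] + rng.gauss(0.0, sigma))
        # Averaging step between the current iterate and the noisy image.
        x = ((1 - alpha) * x[0] + alpha * noisy[0],
             (1 - alpha) * x[1] + alpha * noisy[1])
    return x


x0 = (1.0, 1.0)
x_final = stochastic_km(rotate90, x0, steps=2000, sigma=0.1)
print("initial residual:", residual(rotate90, x0))
print("final residual:  ", residual(rotate90, x_final))
```

The rotation is used here because plain iteration of T never converges for it (the iterates just circle the origin), so any decrease in the residual comes from the averaging step. Independent Gaussian noise is only the simplest instance of the martingale difference noise mentioned above.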

Keywords (deu)
One World Optimization Seminar
Subject (eng)
ÖFOS 2012 -- 101 -- Mathematics
Type (eng)
Language
[eng]
Persistent identifier
https://phaidra.univie.ac.at/o:2069739
Date created
2024-06-03
Place of creation (eng)
ESI
Duration
28 minutes 53 seconds
Content
Details
Object type
Video
Format
video/mp4
Created
07.06.2024 11:37:33