Abstract
Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, note a large number of state-of-the-art algorithms, followed by the discussion of their theoretical properties and limitations. Table of Contents: Markov Decision Processes / Value Prediction Problems / Control / For Further Exploration.
Original language | English |
---|---|
Title of host publication | Algorithms for Reinforcement Learning |
Editors | Ronald J. Brachman, Thomas Dietterich |
Pages | 1-89 |
Number of pages | 89 |
DOIs | https://doi.org/10.2200/S00268ED1V01Y201005AIM009 |
State | Published - 2010 |
Externally published | Yes |
Publication series
Name | Synthesis Lectures on Artificial Intelligence and Machine Learning |
---|---|
Volume | 9 |
ISSN (Print) | 1939-4608 |
ISSN (Electronic) | 1939-4616 |
Keywords
- Markov Decision Processes
- Monte-Carlo methods
- PAC-learning
- Q-learning
- active learning
- actor-critic methods
- bias-variance tradeoff
- function approximation
- least-squares methods
- natural gradient
- online learning
- overfitting
- planning
- policy gradient
- reinforcement learning
- simulation
- simulation optimization
- stochastic approximation
- stochastic gradient methods
- temporal difference learning
- two-timescale stochastic approximation
ASJC Scopus subject areas
- Artificial Intelligence
Cite this
Szepesvári, C. (2010). Algorithms for reinforcement learning. In R. J. Brachman, & T. Dietterich (Eds.), Algorithms for Reinforcement Learning (pp. 1-89). (Synthesis Lectures on Artificial Intelligence and Machine Learning; Vol. 9). https://doi.org/10.2200/S00268ED1V01Y201005AIM009
Szepesvári, Csaba. / Algorithms for reinforcement learning. Algorithms for Reinforcement Learning. editor / Ronald J. Brachman ; Thomas Dietterich. 2010. pp. 1-89 (Synthesis Lectures on Artificial Intelligence and Machine Learning).
@inproceedings{f8629d3882e842c1b06ce521d78dd410,
title = "Algorithms for reinforcement learning",
abstract = "Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, note a large number of state of the art algorithms, followed by the discussion of their theoretical properties and limitations. Table of Contents: Markov Decision Processes / Value Prediction Problems / Control / For Further Exploration.",
keywords = "Markov Decision Processes, Monte-Carlo methods, PAC-learning, Q-learning, active learning, actor-critic methods, bias-variance tradeoff, function approximation, least-squares methods, natural gradient, online learning, overfitting, planning, policy gradient, reinforcement learning, simulation, simulation optimization, stochastic approximation, stochastic gradient methods, temporal difference learning, two-timescale stochastic approximation",
author = "Csaba Szepesv{\'a}ri",
year = "2010",
doi = "10.2200/S00268ED1V01Y201005AIM009",
language = "English",
isbn = "9781608454921",
series = "Synthesis Lectures on Artificial Intelligence and Machine Learning",
pages = "1--89",
editor = "Brachman, {Ronald J.} and Dietterich, Thomas",
booktitle = "Algorithms for Reinforcement Learning",
}
Szepesvári, C 2010, Algorithms for reinforcement learning. in RJ Brachman & T Dietterich (eds), Algorithms for Reinforcement Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 9, pp. 1-89. https://doi.org/10.2200/S00268ED1V01Y201005AIM009
Algorithms for reinforcement learning. / Szepesvári, Csaba.
Algorithms for Reinforcement Learning. ed. / Ronald J. Brachman; Thomas Dietterich. 2010. p. 1-89 (Synthesis Lectures on Artificial Intelligence and Machine Learning; Vol. 9).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
TY - GEN
T1 - Algorithms for reinforcement learning
AU - Szepesvári, Csaba
PY - 2010
Y1 - 2010
N2 - Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, note a large number of state of the art algorithms, followed by the discussion of their theoretical properties and limitations. Table of Contents: Markov Decision Processes / Value Prediction Problems / Control / For Further Exploration.
KW - Markov Decision Processes
KW - Monte-Carlo methods
KW - PAC-learning
KW - Q-learning
KW - active learning
KW - actor-critic methods
KW - bias-variance tradeoff
KW - function approximation
KW - least-squares methods
KW - natural gradient
KW - online learning
KW - overfitting
KW - planning
KW - policy gradient
KW - reinforcement learning
KW - simulation
KW - simulation optimization
KW - stochastic approximation
KW - stochastic gradient methods
KW - temporal difference learning
KW - two-timescale stochastic approximation
UR - http://www.scopus.com/inward/record.url?scp=77955790905&partnerID=8YFLogxK
U2 - 10.2200/S00268ED1V01Y201005AIM009
DO - 10.2200/S00268ED1V01Y201005AIM009
M3 - Conference contribution
AN - SCOPUS:77955790905
SN - 9781608454921
T3 - Synthesis Lectures on Artificial Intelligence and Machine Learning
SP - 1
EP - 89
BT - Algorithms for Reinforcement Learning
A2 - Brachman, Ronald J.
A2 - Dietterich, Thomas
ER -
Szepesvári C. Algorithms for reinforcement learning. In Brachman RJ, Dietterich T, editors, Algorithms for Reinforcement Learning. 2010. p. 1-89. (Synthesis Lectures on Artificial Intelligence and Machine Learning). https://doi.org/10.2200/S00268ED1V01Y201005AIM009
FAQs
What is the most widely used reinforcement learning algorithm?
What are some of the most used Reinforcement Learning algorithms? Q-learning and SARSA (State-Action-Reward-State-Action) are two commonly used model-free RL algorithms.
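To make the difference between the two algorithms concrete, here is a minimal sketch of their tabular update rules. It is an illustration only: the function names, the list-of-lists table `q`, and the default step size `alpha` and discount `gamma` are assumptions made for this example and do not come from the book or from any library.

```python
from typing import List

# Hypothetical Q-table type for this sketch: q[state][action] -> estimated return.
QTable = List[List[float]]


def q_learning_update(q: QTable, s: int, a: int, r: float, s_next: int,
                      alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Off-policy update: bootstrap from the best action value in s_next."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def sarsa_update(q: QTable, s: int, a: int, r: float, s_next: int, a_next: int,
                 alpha: float = 0.5, gamma: float = 0.9) -> None:
    """On-policy update: bootstrap from the action a_next actually taken in s_next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


# Two states, two actions; state 1 already has some value estimates.
q_table: QTable = [[0.0, 0.0], [1.0, 2.0]]
q_learning_update(q_table, s=0, a=0, r=1.0, s_next=1)        # uses max(q[1])
sarsa_update(q_table, s=0, a=1, r=1.0, s_next=1, a_next=0)   # uses q[1][0]
```

The only difference is the bootstrap term: Q-learning uses the maximum over next actions, while SARSA uses the value of the next action the agent actually selects.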
Which framework is best for reinforcement learning?
Reinforcement Learning Frameworks
- OpenAI Gym.
- Google Dopamine.
- RLLib.
- Keras-RL.
- TRFL.
- Tensorforce.
- Facebook Horizon.
- Nervana Systems Coach.
Reinforcement learning is an area of Machine Learning.