000 01842nam a22005297a 4500
999 _c13885
_d13884
003 OSt
005 20231103135354.0
006 m|||||o||d| 00| 0
007 cr || auc||a|a
008 210422s2018 maua||||sb||| 001 0 eng d
040 _cQCPL
100 1 _91469
_aSutton, Richard S.
_eauthor
245 1 0 _aReinforcement learning :
_ban introduction /
_cRichard S. Sutton and Andrew G. Barto.
250 _aSecond edition
264 1 _aCambridge, Massachusetts :
_bMIT Press,
_c[2018]
300 _a1 online resource :
_billustrations
336 _atext
_2rdacontent
337 _acomputer
_2rdamedia
338 _aonline resource
_2rdacarrier
490 _aAdaptive computation and machine learning
504 _aIncludes bibliographical references and index.
505 0 _aReinforcement learning -- Part I. Tabular solution methods -- Multi-armed bandits -- Finite Markov decision processes -- Dynamic programming -- Monte Carlo methods -- Temporal-difference learning -- n-step bootstrapping -- Planning and learning with tabular methods -- Part II. Approximate solution methods -- On-policy prediction with approximation -- On-policy control with approximation -- Off-policy methods with approximation -- Eligibility traces -- Policy gradient methods -- Part III. Looking deeper -- Psychology -- Neuroscience -- Applications and case studies -- Frontiers.
650 _aReinforcement learning
655 _aElectronic books
700 1 _91470
_aBarto, Andrew G.
_eauthor
856 4 0 _awww.andrew.cmu.edu
_uhttps://www.andrew.cmu.edu/course/10-703/textbook/BartoSutton.pdf
942 _2ddc
_cEBOOK