Lavoisier S.A.S.
14 rue de Provigny
94236 Cachan cedex
FRANCE

Opening hours: 08:30-12:30 / 13:30-17:30
Tel.: +33 (0)1 47 40 67 00
Fax: +33 (0)1 47 40 67 02


Canonical URL: www.lavoisier.fr/livre/mathematiques/approximate-dynamic-programming-solving-the-curses-of-dimensionality/powell/descriptif_1342341
Short URL or permalink: www.lavoisier.fr/livre/notice.asp?ouvrage=1342341

Approximate Dynamic Programming: Solving the Curses of Dimensionality

Language: English

Author: Warren B. Powell
This book is an up-to-date, complete, and accessible introduction to approximate dynamic programming, motivated primarily by problems that arise in operations research and engineering. Much of the emphasis is placed on how to model complex problems and design practical, scalable algorithms for solving them. The example problems generally involve the management of physical, financial, or informational resources in an industrial setting.
1. The challenges of dynamic programming. A dynamic programming example: a shortest path problem. The three curses of dimensionality. Some real applications. Problem classes. The many dialects of dynamic programming.
2. Some illustrative models. Deterministic problems. Stochastic problems. Information acquisition problems. A simple modeling framework for dynamic programs.
3. Introduction to Markov decision processes. The optimality equations. Finite horizon problems. Infinite horizon problems. Value iteration. Policy iteration. Hybrid value-policy iteration. The linear programming method for dynamic programs. Monotone policies. Why does it work?
4. Introduction to approximate dynamic programming. The three curses of dimensionality (revisited). The basic idea. Sampling random variables. ADP using the post-decision state variable. Low-dimensional representations of value functions. So just what is approximate dynamic programming? Experimental issues. Dynamic programming with missing or incomplete models. Relationship to reinforcement learning. But does it work?
5. Modeling dynamic programs. Notational style. Modeling time. Modeling resources. The states of our system. Modeling decisions. The exogenous information process. The transition function. The contribution function. The objective function. A measure-theoretic view of information.
6. Stochastic approximation methods. A stochastic gradient algorithm. Some stepsize recipes. Stochastic stepsizes. Computing bias and variance. Optimal stepsizes. Some experimental comparisons of stepsize formulas. Convergence. Why does it work?
7. Approximating value functions. Approximation using aggregation. Approximation methods using regression models. Recursive methods for regression models. Neural networks. Batch processes. Why does it work?
8. ADP for finite horizon problems. Strategies for finite horizon problems. Q-learning. Temporal difference learning. Policy iteration. Monte Carlo value and policy iteration. The actor-critic paradigm. Bias in value function estimation. State sampling strategies. Starting and stopping. A taxonomy of approximate dynamic programming strategies. Why does it work?
9. Infinite horizon problems. From finite to infinite horizon. Algorithmic strategies. Stepsizes for infinite horizon problems. Error measures. Direct ADP for online applications. Finite horizon models for steady-state applications. Why does it work?
10. Exploration vs. exploitation. A learning exercise: the nomadic trucker. Learning strategies. A simple information acquisition problem. Gittins indices and the information acquisition problem. Variations. The knowledge gradient algorithm. Information acquisition in dynamic programming.
11. Value function approximations for special functions. Value functions versus gradients. Linear approximations. Piecewise linear approximations. The SHAPE algorithm. Regression methods. Cutting planes. Why does it work?
12. Dynamic resource allocation. An asset acquisition problem. The blood management problem. A portfolio optimization problem. A general resource allocation problem. A fleet management problem. A driver management problem.
13. Implementation challenges. Will ADP work for your problem? Designing an ADP algorithm for complex problems. Debugging an ADP algorithm. Convergence issues. Modeling your problem. Online vs. offline models. If it works, patent it!
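As a minimal illustration of the classical value iteration method listed among the contents above (a sketch only; the toy two-state, two-action problem below is hypothetical and not taken from the book):

```python
import numpy as np

# Hypothetical 2-state, 2-action Markov decision process.
# P[a, s, s'] = probability of moving to state s' when taking action a in state s.
P = np.array([
    [[0.9, 0.1],   # action 0
     [0.2, 0.8]],
    [[0.5, 0.5],   # action 1
     [0.4, 0.6]],
])
# R[s, a] = one-step reward for taking action a in state s.
R = np.array([
    [1.0, 0.0],
    [0.0, 2.0],
])
gamma = 0.9        # discount factor

V = np.zeros(2)
for _ in range(1000):
    # Bellman update: Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) V(s')
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    if np.abs(V_new - V).max() < 1e-9:
        V = V_new
        break
    V = V_new

print(V)  # approximate optimal value of each state
```

This update enumerates every state on each sweep, which is exactly what becomes infeasible for the large state spaces ("curses of dimensionality") the book's approximate methods are designed to address.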

Publication date:

480 pages.

Subject to availability from the publisher.

Indicative price: €115.69

