Evaluating policies for generalized bandits via a notion of duality

Lookup NU author(s): Professor Kevin Glazebrook

Downloads

Full text for this publication is not currently held within this repository. Alternative links are provided below where available.


Abstract

Nash's generalization of Gittins' classic index result to so-called generalized bandit problems (GBPs), in which returns depend on the states of all arms (not only the one which is pulled), has proved important for applications. The index theory for special cases of this model in which all indices are positive is straightforward. However, this is not a natural restriction in practice. An earlier proposal for the general case did not yield satisfactory index-based suboptimality bounds for policies, a central feature of classical Gittins index theory. We develop such bounds via a notion of duality for GBPs which is of independent interest. The index which emerges naturally from this analysis is the reciprocal of the one proposed by Nash.


Publication metadata

Author(s): Crosbie JH, Glazebrook KD

Publication type: Article

Publication status: Published

Journal: Journal of Applied Probability

Year: 2000

Volume: 37

Issue: 2

Pages: 540-546

ISSN (print): 0021-9002

ISSN (electronic): 1475-6072

Publisher: Applied Probability Trust

URL: http://www.jstor.org/stable/3215728

