Category: Sequential Decision Problems
-
Retirement, Stopping times and Bandits: The Gittins index
“A colleague of high repute asked an equally well-known colleague: ‘What would you say if you were told that the multi-armed bandit problem had been solved?’ ‘Sir, the multi-armed bandit problem is not of such a nature that it can be solved.’” (Peter Whittle) In our busy daily life, while multi-tasking, we are constantly faced…
-
Is Reinforcement Learning all you need?
When attacking a new problem, the algorithm designer typically follows 3 main steps: When reporting her/his work, the algorithm designer will proudly focus on step 3), briefly mention 2) and likely sweep 1) under the carpet. Yet weeding out alternatives is a crucial step that inevitably impacts (positively or negatively) months of hard work on…