Multi-Armed Bandits (Sutton & Barto, Ch. 2)

This marimo notebook walks through Chapter 2 of Sutton & Barto: the multi-armed bandit problem, epsilon-greedy, UCB, and gradient bandits.
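
For a taste of the material, here is a minimal sketch of epsilon-greedy action selection with sample-average value estimates, in the spirit of Chapter 2. It is not taken from the notebook itself: the function name `run_epsilon_greedy`, its parameters, and the unit-variance Gaussian rewards are all assumptions made for illustration.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Hypothetical helper: play a k-armed Gaussian bandit with epsilon-greedy.

    Assumes each arm pays a reward drawn from N(true_means[a], 1).
    Returns the value estimates, the pull counts, and the average reward.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k  # estimated value of each arm
    n = [0] * k    # number of times each arm was pulled
    total = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            a = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest estimate
            # (ties go to the lowest index).
            a = max(range(k), key=lambda i: q[i])

        reward = rng.gauss(true_means[a], 1.0)

        # Incremental sample-average update: Q <- Q + (R - Q) / N.
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]
        total += reward

    return q, n, total / steps
```

Calling `run_epsilon_greedy([0.0, 5.0, 0.0], steps=2000)` returns the estimates, the pull counts, and the average reward; with a gap that large, the second arm should end up with most of the pulls.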

Open the interactive notebook →