1 dataset tagged "212"
Policy and value loss, game outcomes, and MCTS search metrics across 212 training iterations of a reinforcement learning agent.