1 dataset tagged "implementation"
171 training iterations of a Catan RL implementation. Tracks policy and value loss convergence, game length evolution, and self-play performance metri...