1 dataset tagged "network"
212 iterations of AlphaZero-style self-play training, tracking policy and value loss, MCTS agreement, game outcomes, and value calibration.