style datasets
6 datasets tagged "style"
AlphaZero-Style RL Training Metrics (13 Iterations)
Policy and value network training logs tracking losses, game length, MCTS agreement, and value calibration across 13 self-play iterations.
AlphaZero-Style Game Agent Training Metrics (13 Iterations)
Policy and value loss, game outcomes, and MCTS statistics from the training run of a reinforcement learning agent over 13 self-play iterations.
AlphaZero-Style Training Run (177 Iterations)
Reinforcement learning training metrics tracking policy loss, value loss, game length, and MCTS agreement over 177 self-play iterations.
AlphaZero-Style Self-Play Training Metrics (177 Iterations)
Policy loss, value loss, game length, and MCTS agreement tracked over 177 self-play iterations of AlphaZero-style reinforcement learning.
AlphaZero-Style Training Metrics (177 Iterations)
Self-play reinforcement learning run tracking policy loss, value loss, game length, and MCTS agreement across 177 training iterations.
AlphaZero-Style Training Run: 171 Iterations of Self-Play
Policy and value network training metrics over 171 iterations, tracking loss convergence, game length, MCTS agreement, and value calibration.