AlphaZero-Style Training Run: 13 Iterations

iteration loss_policy_train loss_value_train loss_policy_val loss_value_val loss_soft_policy_train loss_soft_policy_val loss_aux_value_train loss_aux_value_val loss_aux_value_0_train loss_aux_value_0_val loss_aux_value_1_train loss_aux_value_1_val loss_aux_value_2_train loss_aux_value_2_val loss_aux_value_3_train loss_aux_value_3_val gradient_steps game_length_avg game_length_stddev game_length_min game_length_max game_wins game_losses game_draws policy_entropy_avg policy_max_prob_avg policy_entropy_high_branch_avg policy_max_prob_high_branch_avg policy_agreement_avg policy_agreement_high_branch_avg policy_surprise_avg value_z_avg value_q_avg value_z_stddev value_q_stddev value_correction_avg value_correction_high_branch_avg value_q_spread_avg value_q_spread_high_branch_avg value_error_early_avg value_error_mid_avg value_error_late_avg value_network_stddev lr q_weight mcts_sims replay_samples samples_iter time_selfplay_secs time_train_secs
1 1 3.864787 0.320826 3.18191 0.242259 3.646052 3.076387 0.170796 0.031286 0.061016 0.005071 0.041854 0.009963 0.716512 0.133743 0.0 0.0 35 354.264 67.721033 134 418 206 257 37 1.069399 0.552025 1.627311 0.355778 0.306516 0.152918 0.900903 0.117202 -0.043852 0.755335 0.117126 0.256406 0.220012 0.110637 0.128793 0.703323 0.707441 0.717725 0.043437 0.0005 0.028333 100 178444 178444 286.273461 77.393945
2 2 3.054179 0.214878 2.736785 0.18571 2.956313 2.728005 0.025733 0.01787 0.036853 0.028107 0.038285 0.029353 0.048863 0.028961 0.0 0.0 66 363.372 59.736836 132 420 235 235 30 1.107233 0.555301 1.653229 0.386097 0.212524 0.147646 1.226714 0.125226 0.066841 0.75366 0.302264 0.161567 0.139284 0.075105 0.073412 0.623991 0.541579 0.498636 0.311493 0.0005 0.056667 197 334478 156034 311.520347 122.90891
3 3 2.473027 0.319545 2.383122 0.304262 2.611523 2.547672 0.030155 0.017883 0.043951 0.021795 0.04879 0.029612 0.051183 0.033476 0.0 0.0 42 306.762 80.571219 125 456 246 242 12 0.914362 0.649239 1.223786 0.563321 0.343803 0.29213 1.169231 0.107064 0.078501 0.913341 0.397952 0.118311 0.109984 0.055504 0.064856 0.795905 0.72426 0.626296 0.413562 0.0005 0.085 293 214466 214466 565.548518 90.27873
4 4 2.306242 0.283898 2.178379 0.285746 2.373558 2.262347 0.023122 0.027903 0.033743 0.03843 0.035401 0.045739 0.040325 0.047388 0.0 0.0 92 290.198 101.386364 99 488 223 266 11 0.636061 0.749145 0.864953 0.690746 0.414222 0.362223 1.309917 0.052395 0.030916 0.88627 0.428252 0.157282 0.157085 0.05787 0.074392 0.807009 0.671657 0.560194 0.441084 0.0005 0.113333 390 468004 253538 776.247988 171.735797
5 5 2.120492 0.266878 2.084083 0.281335 2.201041 2.173777 0.022068 0.019783 0.029671 0.025418 0.034216 0.0314 0.040508 0.03751 0.0 0.0 125 245.736 82.902221 91 448 297 197 6 0.727392 0.718925 1.036619 0.622322 0.347107 0.255754 1.275226 0.092726 0.062839 0.952357 0.48553 0.178129 0.207145 0.073739 0.089232 0.855319 0.713809 0.555246 0.479771 0.0005 0.141667 487 635428 167424 624.092566 233.373341
6 6 2.056717 0.254081 2.004317 0.244936 2.128085 2.089925 0.025475 0.023007 0.035926 0.038931 0.040308 0.031487 0.044802 0.038463 0.0 0.0 159 224.934 85.427499 84 460 227 271 2 0.507362 0.801921 0.702605 0.743406 0.388656 0.307349 1.443071 0.064813 0.037782 0.960654 0.475798 0.198953 0.211323 0.075591 0.088419 0.876541 0.746356 0.578257 0.496498 0.0005 0.17 583 810439 175011 721.00008 297.277773
7 7 1.980565 0.240194 1.962871 0.247953 2.056309 2.038054 0.024335 0.023682 0.035475 0.028279 0.036753 0.039402 0.043138 0.045329 0.0 0.0 184 199.75 63.037604 71 438 253 243 4 0.709711 0.730818 1.050833 0.624401 0.331796 0.203324 1.272419 0.052391 0.052032 0.982562 0.453649 0.125269 0.11675 0.056044 0.065202 0.901564 0.776227 0.608777 0.460589 0.0005 0.198333 680 939431 128992 609.804331 345.188357
8 8 1.934722 0.224824 1.926557 0.237877 2.009667 1.996964 0.023222 0.030667 0.031103 0.045675 0.036661 0.050567 0.042819 0.049246 0.0 0.0 209 205.662 65.041493 92 434 297 201 2 0.669525 0.741777 1.023188 0.622877 0.319223 0.180888 1.501361 0.056121 0.060203 0.982131 0.472652 0.203874 0.193068 0.091324 0.104065 0.890773 0.743353 0.584211 0.457012 0.0005 0.226667 777 1065506 126075 655.579999 390.819171
9 9 1.905073 0.210875 1.928705 0.205652 1.952473 1.973959 0.022604 0.02032 0.028725 0.02477 0.03595 0.031809 0.042882 0.040527 0.0 0.0 198 191.228 71.703305 79 462 234 265 1 0.450463 0.823819 0.639126 0.766002 0.38914 0.288425 1.395133 0.044485 0.032008 0.976307 0.470015 0.16768 0.161381 0.074023 0.086891 0.903919 0.743153 0.570822 0.480495 0.0005 0.255 873 1011396 160356 953.552464 371.128067
10 10 1.885457 0.202162 1.930772 0.194228 1.936023 1.970428 0.023354 0.021064 0.028933 0.025361 0.037461 0.033421 0.044752 0.041666 0.0 0.0 174 183.046 61.119096 70 460 255 240 5 0.657052 0.752794 1.023239 0.635632 0.362461 0.230714 1.293476 0.062379 0.061959 0.981138 0.480829 0.122319 0.116329 0.05668 0.06547 0.893507 0.749269 0.558663 0.493849 0.0005 0.283333 970 888992 131134 815.300974 326.27391
11 11 1.845871 0.191741 1.825237 0.184761 1.896899 1.87675 0.02634 0.023318 0.03383 0.02774 0.042235 0.03806 0.049186 0.044817 0.0 0.0 172 190.236 63.771156 70 456 274 226 0 0.661495 0.752031 1.068277 0.625464 0.433967 0.306839 1.176193 0.103805 0.096743 0.980855 0.551427 0.096549 0.095214 0.046351 0.053343 0.853389 0.680149 0.469069 0.555274 0.0005 0.311667 1067 878536 156968 1076.774026 322.572067
12 12 1.769169 0.18186 1.772084 0.176119 1.83411 1.844026 0.028092 0.024488 0.036672 0.028815 0.045173 0.039682 0.051839 0.047161 0.0 0.0 166 152.794 46.067033 73 468 249 250 1 0.535325 0.799832 0.891145 0.685752 0.475168 0.291591 1.142953 0.106931 0.078072 0.989762 0.506 0.09566 0.093859 0.051289 0.056117 0.880376 0.72019 0.516044 0.519486 0.0005 0.34 1163 848198 144673 1061.730934 311.39659
13 13 1.743754 0.158582 1.726435 0.153854 1.81128 1.799296 0.025183 0.021926 0.030586 0.025736 0.040735 0.035313 0.04855 0.042916 0.0 0.0 168 153.432 51.590012 68 478 231 268 1 0.579754 0.784928 0.941172 0.672857 0.472446 0.284693 1.125724 0.159865 0.117188 0.97555 0.534392 0.094677 0.101375 0.043354 0.04914 0.861803 0.653665 0.446512 0.536952 0.0005 0.368333 1260 857260 138054 1159.378612 314.991295
Correlations: 5 columns that move together (r² > 0.25).
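A minimal sketch of how an r² > 0.25 screen like this could be reproduced. The source does not say which correlation measure was used, so the squared Pearson coefficient here is an assumption, and the helper names `r_squared` and `correlated_pairs` are hypothetical. Only three columns are included, copied from the table above.

```python
# Three columns copied from the table above (one value per iteration).
columns = {
    "loss_policy_train": [3.864787, 3.054179, 2.473027, 2.306242, 2.120492,
                          2.056717, 1.980565, 1.934722, 1.905073, 1.885457,
                          1.845871, 1.769169, 1.743754],
    "game_length_avg": [354.264, 363.372, 306.762, 290.198, 245.736,
                        224.934, 199.75, 205.662, 191.228, 183.046,
                        190.236, 152.794, 153.432],
    "mcts_sims": [100, 197, 293, 390, 487, 583, 680, 777, 873, 970,
                  1067, 1163, 1260],
}


def r_squared(xs, ys):
    """Squared Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column (lr is constant in this run) has no defined
        # correlation; treat it as uncorrelated.
        return 0.0
    return cov * cov / (var_x * var_y)


def correlated_pairs(cols, threshold=0.25):
    """Return (name_a, name_b, r2) for every column pair above the threshold."""
    names = list(cols)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r2 = r_squared(cols[a], cols[b])
            if r2 > threshold:
                pairs.append((a, b, r2))
    return pairs


for a, b, r2 in correlated_pairs(columns):
    print(f"{a} ~ {b}: r^2 = {r2:.3f}")
```

With only 13 iterations, many columns trend monotonically with the iteration number, so a threshold of 0.25 is easy to clear and a high r² here may reflect a shared trend rather than a direct relationship.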
Outliers: 5 values more than 3σ from the mean.
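A minimal sketch of a 3σ check, assuming the mean and standard deviation are computed per column and that the sample standard deviation is used; the source states neither. The helper name `outliers` is hypothetical, and the two columns are copied from the table above.

```python
import statistics

# Two columns copied from the table above (one value per iteration).
loss_aux_value_2_train = [0.716512, 0.048863, 0.051183, 0.040325, 0.040508,
                          0.044802, 0.043138, 0.042819, 0.042882, 0.044752,
                          0.049186, 0.051839, 0.04855]
loss_value_train = [0.320826, 0.214878, 0.319545, 0.283898, 0.266878,
                    0.254081, 0.240194, 0.224824, 0.210875, 0.202162,
                    0.191741, 0.18186, 0.158582]


def outliers(values, k=3.0):
    """Return (index, value) pairs lying more than k sample std devs from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    if sd == 0:
        return []  # constant column: nothing can be an outlier
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) > k * sd]


for name, col in [("loss_aux_value_2_train", loss_aux_value_2_train),
                  ("loss_value_train", loss_value_train)]:
    # Index 0 corresponds to iteration 1.
    print(name, outliers(col))
```

With 13 values per column, the largest z-score the sample standard deviation allows is (n − 1)/√n ≈ 3.33, so a 3σ rule can only flag a single extreme value that dwarfs the rest of its column.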
Reinforcement learning training metrics tracking policy/value losses, game outcomes, and MCTS search over 13 iterations.

This dataset contains 13 records across 51 fields: iteration, loss_policy_train, loss_value_train, loss_policy_val, loss_value_val, loss_soft_policy_train, and 45 more. Each data row begins with a row number that is not one of the 51 fields, so a row holds 52 values.
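A minimal sketch of loading the table, assuming it is whitespace-delimited as shown above. Each data row there carries one more value than the header has names, so the sketch drops a leading row number when it sees that mismatch. The function name `parse_table` is hypothetical, and the excerpt is truncated to the first five fields and first two rows.

```python
def parse_table(header_line, data_lines):
    """Parse a whitespace-delimited table into a list of {field: float} dicts."""
    names = header_line.split()
    records = []
    for line in data_lines:
        tokens = line.split()
        if len(tokens) == len(names) + 1:
            tokens = tokens[1:]  # drop the leading row number
        if len(tokens) != len(names):
            raise ValueError(
                f"expected {len(names)} values, got {len(tokens)}: {line!r}"
            )
        records.append({name: float(tok) for name, tok in zip(names, tokens)})
    return records


# Truncated excerpt of the table above: first five fields, first two rows.
header = "iteration loss_policy_train loss_value_train loss_policy_val loss_value_val"
rows = [
    "1 1 3.864787 0.320826 3.18191 0.242259",
    "2 2 3.054179 0.214878 2.736785 0.18571",
]

records = parse_table(header, rows)
for rec in records:
    print(int(rec["iteration"]), rec["loss_policy_train"], rec["loss_policy_val"])
```

Every value is parsed as a float for simplicity; count-like fields such as iteration, game_wins, or mcts_sims could be converted to integers afterwards.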