AlphaZero-Style Self-Play Training Metrics (177 Iterations)

Policy loss, value loss, game length, and MCTS agreement tracked over 177 self-play iterations of AlphaZero-style reinforcement learning.
Columns: #, iteration, loss_policy_train, loss_value_train, loss_policy_val, loss_value_val, gradient_steps, game_length_avg, game_length_stddev, game_length_min, game_length_max, game_wins, game_losses, game_draws, policy_entropy_avg, policy_max_prob_avg, policy_entropy_high_branch_avg, policy_max_prob_high_branch_avg, policy_agreement_avg, policy_agreement_high_branch_avg, value_z_avg, value_q_avg, value_z_stddev, value_q_stddev, value_correction_avg, value_correction_high_branch_avg (16 further columns not shown).
1 1 2.407992 0.20431 2.360713 0.181675 186 345.903333 76.716452 117 412 140 147 13 1.314963 0.485351 1.94498 0.307379 0.176177 0.147149 0.047306 0.763073 0.066449 0.064318
2 2 1.82506 0.166375 1.863938 0.16579 618 351.956667 94.780807 102 484 122 168 10 0.790212 0.692195 1.087475 0.635492 0.458425 0.072638 0.07329 0.794482 0.65003 0.077136
3 3 1.644731 0.172948 1.715597 0.206918 999 277.036667 102.316805 80 482 150 144 6 0.67015 0.741367 0.921592 0.687925 0.467737 0.110777 0.085895 0.902588 0.706156 0.116371
4 4 1.582308 0.170746 1.646063 0.17154 1191 212.936667 72.154829 84 426 151 147 2 0.750256 0.713533 1.142597 0.595025 0.311209 0.067831 0.167423 0.969897 0.612455 0.219683
5 5 1.539906 0.15205 1.632733 0.159638 1419 215.606667 76.454029 81 438 157 139 4 0.702097 0.735178 1.055904 0.632314 0.380737 0.086472 0.145343 0.968282 0.670312 0.150867
6 6 1.507297 0.131911 1.551103 0.121757 1620 198.126667 64.505224 70 434 155 145 0 0.751824 0.71669 1.138837 0.60608 0.361705 0.084483 0.148725 0.978515 0.716511 0.154507
7 7 1.480466 0.11357 1.541228 0.124165 1857 208.216667 70.628628 93 458 134 165 1 0.701775 0.736917 1.074006 0.633138 0.404079 0.104553 0.114373 0.96429 0.7404 0.157397
8 8 1.462151 0.094938 1.540773 0.091184 2112 213.253333 73.210126 85 482 148 151 1 0.616802 0.767704 0.967106 0.66954 0.437802 0.086453 0.05579 0.966329 0.75334 0.172627
9 9 1.463118 0.170173 1.524992 0.162274 57 193.966667 64.059859 96 444 133 166 1 0.667878 0.750754 1.040098 0.647249 0.435774 0.077558 0.076283 0.983551 0.779841 0.163994
10 10 1.44892 0.132393 1.505293 0.123528 117 181.763333 66.689784 84 462 144 155 1 0.592014 0.776632 0.941136 0.679547 0.452339 0.064768 0.058417 0.976216 0.769504 0.162214
11 11 1.417094 0.11607 1.488111 0.11986 171 176.816667 51.234914 79 428 141 158 1 0.640494 0.759457 1.026294 0.649403 0.44316 0.086389 0.073781 0.992172 0.778276 0.171837
12 12 1.40421 0.102969 1.482327 0.130042 222 167.816667 51.082055 87 454 144 155 1 0.622973 0.766591 0.965944 0.670829 0.438075 0.085642 0.062645 0.986126 0.770032 0.176492
13 13 1.388012 0.109542 1.481974 0.12043 270 173.986667 51.378334 84 450 157 142 1 0.687521 0.743335 1.052557 0.637123 0.400263 0.108933 0.134833 0.982891 0.7455 0.204704
14 14 1.364462 0.092952 1.454948 0.097873 315 168.61 52.259014 83 434 141 157 2 0.663837 0.751615 1.002019 0.654931 0.408552 0.091132 0.148626 0.984811 0.734644 0.202724
15 15 1.337836 0.092199 1.437554 0.096812 360 169.423333 53.968918 78 450 157 143 0 0.656401 0.752929 0.983647 0.660661 0.41383 0.08956 0.110519 0.991501 0.756486 0.187203
16 16 1.310459 0.091747 1.423533 0.087746 411 173.656667 54.519955 83 468 168 131 1 0.638624 0.759601 0.981285 0.662887 0.430236 0.084393 0.106311 0.983002 0.758783 0.185784
17 17 1.286999 0.086561 1.41554 0.101455 459 164.363333 42.653855 75 294 149 151 0 0.588605 0.779407 0.924123 0.681955 0.43477 0.096031 0.085621 0.995378 0.744908 0.1925
18 18 1.272646 0.081012 1.404538 0.085419 510 168.093333 52.973685 80 448 134 165 1 0.573125 0.785469 0.879483 0.700039 0.436868 0.083669 0.061315 0.986707 0.751189 0.189503
19 19 1.25391 0.079185 1.387003 0.081274 558 158.44 51.38949 63 456 155 145 0 0.58085 0.78248 0.895219 0.692719 0.44574 0.086375 0.0712 0.98796 0.750711 0.187893
20 20 1.635647 0.345896 1.593549 0.250605 26 158.518 47.653139 81 450 254 246 0 0.6005 0.775025 0.909287 0.688188 0.429732 0.092849 0.081633 0.992904 0.747818 0.181044
21 21 1.55624 0.272088 1.534002 0.226235 52 155.618 44.457171 72 448 241 257 2 0.577885 0.784217 0.866142 0.70039 0.416983 0.085716 0.085451 0.986349 0.668385 0.133429
22 22 1.516689 0.23125 1.513051 0.194863 79 156.068 46.935992 70 458 269 231 0 0.583886 0.781094 0.844942 0.711643 0.434256 0.10547 0.090314 0.989173 0.703908 0.124203
23 23 1.492431 0.203033 1.497127 0.175039 108 155.426 59.069794 67 472 257 243 0 0.535536 0.799164 0.812534 0.72207 0.458156 0.088717 0.070105 0.974474 0.700028 0.140025
24 24 1.479463 0.183633 1.484884 0.158837 133 149.096 48.087408 72 476 260 239 1 0.56872 0.787651 0.862148 0.704561 0.444825 0.100412 0.075662 0.984681 0.713049 0.141701
25 25 1.467135 0.166041 1.468577 0.150338 158 143.742 42.414472 65 440 268 230 2 0.600219 0.77575 0.877914 0.699172 0.440465 0.115806 0.11227 0.982272 0.705627 0.133342
26 26 1.455291 0.160098 1.464196 0.146918 182 146.654 38.670328 71 329 259 241 0 0.598359 0.776127 0.895271 0.691749 0.436944 0.08744 0.110627 0.99617 0.708739 0.14312
27 27 1.449591 0.149152 1.449721 0.140612 207 137.96 37.518134 65 330 246 254 0 0.523905 0.804521 0.79572 0.727769 0.466838 0.080063 0.063046 0.99679 0.701501 0.139584
28 28 1.438782 0.141813 1.45783 0.129802 231 144.854 38.936316 75 332 251 249 0 0.5758 0.785609 0.876261 0.698476 0.444752 0.086718 0.087272 0.996233 0.703593 0.148153
29 29 1.433784 0.133265 1.442865 0.121089 257 145.056 39.886048 75 375 250 250 0 0.584125 0.779132 0.84563 0.710208 0.451232 0.121855 0.10753 0.992548 0.730062 0.12869
30 30 1.424715 0.120485 1.426066 0.109924 281 142.248 38.539622 72 456 243 257 0 0.597834 0.777374 0.901512 0.691854 0.457732 0.118416 0.107999 0.989923 0.725657 0.13122
31 31 1.416119 0.115314 1.424774 0.106615 305 142.062 40.389481 70 460 253 247 0 0.580568 0.78403 0.873009 0.70064 0.463206 0.109231 0.112256 0.987491 0.711374 0.135169
32 32 1.40741 0.109786 1.420908 0.102778 330 141.234 34.877088 70 440 279 221 0 0.598987 0.77555 0.886448 0.697542 0.463733 0.127788 0.125232 0.989487 0.730108 0.129943
33 33 1.402115 0.105741 1.41019 0.10433 353 135.132 33.641144 72 309 248 252 0 0.588448 0.779436 0.862214 0.704234 0.456191 0.130589 0.115615 0.991437 0.722655 0.12489
34 34 1.395507 0.101224 1.407022 0.096257 377 138.784 37.037729 74 440 260 239 1 0.6225 0.768376 0.92883 0.681511 0.452697 0.123334 0.135222 0.987301 0.72426 0.129967
35 35 1.389048 0.097046 1.404308 0.091622 399 136.612 35.35717 73 298 276 224 0 0.590707 0.779545 0.87511 0.699566 0.451224 0.125762 0.13098 0.99206 0.708763 0.133565
36 36 1.382245 0.093552 1.389348 0.087264 421 130.542 30.995229 70 287 259 241 0 0.586051 0.783001 0.875933 0.700615 0.469231 0.140842 0.14755 0.990032 0.716348 0.121767
37 37 1.374056 0.090479 1.383406 0.085172 445 135.946 36.490753 74 408 249 250 1 0.605865 0.773457 0.890263 0.695265 0.45975 0.139051 0.146275 0.986173 0.717547 0.123241
38 38 1.369268 0.087104 1.385413 0.085036 468 135.168 32.300771 64 269 276 224 0 0.613469 0.77052 0.895294 0.69344 0.463305 0.153456 0.143658 0.988155 0.73219 0.118592
39 39 1.365429 0.084522 1.384179 0.080782 491 135.916 35.186659 69 289 250 250 0 0.600579 0.773522 0.86318 0.70383 0.457518 0.137555 0.140415 0.990494 0.721086 0.123793
40 40 1.359174 0.082675 1.419101 0.081544 516 136.146 37.638181 61 468 251 249 0 0.570385 0.787091 0.851999 0.708933 0.481598 0.120965 0.119263 0.989179 0.732243 0.113031
41 41 1.355472 0.080755 1.366543 0.077799 540 137.104 44.487315 72 472 266 231 3 0.567181 0.787249 0.831141 0.71472 0.461986 0.102199 0.117119 0.979497 0.707141 0.122474
42 42 1.350257 0.079082 1.370358 0.077349 563 130.656 34.155785 69 446 252 248 0 0.583279 0.783145 0.865072 0.703964 0.474576 0.122687 0.126228 0.989221 0.722216 0.110764
43 43 1.345581 0.076946 1.361307 0.0741 585 126.99 30.040271 66 356 271 229 0 0.561987 0.789689 0.850113 0.707985 0.468453 0.123663 0.1095 0.992324 0.719087 0.11322
44 44 1.340379 0.074895 1.351114 0.072543 607 126.574 32.168937 56 329 269 231 0 0.561237 0.790812 0.827994 0.716544 0.469222 0.125492 0.118993 0.992095 0.709313 0.113853
45 45 1.337022 0.073621 1.348928 0.080853 630 127.096 35.480005 58 424 256 243 1 0.538079 0.798386 0.799783 0.7244 0.469758 0.106475 0.101663 0.990462 0.700774 0.114908
46 46 1.333747 0.072344 1.34512 0.072535 653 120.612 26.850502 58 214 262 238 0 0.497274 0.81394 0.742504 0.744389 0.502042 0.11585 0.091239 0.993267 0.721473 0.105089
47 47 1.329386 0.071233 1.343563 0.070125 676 125.844 33.370341 68 450 254 245 1 0.518304 0.805832 0.770566 0.734606 0.48399 0.116998 0.101534 0.987839 0.710857 0.107984
48 48 1.417927 0.041017 1.39409 0.034073 22 126.474 31.189763 67 360 281 219 0 0.563907 0.790047 0.855867 0.705495 0.482078 0.139553 0.139521 0.990215 0.726657 0.10413
49 49 1.345396 0.029261 1.330821 0.027969 42 117.076 25.347549 59 258 256 244 0 0.535568 0.801546 0.808536 0.719903 0.485718 0.168965 0.147511 0.985622 0.677682 0.081421
50 50 1.287963 0.025475 1.2919 0.025217 63 116.926 27.187213 60 231 249 251 0 0.511622 0.810156 0.779749 0.72872 0.509976 0.14422 0.136344 0.989546 0.674869 0.075818
51 51 1.249573 0.022418 1.261851 0.025329 85 113.54 30.021666 67 458 268 231 1 0.499741 0.815599 0.767092 0.73428 0.538795 0.154681 0.135768 0.981614 0.684229 0.067932
52 52 1.239862 0.020708 1.238892 0.020689 105 106.894 23.381761 60 205 265 235 0 0.460949 0.82778 0.703409 0.754329 0.531776 0.136055 0.108569 0.990701 0.689815 0.070366
53 53 1.216658 0.018959 1.227611 0.018838 125 107.624 27.247433 57 375 282 218 0 0.488484 0.818383 0.72863 0.746468 0.526877 0.145199 0.122105 0.989402 0.668965 0.065449
54 54 1.195814 0.017443 1.193627 0.017818 145 110.556 25.541943 59 273 278 222 0 0.558829 0.791464 0.815475 0.717153 0.522584 0.183324 0.170542 0.983053 0.69147 0.061695
55 55 1.181853 0.016167 1.187253 0.01723 165 109.55 27.683777 66 462 269 231 0 0.521735 0.805673 0.763296 0.734004 0.542278 0.174444 0.153388 0.98022 0.694784 0.057881
56 56 1.164838 0.015336 1.174144 0.016499 184 111.584 25.774618 56 220 260 240 0 0.529227 0.804215 0.796142 0.721893 0.545674 0.182881 0.164911 0.983135 0.6623 0.061193
57 57 1.150243 0.014417 1.165659 0.01481 203 111.512 24.705098 59 210 270 230 0 0.552733 0.795296 0.812431 0.716727 0.538387 0.180791 0.183029 0.983521 0.670567 0.061576
58 58 1.132961 0.012485 1.13431 0.011784 202 105.29 23.849149 54 215 255 245 0 0.454469 0.829633 0.677598 0.761883 0.549192 0.132371 0.109861 0.9912 0.658344 0.062526
59 59 1.118882 0.011485 1.13125 0.011226 202 109.772 32.213289 60 452 262 236 2 0.47546 0.821389 0.694657 0.754939 0.562122 0.1604 0.139924 0.976591 0.662826 0.05635
60 60 1.106358 0.010823 1.125652 0.011466 201 107.118 23.363563 58 224 285 215 0 0.512392 0.808068 0.748659 0.737158 0.569741 0.186862 0.16779 0.982386 0.672253 0.049562
61 61 1.090359 0.010652 1.102122 0.009972 200 111.464 25.377957 55 199 280 220 0 0.496226 0.815028 0.751153 0.736458 0.56797 0.170696 0.154498 0.985324 0.654892 0.056282
62 62 1.071331 0.009812 1.080383 0.00989 199 106.95 21.939633 59 202 265 235 0 0.524397 0.804337 0.765595 0.731618 0.575487 0.185444 0.168536 0.982655 0.67899 0.048331
63 63 1.056726 0.009427 1.073534 0.009874 198 107.416 25.155416 55 277 284 216 0 0.504803 0.811518 0.747197 0.737013 0.568766 0.177427 0.166104 0.984134 0.660361 0.052602
64 64 1.048425 0.00913 1.051124 0.009945 199 108.988 22.918723 59 225 286 214 0 0.483193 0.817827 0.687505 0.757601 0.575568 0.175175 0.164564 0.984537 0.663841 0.051675
65 65 1.02601 0.008905 1.04511 0.009039 199 111.218 27.413837 59 388 257 243 0 0.504086 0.810599 0.743311 0.736601 0.578276 0.182855 0.174913 0.98314 0.650485 0.051579
66 66 1.019539 0.008703 1.030799 0.008047 201 106.626 22.892228 57 219 291 209 0 0.460647 0.825981 0.670021 0.761779 0.580847 0.177696 0.147334 0.984085 0.662885 0.051503
67 67 1.006764 0.008114 1.009292 0.00907 202 108.81 23.474622 52 202 289 211 0 0.489778 0.81464 0.720655 0.745099 0.603303 0.215343 0.188263 0.976539 0.667405 0.044213
68 68 0.991884 0.007757 1.0042 0.007348 202 111.698 24.702121 62 243 275 225 0 0.521644 0.803532 0.753913 0.732835 0.578887 0.194761 0.181715 0.980851 0.646584 0.048742
69 69 0.981284 0.007379 0.989306 0.007219 203 112.482 31.14729 65 309 305 195 0 0.476694 0.817953 0.691294 0.753563 0.59466 0.183895 0.172383 0.982946 0.658418 0.04715
70 70 0.974126 0.007237 0.98859 0.007193 204 106.928 22.687504 58 196 268 232 0 0.450194 0.829242 0.663191 0.762485 0.607894 0.184096 0.152382 0.982908 0.648774 0.046888
71 71 0.968922 0.007127 0.979764 0.006759 204 108.238 23.357597 58 195 269 231 0 0.474681 0.820149 0.683056 0.757493 0.600373 0.185426 0.174246 0.982658 0.643669 0.047464
72 72 0.958206 0.006835 0.969687 0.006537 205 109.958 23.563112 65 189 263 237 0 0.491018 0.814198 0.703647 0.749904 0.608275 0.202281 0.184746 0.979327 0.658843 0.042986
73 73 0.948706 0.006822 0.960066 0.006602 207 109.76 24.232755 56 202 262 238 0 0.478748 0.818316 0.699574 0.749739 0.612425 0.200534 0.188801 0.979687 0.647938 0.043876
74 74 0.940282 0.00656 0.942494 0.00599 206 106.736 24.869224 56 262 274 226 0 0.451258 0.829271 0.66248 0.76293 0.607396 0.173362 0.161285 0.984858 0.628874 0.047403
75 75 0.933999 0.006307 0.937166 0.005948 207 106.106 22.159936 61 182 269 231 0 0.466405 0.822481 0.672041 0.759317 0.619722 0.210275 0.184593 0.977642 0.6512 0.042919
76 76 0.927803 0.006076 0.933055 0.00593 207 107.74 24.216697 53 214 274 226 0 0.453595 0.827617 0.64253 0.770212 0.622055 0.187377 0.167234 0.982288 0.653436 0.04298
77 77 0.921275 0.006051 0.925338 0.005692 207 106.91 24.106387 55 237 269 231 0 0.454296 0.827022 0.6574 0.763887 0.614935 0.181879 0.165066 0.983321 0.641329 0.04469
78 78 0.914753 0.00589 0.924121 0.006543 208 107.176 27.053891 55 448 284 215 1 0.468477 0.822327 0.682376 0.755055 0.618666 0.196115 0.179519 0.973816 0.645064 0.042206
79 79 0.912348 0.005963 0.917488 0.005361 207 105.646 23.097114 57 203 297 203 0 0.449206 0.828292 0.633592 0.772715 0.623761 0.197996 0.174875 0.980203 0.659843 0.042817
80 80 0.905129 0.005927 0.919205 0.005354 208 105.736 22.461574 61 209 276 224 0 0.440907 0.832194 0.63913 0.770855 0.627866 0.181264 0.163658 0.983435 0.646928 0.042089
81 81 0.900703 0.005747 0.915968 0.005488 208 104.39 22.62189 61 214 271 229 0 0.438 0.833319 0.631104 0.773232 0.628766 0.198812 0.170502 0.980038 0.63897 0.042167
82 82 0.896871 0.005572 0.903751 0.005734 207 104.654 22.978039 57 184 277 223 0 0.429192 0.836331 0.623797 0.774938 0.633698 0.19355 0.167621 0.98109 0.639416 0.042717
83 83 0.89366 0.005606 0.911204 0.005809 207 106.018 23.237162 48 243 267 233 0 0.455795 0.826453 0.649079 0.766595 0.629693 0.202722 0.184159 0.979236 0.64954 0.041536
84 84 0.891766 0.005552 0.898335 0.005353 206 106.054 22.166801 57 191 277 223 0 0.445829 0.830744 0.65423 0.765055 0.623667 0.185483 0.166134 0.982647 0.622993 0.044233
85 85 0.8878 0.005703 0.903053 0.005169 207 107.226 26.377167 55 266 269 231 0 0.435047 0.833407 0.621582 0.776381 0.628669 0.188733 0.168466 0.982029 0.636478 0.042503
86 86 0.883372 0.00559 0.900034 0.0054 206 104.258 27.563662 56 426 255 244 1 0.443405 0.831193 0.626786 0.774517 0.629718 0.194865 0.179582 0.975927 0.642202 0.041945
87 87 0.994733 0.004434 0.993952 0.00437 21 104.856 22.588831 57 215 282 218 0 0.443531 0.830888 0.620889 0.776411 0.628733 0.188621 0.16412 0.98205 0.639777 0.04006
88 88 0.952073 0.004736 0.957957 0.004837 43 108.492 27.366877 49 351 262 238 0 0.441165 0.83226 0.627388 0.775114 0.63347 0.178186 0.168242 0.983997 0.613937 0.04141
89 89 0.944124 0.004773 0.952224 0.004667 64 105.77 24.823318 54 230 265 235 0 0.412075 0.842267 0.591282 0.786587 0.641164 0.177442 0.152155 0.984131 0.628312 0.041423
90 90 0.927422 0.004555 0.930428 0.004501 85 106.24 24.770838 56 231 286 214 0 0.45189 0.828684 0.639205 0.771147 0.643611 0.205202 0.181003 0.97872 0.647069 0.036781
91 91 0.920805 0.004751 0.927769 0.005239 106 102.582 22.3176 54 198 287 213 0 0.420843 0.838889 0.596145 0.784869 0.631653 0.179058 0.146309 0.983838 0.609842 0.040782
92 92 0.921096 0.004941 0.926794 0.004561 126 101.766 23.53158 58 216 283 217 0 0.397598 0.848388 0.556245 0.799732 0.635949 0.172883 0.14256 0.984942 0.619738 0.043367
93 93 0.903853 0.004715 0.905873 0.004714 146 101.232 21.60005 56 206 284 216 0 0.434669 0.835264 0.632259 0.77235 0.63783 0.187846 0.1644 0.982199 0.608925 0.039774
94 94 0.898365 0.004739 0.901329 0.004694 167 103.33 26.276626 52 318 271 229 0 0.402849 0.845869 0.580094 0.790596 0.634348 0.168564 0.149073 0.985691 0.608937 0.041636
95 95 0.885494 0.004778 0.896657 0.005026 187 106.938 24.352539 51 185 295 205 0 0.445843 0.830489 0.63488 0.772673 0.639496 0.186306 0.171379 0.982492 0.620603 0.039846
96 96 0.873009 0.004751 0.900855 0.004625 207 103.282 22.185907 59 199 265 235 0 0.468463 0.822827 0.671876 0.75894 0.638877 0.214858 0.197605 0.976645 0.618792 0.038811
97 97 0.876918 0.004861 0.876009 0.005219 206 104.8 26.198626 46 380 284 216 0 0.4488 0.830013 0.63871 0.77098 0.629974 0.19783 0.177693 0.980236 0.612491 0.040386
98 98 0.877517 0.004823 0.881616 0.004303 204 99.048 22.670635 59 194 268 232 0 0.394516 0.849514 0.566945 0.794961 0.644592 0.177411 0.146701 0.984137 0.605502 0.041302
99 99 0.86834 0.005026 0.878584 0.00446 202 101.674 21.614433 59 201 285 215 0 0.446072 0.830697 0.637442 0.771035 0.639255 0.183766 0.16687 0.98297 0.615435 0.037757
100 100 0.867111 0.004837 0.874864 0.004299 202 102.256 23.001532 59 241 288 212 0 0.429856 0.836279 0.597519 0.784638 0.635025 0.186586 0.164839 0.982439 0.625175 0.039268
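
The rows above can be loaded for quick inspection with a few lines of Python. The following is a minimal sketch, not the dataset author's tooling. It assumes that the first fourteen whitespace-separated fields of each row follow the listed column order from # through game_draws; the later fields are left unparsed. The helper names (parse_row, find_step_resets, decisive_win_rate) and the 0.5 drop threshold are illustrative choices. A sharp fall in gradient_steps between consecutive iterations may mark a counter reset or a new training phase, but the source does not say which.

```python
from typing import Dict, List

# Assumed order of the first fourteen fields in each data row.
COLUMNS = [
    "row", "iteration",
    "loss_policy_train", "loss_value_train",
    "loss_policy_val", "loss_value_val",
    "gradient_steps",
    "game_length_avg", "game_length_stddev",
    "game_length_min", "game_length_max",
    "game_wins", "game_losses", "game_draws",
]


def parse_row(line: str) -> Dict[str, float]:
    """Parse the leading fields of one row into a name -> value mapping."""
    # Drop thousands separators (e.g. "1,191") if any are present.
    fields = line.replace(",", "").split()
    return {name: float(value) for name, value in zip(COLUMNS, fields)}


def find_step_resets(rows: List[Dict[str, float]], ratio: float = 0.5) -> List[int]:
    """Return iterations whose gradient_steps fall below ratio * previous value."""
    resets = []
    for prev, cur in zip(rows, rows[1:]):
        if cur["gradient_steps"] < ratio * prev["gradient_steps"]:
            resets.append(int(cur["iteration"]))
    return resets


def decisive_win_rate(row: Dict[str, float]) -> float:
    """Share of wins among games that did not end in a draw."""
    decisive = row["game_wins"] + row["game_losses"]
    return row["game_wins"] / decisive if decisive else float("nan")


# Leading fields of iterations 19-21, copied from the table.
SAMPLE = [
    "19 19 1.25391 0.079185 1.387003 0.081274 558 158.44 51.38949 63 456 155 145 0",
    "20 20 1.635647 0.345896 1.593549 0.250605 26 158.518 47.653139 81 450 254 246 0",
    "21 21 1.55624 0.272088 1.534002 0.226235 52 155.618 44.457171 72 448 241 257 2",
]

rows = [parse_row(line) for line in SAMPLE]
resets = find_step_resets(rows)
win_rates = [round(decisive_win_rate(r), 3) for r in rows]
print("gradient_steps drops at iterations:", resets)
print("decisive win rates:", win_rates)
```

Passing all 100 rows to find_step_resets instead of the three sample lines applies the same check to the whole table. The ratio threshold keeps small fluctuations in gradient_steps from being reported as drops.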