Catan RL Training — Automatic Play

iteration loss_policy_train loss_value_train loss_policy_val loss_value_val gradient_steps game_length_avg game_length_stddev game_length_min game_length_max game_p1_wins game_p2_wins game_draws policy_entropy_avg policy_max_prob_avg policy_entropy_high_branch_avg policy_max_prob_high_branch_avg policy_agreement_avg value_z_avg value_q_avg value_z_stddev value_q_stddev value_correction_avg value_q_spread_avg value_error_early_avg value_error_late_avg value_network_stddev bench_wins bench_losses bench_draws lr q_weight mcts_sims replay_samples samples_iter time_selfplay_secs time_train_secs time_bench_secs
1 1 2.177857 0.124353 2.182894 0.12368 753 391.82 28.621686 261 402 62 76 12 0.61276 0.750411 0.642879 0.764643 0.312216 0.056454 -0.03952 0.59048 0.081665 0.105695 0.062848 0.54799 0.562514 0.018912 0 0 0 0.001 0.01 400 80233 80233 132.927991 40.621692 0.0
2 2 1.902223 0.091472 1.929586 0.096489 1950 398.213333 19.146309 253 402 56 81 13 0.678358 0.73538 0.967341 0.654916 0.584229 0.022963 0.058943 0.52021 0.420487 0.115382 0.059854 0.397581 0.35226 0.437281 0 0 0 0.001 0.02 404 207719 127486 206.234833 91.554848 0.0
3 3 1.791844 0.066477 1.781964 0.062388 3108 393.44 29.75802 208 402 61 74 15 0.542632 0.787347 0.74944 0.733701 0.52306 0.022825 -0.039515 0.559324 0.369987 0.135705 0.082413 0.44469 0.346395 0.403209 0 0 0 0.001 0.03 408 331267 123548 204.988928 146.1988 0.0
4 4 1.757926 0.058006 1.778482 0.052806 4194 350.506667 65.948338 180 402 52 88 10 0.60345 0.763272 0.83375 0.701288 0.51033 0.007438 0.002922 0.754719 0.443414 0.126479 0.073304 0.623355 0.531046 0.477275 0 0 0 0.001 0.04 412 447288 116021 192.218529 196.032728 0.0
5 5 1.73822 0.046324 1.773433 0.041718 5208 324.826667 80.868762 143 402 61 79 10 0.517782 0.79729 0.706892 0.746141 0.452112 0.026215 -0.039466 0.792385 0.502297 0.156855 0.090201 0.656838 0.536687 0.543355 0 0 0 0.001 0.05 416 555509 108221 178.446632 243.443996 0.0
6 6 1.723562 0.03834 1.744324 0.033786 6192 325.1 76.262245 113 402 80 66 4 0.53279 0.791909 0.756237 0.729138 0.466711 0.021808 -0.023995 0.81681 0.588114 0.161964 0.093601 0.736244 0.552058 0.632369 0 0 0 0.001 0.06 420 660469 104960 170.425003 289.131173 0.0
7 7 1.701725 0.033403 1.717879 0.036099 7035 295.6 78.039819 136 402 68 79 3 0.530392 0.792313 0.750406 0.731045 0.405151 0.01585 -0.022885 0.892591 0.61681 0.181807 0.110216 0.618008 0.525172 0.66685 0 0 0 0.001 0.07 424 750090 89621 157.841815 329.363986 0.0
8 8 1.692267 0.030236 1.707962 0.029044 7902 313.353333 80.451156 138 402 71 74 5 0.645966 0.745311 0.947782 0.658325 0.424487 0.028609 -0.006869 0.854752 0.688194 0.187782 0.106271 0.70025 0.526715 0.739344 0 0 0 0.001 0.08 428 842875 92785 158.513472 369.20086 0.0
9 9 1.675097 0.028836 1.693705 0.028374 8850 330.06 76.975947 160 402 61 80 9 0.51755 0.794526 0.75025 0.729088 0.469614 0.016469 -0.028901 0.771098 0.626623 0.197508 0.115065 0.66364 0.511526 0.682029 0 0 0 0.001 0.09 432 943794 100919 167.87355 413.524335 0.0
10 10 1.665112 0.028406 1.693577 0.031795 9633 297.633333 81.486229 151 402 56 91 3 0.552228 0.781994 0.794611 0.711062 0.38939 0.015664 0.004171 0.864636 0.578044 0.207535 0.116984 0.744521 0.529199 0.645008 0 0 0 0.001 0.1 436 1027489 83695 154.304071 449.613322 0.0
11 11 1.646423 0.028127 1.685998 0.030211 9663 287.626667 83.639827 134 402 76 69 5 0.571375 0.77349 0.809297 0.706168 0.420659 0.040658 -0.003013 0.885059 0.613561 0.210771 0.119772 0.740728 0.58017 0.678145 0 0 0 0.001 0.11 440 1030561 83305 153.096274 451.781107 0.0
12 12 1.64891 0.028125 1.681327 0.026443 9156 259.546667 76.436953 111 402 62 85 3 0.600805 0.763035 0.877221 0.680734 0.3954 0.03175 0.022479 0.937728 0.620071 0.216956 0.122459 0.816338 0.566607 0.687159 0 0 0 0.001 0.12 444 976555 73480 145.801105 428.110759 0.0
13 13 1.659477 0.029237 1.686066 0.029281 8766 277.053333 80.906018 124 402 65 81 4 0.629509 0.750777 0.89267 0.678505 0.406835 0.026879 -0.003832 0.898393 0.649197 0.195405 0.111542 0.785577 0.513682 0.709877 0 0 0 0.001 0.13 448 935023 82016 160.66426 410.00575 0.0
14 14 1.659065 0.029311 1.687179 0.027554 8463 278.88 80.705714 93 402 66 83 1 0.558252 0.778772 0.800991 0.710749 0.421479 0.042855 -0.019963 0.925287 0.626701 0.208242 0.123048 0.761978 0.574806 0.693063 0 0 0 0.001 0.14 452 902598 83596 152.796436 395.928797 0.0
15 15 1.651715 0.028745 1.699153 0.027429 8220 273.846667 78.767484 119 402 75 71 4 0.546855 0.783184 0.80513 0.709757 0.404153 0.014504 -0.039314 0.917848 0.63841 0.226237 0.129678 0.75782 0.529304 0.713523 0 0 0 0.001 0.15 456 876529 82152 156.620852 384.916445 0.0
16 16 1.63863 0.028677 1.682146 0.032831 8013 273.353333 85.202906 125 402 79 65 6 0.660473 0.740938 0.97604 0.6558 0.444775 0.027763 0.006443 0.878953 0.706826 0.185692 0.104613 0.646433 0.430996 0.760794 0 0 0 0.000999 0.16 460 854719 83150 156.167948 375.117337 0.0
17 17 1.638164 0.028545 1.689154 0.032277 7881 261.613333 73.666436 127 402 73 74 3 0.641093 0.748648 0.941923 0.661926 0.398344 0.042384 0.017714 0.937882 0.660024 0.216971 0.12242 0.811684 0.597283 0.728293 0 0 0 0.000999 0.17 464 840445 75347 143.609766 368.284822 0.0
18 18 1.629325 0.029442 1.667642 0.032771 7704 260.24 73.388072 106 402 81 68 1 0.612184 0.759373 0.888034 0.681898 0.418145 0.04637 0.011882 0.965223 0.668369 0.204363 0.116589 0.822869 0.597258 0.729912 0 0 0 0.000999 0.18 468 821620 73960 142.026246 360.452244 0.0
19 19 1.629657 0.02975 1.674164 0.030596 7497 264.986667 75.536215 119 402 94 53 3 0.642174 0.747584 0.949825 0.660774 0.406496 0.031596 -0.008726 0.917729 0.679771 0.207674 0.116379 0.735946 0.530219 0.744985 0 0 0 0.000999 0.19 472 799462 78761 164.097355 351.170964 0.0
20 20 1.621956 0.02973 1.666982 0.030287 7479 263.773333 82.787612 126 402 83 66 1 0.529097 0.793035 0.798471 0.713427 0.430797 0.038585 -0.035744 0.940727 0.634446 0.235682 0.13385 0.887128 0.590103 0.713741 5 15 0 0.000999 0.2 476 797548 81781 157.645028 350.29052 70.323937
21 21 1.615424 0.029962 1.66912 0.035078 7452 262.74 74.795938 104 402 66 82 2 0.578587 0.772986 0.850936 0.697067 0.417778 0.035451 -0.003737 0.92419 0.666441 0.214449 0.12123 0.824101 0.583499 0.732687 0 0 0 0.000999 0.21 480 794635 80392 151.519962 348.596229 0.0
22 22 1.610563 0.030043 1.657237 0.031245 7413 235.246667 71.42912 114 402 83 64 3 0.593209 0.768367 0.889448 0.680834 0.410274 0.041596 0.020359 0.966012 0.650519 0.221841 0.126922 0.856921 0.599133 0.720751 0 0 0 0.000999 0.22 484 790477 69322 153.447157 346.742354 0.0
23 23 1.601696 0.03038 1.638209 0.028958 7287 232.48 71.782609 92 402 93 55 2 0.546446 0.786669 0.816034 0.709917 0.386949 0.024405 -0.033145 0.956347 0.651602 0.231087 0.13777 0.827723 0.593883 0.72616 0 0 0 0.000999 0.23 488 777116 68655 161.697546 340.763231 0.0
24 24 1.594087 0.030507 1.648252 0.030441 7176 245.08 73.087301 109 402 73 77 0 0.587671 0.771454 0.884587 0.68652 0.393034 0.048797 -0.033028 0.957077 0.666394 0.222219 0.131039 0.82538 0.552867 0.735619 0 0 0 0.000999 0.24 492 765440 71920 152.399912 335.950243 0.0
25 25 1.59305 0.030947 1.648104 0.033068 7095 249.493333 69.892083 106 402 93 54 3 0.592466 0.768898 0.877182 0.688424 0.397009 0.027363 -0.016728 0.949702 0.669554 0.213836 0.123635 0.823397 0.516871 0.736685 0 0 0 0.000999 0.25 496 756581 73293 146.124654 333.190658 0.0
26 26 1.587472 0.030964 1.635905 0.031902 6921 230.426667 68.432677 111 402 77 73 0 0.569069 0.778745 0.840771 0.700031 0.358397 0.017939 -0.025659 0.992282 0.659718 0.213555 0.126793 0.818489 0.552041 0.722156 0 0 0 0.000999 0.26 500 738094 64663 143.481959 323.247039 0.0
27 27 1.578224 0.03067 1.635418 0.029875 6936 248.5 74.729891 116 402 70 79 1 0.626867 0.756123 0.916041 0.677749 0.412743 0.042321 -0.000995 0.963463 0.687066 0.180408 0.10544 0.729547 0.490691 0.73823 0 0 0 0.000998 0.27 504 739589 76842 154.627389 325.439352 0.0
28 28 1.57372 0.031413 1.63087 0.031917 6897 230.593333 67.78339 93 402 69 79 2 0.524596 0.795776 0.771164 0.726547 0.386382 0.02484 -0.011718 0.972239 0.631042 0.218694 0.127754 0.880288 0.563812 0.698448 0 0 0 0.000998 0.28 508 735376 69747 158.093002 322.912704 0.0
29 29 1.567569 0.031121 1.635012 0.032751 6816 238.366667 71.386219 102 402 83 66 1 0.580231 0.776023 0.870991 0.69433 0.383891 0.020395 -0.020487 0.955869 0.671013 0.199377 0.116749 0.828539 0.501843 0.727814 0 0 0 0.000998 0.29 512 727023 70408 152.035249 319.083833 0.0
30 30 1.570939 0.031684 1.624119 0.032922 6663 234.486667 65.223384 124 402 79 69 2 0.625441 0.757914 0.929837 0.670856 0.366303 0.039829 0.014345 0.974348 0.669091 0.206998 0.120057 0.830495 0.501232 0.730588 0 0 0 0.000998 0.3 516 710434 65192 143.237232 311.937701 0.0
31 31 1.566704 0.032022 1.624758 0.031253 6537 228.293333 69.660562 107 402 67 83 0 0.540945 0.790562 0.791185 0.720977 0.372057 0.030218 -0.010836 0.977065 0.625018 0.199681 0.11641 0.846948 0.610617 0.682621 0 0 0 0.000998 0.31 520 697072 67030 143.884323 306.054482 0.0
32 32 1.56147 0.032651 1.619309 0.034104 6606 251.42 71.775973 127 402 78 72 0 0.532251 0.792788 0.777483 0.724596 0.395102 0.014569 -0.023057 0.955358 0.629825 0.206405 0.120189 0.820394 0.56808 0.691144 0 0 0 0.000998 0.32 524 704353 76603 162.613905 310.39337 0.0
33 33 1.567229 0.032542 1.617486 0.034484 6633 242.533333 75.32407 120 402 72 78 0 0.590177 0.770619 0.868569 0.692376 0.397274 0.043553 0.010237 0.955893 0.634313 0.194086 0.109601 0.813757 0.600268 0.691735 0 0 0 0.000998 0.33 528 707301 71603 164.728006 312.173662 0.0
34 34 1.560991 0.032652 1.614261 0.032424 6606 219.433333 65.408401 107 402 90 60 0 0.534714 0.792152 0.765159 0.730709 0.399479 0.014754 -0.032554 0.97484 0.632762 0.196549 0.114687 0.866354 0.59685 0.683481 0 0 0 0.000998 0.34 532 704516 69135 167.60455 309.253722 0.0
35 35 1.558181 0.032277 1.636611 0.034991 6561 230.406667 72.929381 106 402 74 75 1 0.561953 0.781957 0.820342 0.71103 0.374823 0.034984 -0.019448 0.98091 0.648781 0.193036 0.111907 0.803687 0.536948 0.698974 0 0 0 0.000997 0.35 536 699626 68403 153.812481 307.578315 0.0
36 36 1.558286 0.032042 1.617023 0.032411 6594 226.16 66.215213 97 402 86 63 1 0.59112 0.772012 0.858764 0.699205 0.399008 0.029936 -0.009698 0.973595 0.64437 0.183551 0.105577 0.750106 0.536973 0.693861 0 0 0 0.000997 0.36 540 703325 68362 150.663214 308.813549 0.0
37 37 1.559312 0.033397 1.617 0.032549 6423 214.846667 62.215511 109 402 79 70 1 0.58052 0.776317 0.8452 0.70137 0.359487 0.035577 0.012701 0.987165 0.619705 0.193674 0.111035 0.895644 0.58045 0.671031 0 0 0 0.000997 0.37 544 684919 58436 131.986347 301.543617 0.0
38 38 1.558566 0.033686 1.621288 0.036082 6390 223.726667 63.07402 112 402 71 78 1 0.518799 0.799169 0.753616 0.732456 0.386646 0.017054 -0.014122 0.980326 0.605653 0.18807 0.109872 0.922996 0.634522 0.654013 0 0 0 0.000997 0.38 548 681491 66319 158.740409 298.77985 0.0
39 39 1.55684 0.033339 1.628934 0.035311 6363 226.926667 65.029849 109 402 83 66 1 0.510535 0.803212 0.73773 0.740378 0.365296 0.026106 -0.020866 0.980479 0.610263 0.190622 0.111634 0.839489 0.584141 0.655357 0 0 0 0.000997 0.39 552 678442 67359 161.806134 297.924591 0.0
40 40 1.554609 0.034597 1.616441 0.034891 6312 213.166667 57.784937 100 402 74 76 0 0.511438 0.802002 0.736125 0.739036 0.340401 0.031723 -0.010342 0.99383 0.594143 0.197588 0.115057 0.886636 0.613755 0.643691 4 16 0 0.000997 0.4 556 673144 59894 128.44055 295.713714 108.94367
41 41 1.552239 0.035246 1.626626 0.036039 6270 218.786667 65.360496 100 402 65 83 2 0.549224 0.789625 0.796688 0.719362 0.349428 0.021146 -0.006815 0.979837 0.607945 0.177222 0.104605 0.860389 0.562566 0.649568 0 0 0 0.000996 0.41 560 668536 62422 145.502664 294.384166 0.0
42 42 1.549737 0.035433 1.631427 0.038659 6108 210.56 59.212102 105 402 88 62 0 0.514015 0.802018 0.739924 0.73759 0.375133 0.02804 -0.008127 0.983531 0.586405 0.186208 0.108792 0.837036 0.573954 0.635127 0 0 0 0.000996 0.42 564 651312 59379 141.837917 286.273724 0.0
43 43 1.547777 0.0356 1.617633 0.036411 5985 210.626667 55.836493 87 402 73 76 1 0.525893 0.797243 0.772012 0.72773 0.363168 0.033037 -0.011672 0.990287 0.621017 0.187124 0.110293 0.832489 0.577083 0.665 0 0 0 0.000996 0.43 568 638098 58389 135.866092 280.916251 0.0
44 44 1.54754 0.03636 1.612705 0.035422 5904 216.9 66.816739 95 402 74 74 2 0.540119 0.792066 0.77279 0.726889 0.365326 0.017509 0.010429 0.973632 0.592328 0.185584 0.108457 0.891784 0.619282 0.64144 0 0 0 0.000996 0.44 572 629561 60598 143.43193 276.385936 0.0
45 45 1.544196 0.03638 1.615983 0.03633 5820 212.686667 60.849008 102 402 73 77 0 0.542294 0.792359 0.777828 0.726391 0.346915 0.050365 -0.005834 0.993878 0.601538 0.17158 0.101021 0.889434 0.631726 0.642335 0 0 0 0.000996 0.45 576 620723 59565 146.070702 272.569344 0.0
46 46 1.540623 0.037376 1.613946 0.037558 5718 210.12 62.404478 81 402 71 79 0 0.542125 0.792925 0.79023 0.722749 0.367956 0.044029 -0.008696 0.979136 0.608203 0.171207 0.100273 0.904836 0.593879 0.645643 0 0 0 0.000996 0.46 580 609914 57553 148.543038 268.404939 0.0
47 47 1.536305 0.03613 1.624177 0.037055 5739 212.413333 65.222203 91 402 89 60 1 0.5393 0.792448 0.766621 0.730846 0.369594 0.048676 0.006767 0.979521 0.601863 0.160611 0.093815 0.826444 0.539768 0.639102 0 0 0 0.000995 0.47 584 611847 60369 153.636133 269.630994 0.0
48 48 1.540088 0.03724 1.617354 0.038775 5670 212.693333 59.431748 102 402 83 67 0 0.564287 0.784145 0.805751 0.716357 0.354726 0.031954 0.021182 0.976983 0.606385 0.15972 0.093204 0.866229 0.645238 0.642776 0 0 0 0.000995 0.48 588 604675 59147 141.940928 265.981365 0.0
49 49 1.540653 0.038126 1.611965 0.039577 5577 209.346667 62.829822 93 402 80 70 0 0.544129 0.792538 0.789947 0.722905 0.350968 0.038368 0.004552 0.978255 0.598763 0.162156 0.094525 0.842521 0.619762 0.630705 0 0 0 0.000995 0.49 592 594760 57444 153.518449 260.685346 0.0
50 50 1.539367 0.03823 1.613155 0.038229 5577 211.893333 62.985728 86 402 75 73 2 0.537998 0.793995 0.778208 0.726861 0.363793 0.043982 -0.001452 0.979405 0.601201 0.163406 0.093523 0.860666 0.582315 0.638133 0 0 0 0.000995 0.5 596 594606 59740 145.505922 261.806925 0.0
Correlations: 5 (columns that move together, r² > 0.25)
Outliers: 20 (values more than 3σ from the mean)
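
The two summary figures above name their thresholds (r² > 0.25 and 3σ) but not how they were computed. The following is a minimal sketch of one plausible reading, assuming a Pearson correlation between pairs of columns and a population standard deviation per column; the function names `r_squared` and `outliers_3sigma` are hypothetical and not taken from the dataset's tooling. The sample values are the first five `loss_policy_train` and `gradient_steps` entries from the table.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; report 0 here (assumption).
        return 0.0
    return (cov * cov) / (var_x * var_y)


def outliers_3sigma(values):
    """Return (index, value) pairs more than 3 standard deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation; the source does not say which variant it uses.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return []
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) > 3 * std]


# First five iterations from the table above.
loss_policy_train = [2.177857, 1.902223, 1.791844, 1.757926, 1.73822]
gradient_steps = [753, 1950, 3108, 4194, 5208]

moves_together = r_squared(loss_policy_train, gradient_steps) > 0.25
print("loss_policy_train and gradient_steps move together:", moves_together)
print("3-sigma outliers in gradient_steps:", outliers_3sigma(gradient_steps))
```

With only five points no value can lie 3σ from the mean, so the outlier check is meaningful only when run over a full column.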
171 training iterations of a reinforcement learning agent for Catan trained through automatic play. The log tracks policy and value loss, gradient steps, game length, and win/loss/draw counts. The table above lists the first 50 iterations.
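
The outcome columns (`game_p1_wins`, `game_p2_wins`, `game_draws`, and the `bench_*` columns) hold counts rather than fractions. A minimal sketch of turning them into rates, using the iteration 1 self-play counts and the iteration 20 benchmark counts from the table; the helper name `outcome_rates` is hypothetical:

```python
def outcome_rates(wins, losses, draws):
    """Convert win/loss/draw counts into fractions of all games played.

    Returns None when no games were played, which is the case for the
    bench_* columns in most iterations.
    """
    total = wins + losses + draws
    if total == 0:
        return None
    return {
        "win": wins / total,
        "loss": losses / total,
        "draw": draws / total,
        "games": total,
    }


# Iteration 1 self-play: game_p1_wins=62, game_p2_wins=76, game_draws=12.
# "win" is player 1's share and "loss" is player 2's share here.
selfplay_iter1 = outcome_rates(62, 76, 12)

# Iteration 20 benchmark: bench_wins=5, bench_losses=15, bench_draws=0.
bench_iter20 = outcome_rates(5, 15, 0)

print(selfplay_iter1)
print(bench_iter20)
```

For iteration 1 the three self-play counts sum to 150 games, and the iteration 20 benchmark comes to a 25% win rate over 20 games.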

This dataset contains 171 records across 38 fields: iteration, loss_policy_train, loss_value_train, loss_policy_val, loss_value_val, gradient_steps, and 32 more.
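
Each data row in the table carries one more value than the 38 field names, and the first two values are always equal. The sketch below assumes the extra leading value is a row index added by the viewer and drops it before pairing values with field names; `parse_row` is a hypothetical helper, and the row shown is iteration 1 copied from the table.

```python
HEADER = (
    "iteration loss_policy_train loss_value_train loss_policy_val loss_value_val "
    "gradient_steps game_length_avg game_length_stddev game_length_min "
    "game_length_max game_p1_wins game_p2_wins game_draws policy_entropy_avg "
    "policy_max_prob_avg policy_entropy_high_branch_avg "
    "policy_max_prob_high_branch_avg policy_agreement_avg value_z_avg value_q_avg "
    "value_z_stddev value_q_stddev value_correction_avg value_q_spread_avg "
    "value_error_early_avg value_error_late_avg value_network_stddev bench_wins "
    "bench_losses bench_draws lr q_weight mcts_sims replay_samples samples_iter "
    "time_selfplay_secs time_train_secs time_bench_secs"
)
FIELDS = HEADER.split()

ROW_1 = (
    "1 1 2.177857 0.124353 2.182894 0.12368 753 391.82 28.621686 261 402 62 76 12 "
    "0.61276 0.750411 0.642879 0.764643 0.312216 0.056454 -0.03952 0.59048 "
    "0.081665 0.105695 0.062848 0.54799 0.562514 0.018912 0 0 0 0.001 0.01 400 "
    "80233 80233 132.927991 40.621692 0.0"
)


def parse_row(line, fields):
    """Map one whitespace-delimited row onto the field names as floats."""
    tokens = line.split()
    if len(tokens) == len(fields) + 1:
        # Assumption: the extra leading token is a display row index.
        tokens = tokens[1:]
    if len(tokens) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(tokens)}")
    return {name: float(token) for name, token in zip(fields, tokens)}


record = parse_row(ROW_1, FIELDS)
print(len(FIELDS), record["iteration"], record["mcts_sims"])
```

Parsing every value as a float keeps the sketch short; integer-valued columns such as `gradient_steps` could be cast back to `int` where that matters.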