====Differences between rating lists====
There are a number of factors that vary among the chess engine rating lists:
* Number of games. [[Law of large numbers|More games when testing each engine]] result in higher [[statistical significance]].
* Formulae used to calculate the Elo rating of each engine.
* [[Time control]]:
** Longer time controls are better suited for determining tournament play strength, but they either make testing more time-consuming or, if fewer games are played in the same time, make the results less statistically significant.
** Increment time controls are better suited for determining tournament play strength, since tournaments usually use increment time controls, but many rating lists use cyclic/repeating time controls instead.
** Consistent time controls throughout the rating list vs. different time controls for each test. The latter results in lower [[statistical significance]] than the former because differing time controls are a potential [[confounder]]. This is particularly problematic for CCRL because its CCRL 40/15 list combines results from both cyclic/repeating time controls (40/15) and increment time controls (15'+10").<ref>{{Cite web|url=https://talkchess.com/forum3/viewtopic.php?f=6&t=82754|title=CCRL 40/15, 2m1s and FRC 40/2 lists updated (21-10-2023)|website=talkchess.com|access-date=22 October 2023|archive-date=15 June 2024|archive-url=https://web.archive.org/web/20240615081446/https://talkchess.com/forum3/viewtopic.php?f=6&t=82754|url-status=live}}</ref>
* Opponents used in testing engines.
** Some rating lists only test an engine against the most recent version of each opponent engine, while other rating lists test an engine against the version(s) of each opponent engine closest in Elo rating to the engine being tested.
** Most rating lists do not test every engine on the rating list against every other engine on the list in a [[round-robin tournament]] format. This causes distortions in the rating lists, especially for CCRL and CEGT.<ref>{{Cite web|url=https://talkchess.com/viewtopic.php?t=83829|title=Experimental testruns of Stockfish / Torch|website=talkchess.com|access-date=29 May 2024|archive-date=15 June 2024|archive-url=https://web.archive.org/web/20240615081446/https://talkchess.com/viewtopic.php?t=83829|url-status=live}}</ref>
* Hardware used:
** Faster hardware with more memory leads to stronger play.
** 64-bit (vs. 32-bit) hardware and operating systems favor [[bitboard]]-based programs.
** Hardware supporting modern instruction sets such as [[AVX2]] or [[AVX512]] favors engines that use vectors and vector intrinsics in their code, which is common in [[Artificial neural network|neural networks]].
** [[Graphics processing units]] favor programs with [[deep neural networks]].
** Multiprocessor vs. single processor hardware.
** Consistent hardware throughout the rating list vs. different hardware for every test. The latter results in lower [[statistical significance]] than the former because differing hardware is a potential [[confounder]]. This is particularly problematic for [[CEGT]] because multiple testers, each with their own hardware, are involved in testing each engine in CEGT.<ref>{{Cite web|url=http://www.cegt.net/testers/testers.html|title=CEGT Testers|website=Cegt.net|access-date=26 June 2022|archive-date=24 May 2022|archive-url=https://web.archive.org/web/20220524090535/http://www.cegt.net/testers/testers.html|url-status=live}}</ref> The same issue arises in [[CCRL]].<ref name="CCRL issues">{{Cite web|url=https://talkchess.com/viewtopic.php?p=962945#p962945|title=ShashChess|author=Ray Banks (Modern Times)|website=talkchess.com|access-date=30 April 2024|archive-date=30 April 2024|archive-url=https://web.archive.org/web/20240430232313/https://talkchess.com/viewtopic.php?p=962945#p962945|url-status=live}}</ref>
* Ponder settings (speculative analysis while the opponent is thinking), also known as Permanent Brain.
* Transposition table sizes.
* GUI settings.
* Opening book settings.
These differences affect the results and make direct comparisons between rating lists difficult.
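The effect of the number of games and of the rating formula can be illustrated with a minimal sketch. It assumes the common logistic Elo model with a scale of 400 and a normal-approximation confidence interval; individual rating lists may use different formulae, and the function names here are hypothetical, not taken from any rating list's software.

```python
import math
from typing import Tuple


def expected_score(elo_diff: float) -> float:
    """Expected score for a side rated elo_diff points above its opponent
    (logistic Elo model, scale 400; an assumption of this sketch)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score: float) -> float:
    """Inverse of expected_score; requires 0 < score < 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins: int, draws: int, losses: int,
                 z: float = 1.96) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) for the Elo difference implied by a match.

    Uses a normal approximation on the mean score, so it is only meaningful
    when the interval on the score stays strictly between 0 and 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Sample variance of a single game result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)  # shrinks as 1 / sqrt(n)
    return (elo_diff_from_score(score),
            elo_diff_from_score(score - half_width),
            elo_diff_from_score(score + half_width))


if __name__ == "__main__":
    # Same score fraction (60%), different numbers of games.
    for games in ((40, 40, 20), (4000, 4000, 2000)):
        est, lo, hi = elo_interval(*games)
        print(f"{sum(games):>6} games: {est:+.1f} Elo, interval [{lo:+.1f}, {hi:+.1f}]")
```

Both matches in the usage example have the same score fraction and therefore the same point estimate, but the interval from the longer match is much narrower. This is why lists that play more games per engine can separate engines of similar strength more reliably.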