Pythagorean expectation
==Empirical origin==
Empirically, this formula correlates fairly well with how baseball teams actually perform. However, statisticians have since found that it carries a fairly routine error, generally about three games off. For example, the [[2002 New York Yankees season|2002 New York Yankees]] scored 897 runs and allowed 697 runs. According to James' original formula, the Yankees should have finished with a win percentage of .624.

:<math>\text{Win} = \frac{897^2}{897^2 + 697^2} = 0.624</math>

Based on a 162-game season, the 2002 Yankees should have finished 101–61; they actually finished 103–58.<ref>{{cite web|url=https://www.baseball-reference.com/teams/NYY/2002.shtml|title=2002 New York Yankees|work=Baseball-Reference.com|access-date=7 May 2016}}</ref>

In efforts to fix this routine error, statisticians have performed numerous searches to find the ideal exponent. If a single fixed exponent is used, 1.83 is the most accurate, and it is the one used by baseball-reference.com.<ref>{{cite web|url=https://www.sports-reference.com/blog/baseball-reference-faqs/|title=Frequently Asked Questions|work=Baseball-Reference.com|access-date=7 May 2016}}</ref> The updated formula therefore reads as follows:

:<math>\text{Win} = \frac{\text{runs scored}^{1.83}}{\text{runs scored}^{1.83} + \text{runs allowed}^{1.83}} = \frac{1}{1+(\text{runs allowed}/\text{runs scored})^{1.83}}</math>

Other formulas let the exponent vary with the number of runs scored per game. The most widely known of these is the Pythagenport formula<ref name="baseballprospectus">{{cite web|url=http://www.baseballprospectus.com/article.php?articleid=342|title=Baseball Prospectus – Revisiting the Pythagorean Theorem|work=Baseball Prospectus|date=30 June 1999 |access-date=7 May 2016}}</ref> developed by [[Clay Davenport]] of [[Baseball Prospectus]]:

:<math>\mathrm{Exponent} = 1.50 \log_{10}\left(\frac{\text{runs scored} + \text{runs allowed}}{\text{games}}\right) +0.45</math>

Davenport concluded that the exponent should be calculated for each team from its runs scored, runs allowed, and games. By not reducing the exponent to a single number for all teams in a season, he was able to report a root-mean-square error of 3.991, as opposed to 4.126 for an exponent of 2.<ref name="baseballprospectus" />

Less well known but equally (if not more) effective is the {{visible anchor|Pythagenpat}} formula, developed by David Smyth.<ref>{{cite web|url=http://gosu02.tripod.com/id69.html|title=W% Estimators|access-date=7 May 2016}}</ref>

:<math>\text{Exponent} = \left(\frac{\text{runs scored} + \text{runs allowed}}{\text{games}}\right)^{0.287} </math>

Davenport expressed his support for this formula, saying:
<blockquote>
After further review, I (Clay) have come to the conclusion that the so-called Smyth/Patriot method, aka Pythagenpat, is a better fit. In that, ''X'' = ((''rs'' + ''ra'')/''g'')<sup>0.287</sup>, although there is some wiggle room for disagreement in the exponent. Anyway, that equation is simpler, more elegant, and gets the better answer over a wider range of runs scored than Pythagenport, including the mandatory value of 1 at 1 rpg.<ref>{{cite web|url=http://baseballprospectus.com/glossary/index.php?mode=viewstat&stat=136|title=Baseball Prospectus – Glossary|access-date=7 May 2016}}</ref>
</blockquote>

These variable-exponent formulas are only necessary in extreme situations, in which the average number of runs scored per game is either very high or very low. For most situations, simply squaring each variable yields accurate results.

There are some systematic statistical deviations between actual and expected winning percentage, with contributing factors that include [[bullpen]] quality and luck. In addition, the formula tends to [[Regression toward the mean|regress toward the mean]]: teams that win a lot of games tend to be underrepresented by the formula (meaning they "should" have won fewer games), and teams that lose a lot of games tend to be overrepresented (they "should" have won more).
A notable example is the [[2016 Texas Rangers season|2016 Texas Rangers]], who beat their predicted record by 13 games, finishing 95–67 while having an expected win–loss record of 82–80.
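
The formulas in this section can be compared side by side with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, the logarithm in the Pythagenport exponent is assumed to be base 10, and the 2002 Yankees figures (897 runs scored, 697 allowed) are evaluated over an assumed 162-game schedule to match the worked example above.

```python
import math


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagenport_exponent(runs_scored, runs_allowed, games):
    """Davenport's exponent: 1.50 * log(runs per game) + 0.45.

    Assumption: the logarithm is taken in base 10.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    return 1.50 * math.log10(runs_per_game) + 0.45


def pythagenpat_exponent(runs_scored, runs_allowed, games):
    """Smyth's exponent: (runs per game) ** 0.287."""
    runs_per_game = (runs_scored + runs_allowed) / games
    return runs_per_game ** 0.287


def expected_wins(win_pct, games=162):
    """Convert a winning percentage to a whole number of wins."""
    return round(win_pct * games)


if __name__ == "__main__":
    # 2002 New York Yankees, as quoted in the text.
    RS, RA, GAMES = 897, 697, 162  # GAMES = 162 is an assumption

    exponents = {
        "fixed 2": 2.0,
        "fixed 1.83": 1.83,
        "Pythagenport": pythagenport_exponent(RS, RA, GAMES),
        "Pythagenpat": pythagenpat_exponent(RS, RA, GAMES),
    }
    for name, x in exponents.items():
        pct = pythagorean_win_pct(RS, RA, x)
        wins = expected_wins(pct, GAMES)
        print(f"{name:13s} exponent={x:.3f} win%={pct:.3f} wins={wins}")
```

Passing equal runs scored and allowed returns .500 for any exponent, and a team averaging one total run per game gets a Pythagenpat exponent of exactly 1, the property Davenport singles out in the quotation above.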