Player Win Averages for 1957-2006
Inspired by the book "Player Win Averages" (1970) by Harlan & Eldon Mills
Email:
Jeff Sagarin
All Contents Copyright © 2007 Jeff Sagarin™. All Rights Reserved.
Directory of 1957-2006 Seasons below
My computer program implementing the Mills approach needs
digitized play-by-play data of the games, and so I would like to
give a huge thanks to David W. Smith of Retrosheet
for his tremendous interest and generosity, both of which were
needed and greatly appreciated by me. And so I would like to officially say:
==================================================================================
The play-by-play information used here was obtained free of
charge from and is copyrighted by Retrosheet. Interested
parties may contact Retrosheet at "www.retrosheet.org".
==================================================================================
EXPLANATION
When I was a 21-year-old senior (math major) at MIT in the spring of 1970, I saw
a small ad in the back of the Sporting News and soon bought a little paperback book
entitled "Player Win Averages" by brothers Harlan Mills and Eldon Mills.
I could see instantly that they had gotten to the very heart of the
matter as to rating batters and pitchers, and so it's always been in me since
then to implement the concepts. I wrote my first version of a baseball simulation
on the computer in the spring of 1980, and of course I have fiddled with it
over the years to make it better. Two other books that have inspired me
in this vein are Earnshaw Cook's "Percentage Baseball" (1964)
and "Percentage Baseball and the Computer" (1971). And finally, there is
the empirical research into probabilities of winning from different situations by
the Royal Canadian Air Force officers George Lindsey and his father Charles Lindsey
in the 1950s and early 1960s.
The Mills brothers' concept is brilliant and elegantly simple: at each state in a baseball game,
each team has a certain probability of winning. Then a play occurs, and there is a new
state and an associated probability of winning for each team. By comparing the BEFORE and AFTER
probabilities, assigning the credit for the CHANGE to the batters and pitchers, and doing
this for a whole season, one can determine by how much each batter and pitcher has helped
his team win (and lose!) baseball games.
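As a minimal sketch of that BEFORE and AFTER bookkeeping (not my actual program): the `Play` record, its field names, the player labels, and the probabilities below are all hypothetical, made up only to show the idea. It also assumes the probabilities are stored from the home team's side, so the sign is flipped when the away team is at bat.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Play:
    """One play, described by the home team's win probability before and after.

    All field names are hypothetical, chosen only for illustration.
    """
    batter: str
    pitcher: str
    home_batting: bool      # True when the home team is at bat
    home_win_before: float  # home team's win probability BEFORE the play
    home_win_after: float   # home team's win probability AFTER the play


def credit_plays(plays):
    """Sum each player's net change in his own team's win probability."""
    net = defaultdict(float)
    for play in plays:
        change = play.home_win_after - play.home_win_before
        # Express the change from the batting team's point of view.
        if not play.home_batting:
            change = -change
        net[play.batter] += change   # the batter gains what his team gained
        net[play.pitcher] -= change  # the pitcher is charged the same amount
    return dict(net)


# Two made-up plays: one with the home team batting, one with the away team.
plays = [
    Play("Batter A", "Pitcher B", True, 0.50, 0.60),
    Play("Batter C", "Pitcher D", False, 0.60, 0.55),
]
net = credit_plays(plays)
for player, change in net.items():
    print(f"{player}: {change:+.3f}")
```

Every play is zero-sum between batter and pitcher, so the net changes over all players add up to zero.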
For example, in 1951, in the third and final game of the National League Playoffs between the
Brooklyn Dodgers and New York Giants, it was the bottom of the 9th and the Giants were trailing 4-2,
with men on 2nd and 3rd, and one out.
Thus as Bobby Thomson stepped into the batter's box, the chances were:
(based on 1951 Major League Composite Play - to determine these probabilities, one must use the
technique of computer simulation to have the computer play literally hundreds of millions of innings
for all possible situations, using the major league statistics for the given year.)
And then to rate the players, one needs to connect those situation probabilities
with a digitized play-by-play of all the games of a given season.
_________________________________________________________________________________________
9 bottom HOME team MARGIN= -2 1 out(s) 2nd and 3rd balls= 0 strikes= 0
HOME WIN CHANCE= 0.27759 AWAY WIN CHANCE= 0.72241 HOME FIELD
GAME STATUS= -445 = 2000*HOME WIN CHANCE - 1000 HOME FIELD
_________________________________________________________________________________________
and after Bobby Thomson hit his fabled home run ... THE GIANTS WIN THE PENNANT! THE GIANTS WIN THE PENNANT!
the situation was as follows:
_________________________________________________________________________________________
9 bottom HOME team MARGIN= 1 1 out(s) none on balls= 0 strikes= 0
HOME WIN CHANCE= 1.00000 AWAY WIN CHANCE= 0.00000 HOME FIELD
GAME STATUS= 1000 = 2000*HOME WIN CHANCE - 1000 HOME FIELD
_________________________________________________________________________________________
BEFORE: GAME STATUS= -445
AFTER: GAME STATUS= +1000
CHANGE: 1000 - (-445) = +1445 for Bobby Thomson, -1445 for Ralph Branca
Thus the Giants' chances changed from .27759 to 1.00000, a jump of .72241 = 72.241%.
The Mills brothers liked to use a -1000 to +1000 scale and so that is shown
as a GAME STATUS change from -445 to +1000 or a gain of 1445 points which is precisely
2000 * .72241 rounded off to the nearest integer. So in this case, Bobby Thomson
would be credited with 1445 WIN points and Ralph Branca (the pitcher) would be
credited with 1445 LOSS points. Over the course of a season each batter and pitcher
will accumulate WIN and LOSS points from every play he was involved in. This
also applies to a runner STEALING a base or being CAUGHT STEALING for example.
You simply compare the BEFORE and AFTER game statuses to determine the WIN and LOSS
points to give out.
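The arithmetic above can be checked with a short sketch. The function names and the `ledger` layout are hypothetical, and the rule that a negative change gives the WIN points to the pitcher and the LOSS points to the batter is my assumption of the natural mirror image of the Thomson case.

```python
def game_status(home_win_chance):
    """GAME STATUS on the -1000 to +1000 scale, rounded to the nearest integer."""
    return round(2000 * home_win_chance - 1000)


def award_points(before, after, batter, pitcher, ledger):
    """Give WIN points to the player the change favored, LOSS points to the other.

    `before` and `after` are game statuses from the batting team's side.
    `ledger` maps a player's name to a [win_points, loss_points] pair.
    """
    change = after - before
    # Assumption: a drop in the batting team's status is a WIN for the pitcher.
    gainer, loser = (batter, pitcher) if change >= 0 else (pitcher, batter)
    ledger.setdefault(gainer, [0, 0])[0] += abs(change)
    ledger.setdefault(loser, [0, 0])[1] += abs(change)
    return change


# The Thomson home run, using the two win chances quoted above.
ledger = {}
before = game_status(0.27759)
after = game_status(1.00000)
change = award_points(before, after, "Thomson", "Branca", ledger)
print(before, after, change)  # -445 1000 1445
print(ledger)
```

Calling `award_points` once per play over a whole season would leave each player's running WIN and LOSS totals in `ledger`.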
And the Mills PWA (Player Win Average) is simply the final WIN Points / (WIN Points + LOSS Points).
The scale factor of 2000 used is totally arbitrary and has no effect on the PWA
because it appears equally in the numerator and denominator and cancels out.
WIN Points = probability change * SCALE
LOSS Points= probability change * SCALE
and so you can see that SCALE cancels out.
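A quick numerical check of that cancellation, as a sketch only: the season totals of probability change used here (3.2 won, 2.8 lost) are made-up numbers, and `player_win_average` is a hypothetical helper name.

```python
import math


def player_win_average(win_points, loss_points):
    """PWA = WIN Points / (WIN Points + LOSS Points)."""
    total = win_points + loss_points
    if total == 0:
        raise ValueError("player has no WIN or LOSS points")
    return win_points / total


# Hypothetical season totals of probability change, before any scaling.
win_change, loss_change = 3.2, 2.8

# The same player under three different SCALE factors.
pwa_by_scale = {
    scale: player_win_average(win_change * scale, loss_change * scale)
    for scale in (1, 1000, 2000)
}
for scale, pwa in pwa_by_scale.items():
    print(f"SCALE={scale:5d}  PWA={pwa:.3f}")
```

Each row prints the same PWA, and a player with equal WIN and LOSS points comes out at exactly .500.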
By the way, in their book, Harlan & Eldon Mills calculate a change of 1472 points.
If I were to have used the 1969 Major League Composite data - which is the database
they used - I would have gotten 1474 points. The difference between my 1445 and
the 1472 and 1474 figures is due to the fact that scoring was at a higher rate
in 1951 than in 1969:
4.57 runs per 9 innings per team in 1951, 4.09 runs per 9 innings per team in 1969.
Thus it was slightly less impressive for Thomson to have hit that home run in 1951
than it would have been in 1969, given the situation he was batting in.
And to quote from pages 25-26 of the Mills' book regarding the PWA:
"Here's something to keep in mind, and it also explains
why we think this measurement system is equitable for the
players.
The players are not measured against any arbitrary
standard. They are measured against their own teammates
and opponents on how they performed this year. Over the
year, using our new scorecard, we tabulate every play of
every game. We know what actually happened - how many
times each situation moved to each next situation. This
gives us an average of what will happen on each next play,
as actually performed by the players.
So when we score each player against that average, we
are really scoring him against his fellow players and opponents.
The player who conforms to the average will have
exactly the same number of Win and Loss Points (Net Points=0, js note),
for a .500 Player Win Average. Those who are better than
average will be above .500, and those who are less than
average will be below .500, no matter what their batting
average or earned run average may be.
To illustrate, if it were a common, every-day occurrence
for a player to hit a game-winning home run in the ninth,
then those who did not would be below average. Since
this is not the case, those who do not are not necessarily
below average. Also, in a year when the hitters are big, and
ten runs a game are commonplace, a player had better be
up there getting his share, or he'll be below average. On
the other hand, in a year like 1968, an average hitter
needn't have done so much, since low scoring games were
the rule.
In other words, we do not measure players from one
era against players from another. We measure them against
their own teammates and opponents. But the statistic itself -
Player Win Average - can be used to compare players
of any era. That's because, in any era, whether the ball be
dead or rabbit-like, a .500 ball player will be average, and
a .570 player will be much better than average."
Directory of 1957-2006 Seasons
For the seasons of 1957-1973, there are some games missing. But for 1974-2006 the data should be complete.
Just scroll to the end of a given file and you'll see the total number of games in the database.
You'll find my nomenclature in the files below to be quite similar to the Mills book. That is my way
of honoring Harlan Mills and Eldon Mills for their evolutionary and revolutionary book. They were
way ahead of their time and it's about time they got their due. They were my inspiration to pursue
this topic. - Jeff Sagarin
1957 American League
1957 National League
1958 American League
1958 National League
1959 American League
1959 National League
1960 American League
1960 National League
1961 American League
1961 National League
1962 American League
1962 National League
1963 American League
1963 National League
1964 American League
1964 National League
1965 American League
1965 National League
1966 American League
1966 National League
1967 American League
1967 National League
1968 American League
1968 National League
1969 American League
1969 National League
1970 American League
1970 National League
1971 American League
1971 National League
1972 American League
1972 National League
1973 American League
1973 National League
1974 American League
1974 National League
1975 American League
1975 National League
1976 American League
1976 National League
1977 American League
1977 National League
1978 American League
1978 National League
1979 American League
1979 National League
1980 American League
1980 National League
1981 American League
1981 National League
1982 American League
1982 National League
1983 American League
1983 National League
1984 American League
1984 National League
1985 American League
1985 National League
1986 American League
1986 National League
1987 American League
1987 National League
1988 American League
1988 National League
1989 American League
1989 National League
1990 American League
1990 National League
1991 American League
1991 National League
1992 American League
1992 National League
1993 American League
1993 National League
1994 American League
1994 National League
1995 American League
1995 National League
1996 American League
1996 National League
1997 Major Leagues
1998 Major Leagues
1999 Major Leagues
2000 Major Leagues
2001 Major Leagues
2002 Major Leagues
2003 Major Leagues
2004 Major Leagues
2005 Major Leagues
2006 Major Leagues