Player Win Averages for 1957-2006

Inspired by the book "Player Win Averages" (1970) by Harlan & Eldon Mills


Email: Jeff Sagarin

All Contents Copyright © 2007 Jeff Sagarin™. All Rights Reserved.

Directory of 1957-2006 Seasons below
     My computer program implementing the Mills approach needs
     digitized play-by-play data of the games, and so I would like to
     give a huge thanks to David W. Smith of Retrosheet
     for his tremendous interest and generosity, both of which were
     needed and greatly appreciated by me. And so I would like to officially say:
==================================================================================
     The play-by-play information used here was obtained free of
     charge from and is copyrighted by Retrosheet.  Interested
     parties may contact Retrosheet at "www.retrosheet.org".
==================================================================================
EXPLANATION
When I was a 21-year-old senior (math major) at MIT in the spring of 1970, I saw
a small ad in the back of the Sporting News and soon bought a little paperback book
entitled "Player Win Averages" by brothers Harlan Mills and Eldon Mills.
I could instantly see that they had gotten to the very heart of the
matter as to rating batters and pitchers.  And so it's always been in me since
then to implement the concepts.  I wrote my first version of a baseball simulation
on the computer in the spring of 1980, and of course have fiddled with it
over the years to make it better.  The other works which have inspired me
in this vein were Earnshaw Cook's two books, "Percentage Baseball" (1964)
and "Percentage Baseball and the Computer" (1971), and finally
the empirical research into probabilities of winning from different situations by
the Royal Canadian Air Force officers George Lindsey and his father Charles Lindsey
in the 1950s and early 1960s.

The Mills' concept is brilliant and elegantly simple: at each state in a baseball game,
each team has a certain probability of winning.  Then a play occurs and there is a new
state and an associated probability of winning for each team.  By comparing the BEFORE and AFTER
probabilities, assigning the credit for the CHANGE to the batters and pitchers, and doing
this for a whole season, one can determine by how much each batter and pitcher has helped
his team win (and lose!) baseball games.
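
As a rough illustration of that bookkeeping, here is a minimal Python sketch. It is not my actual program: the player names, the probabilities and the function name net_credit are all made up for the example, and it assumes each play is already tagged with the batting team's chance of winning before and after the play.

```python
from collections import defaultdict


def net_credit(plays):
    """Sum the win-probability change of every play for each player.

    Each play is (batter, pitcher, before, after), where before/after are
    the BATTING team's win probabilities around the play (hypothetical data).
    """
    totals = defaultdict(float)
    for batter, pitcher, before, after in plays:
        change = after - before
        totals[batter] += change    # the batter gets the change as is
        totals[pitcher] -= change   # the pitcher gets the opposite
    return dict(totals)


# Hypothetical plays, not taken from any real game.
plays = [
    ("Batter A", "Pitcher X", 0.50, 0.55),
    ("Batter B", "Pitcher X", 0.55, 0.48),
    ("Batter A", "Pitcher Y", 0.48, 0.60),
]
credit = net_credit(plays)
for player, total in sorted(credit.items()):
    print(f"{player}: {total:+.3f}")
```

Because every play hands the same amount to the batter and, with opposite sign, to the pitcher, the totals over all players add up to zero.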

For example, in 1951, in the third and final game of the National League Playoffs between the
Brooklyn Dodgers and New York Giants, it was the bottom of the 9th and the Giants were trailing 4-2,
with men on 2nd and 3rd, and one out.
To determine the probabilities for a situation like this, one must use the
technique of computer simulation to have the computer play literally hundreds of millions of innings
for all possible situations, using the major league statistics for the given year.
And then to rate the players, one needs to connect those situation probabilities
with a digitized play-by-play of all the games of a given season.

Thus as Bobby Thomson stepped into the batter's box, the chances
(based on 1951 Major League Composite Play) were:

 _________________________________________________________________________________________
   9 bottom   HOME team MARGIN= -2     1 out(s)  2nd and 3rd    balls= 0   strikes= 0
 HOME WIN CHANCE= 0.27759   AWAY WIN CHANCE= 0.72241           HOME FIELD
 GAME STATUS=     -445  =  2000*HOME WIN CHANCE - 1000         HOME FIELD
 _________________________________________________________________________________________

and after Bobby Thomson hit his fabled home run ... THE GIANTS WIN THE PENNANT! THE GIANTS WIN THE PENNANT!
the situation was as follows:
_________________________________________________________________________________________
   9 bottom   HOME team MARGIN=  1     1 out(s)  none on        balls= 0   strikes= 0
 HOME WIN CHANCE= 1.00000   AWAY WIN CHANCE= 0.00000           HOME FIELD
 GAME STATUS=     1000  =  2000*HOME WIN CHANCE - 1000         HOME FIELD
 _________________________________________________________________________________________

BEFORE: GAME STATUS=  -445
AFTER:  GAME STATUS= +1000
CHANGE:  1000 - (-445)  =  +1445 for Bobby Thomson,  -1445 for Ralph Branca
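
The GAME STATUS arithmetic shown in the boxes can be checked with a few lines of Python. This is only a sketch of the formula as printed above (2000*HOME WIN CHANCE - 1000, rounded to the nearest integer); the function name game_status is my own label for the example.

```python
def game_status(home_win_chance):
    """Convert a home win probability to the -1000..+1000 GAME STATUS scale."""
    return round(2000 * home_win_chance - 1000)


before = game_status(0.27759)   # Thomson steps in
after = game_status(1.00000)    # Thomson's home run ends the game
change = after - before

print("BEFORE:", before)
print("AFTER: ", after)
print("CHANGE:", change)
```

The change is credited to the batter, and the same amount with the opposite sign goes to the pitcher.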

Thus the Giants' chances changed from .27759 to 1.00000, a jump of .72241 = 72.241%.
The Mills brothers liked to use a -1000 to +1000 scale, and so that is shown
as a GAME STATUS change from -445 to +1000, a gain of 1445 points, which is precisely
2000 * .72241 rounded off to the nearest integer.  So in this case, Bobby Thomson
would be credited with 1445 WIN points and Ralph Branca (the pitcher) would be
credited with 1445 LOSS points.  Over the course of a season each batter and pitcher
will accumulate WIN and LOSS points from every play he was involved in.  This
also applies to a runner STEALING a base or being CAUGHT STEALING, for example.
You simply compare the BEFORE and AFTER game statuses to determine the WIN and LOSS
points to give out.
And the Mills PWA (Player Win Average) is simply the final
     WIN Points / (WIN Points + LOSS Points)
The scale factor of 2000 used is totally arbitrary and has no effect on the PWA
because it appears equally in the numerator and denominator and cancels out.

WIN Points =  probability change * SCALE
LOSS Points=  probability change * SCALE
and so you can see that  SCALE cancels out.
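
To see the cancellation with numbers, here is a small Python sketch. The list of probability changes is invented for the example, and the rule used (a favorable change adds WIN points, an unfavorable one adds LOSS points) is my reading of the description above, written out as an assumption rather than as the exact bookkeeping of the book.

```python
def player_win_average(prob_changes, scale=2000):
    """PWA = WIN Points / (WIN Points + LOSS Points).

    prob_changes: win-probability changes credited to one player,
    positive when the play helped his team, negative when it hurt.
    Assumes the player has at least one nonzero change.
    """
    win_points = sum(scale * c for c in prob_changes if c > 0)
    loss_points = sum(scale * -c for c in prob_changes if c < 0)
    return win_points / (win_points + loss_points)


# Hypothetical changes for one player over four plays.
changes = [0.05, -0.03, 0.12, -0.04]

pwa_2000 = player_win_average(changes, scale=2000)
pwa_1 = player_win_average(changes, scale=1)
print(f"PWA with SCALE=2000: {pwa_2000:.3f}")
print(f"PWA with SCALE=1:    {pwa_1:.3f}")
```

Both calls give the same PWA, since SCALE multiplies the numerator and the denominator alike.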

By the way, in their book, Harlan & Eldon Mills calculate a change of 1472 points.
If I were to have used the 1969 Major League Composite data - which is the database
they used - I would have gotten 1474 points.  The difference between my 1445 and
the 1472 and 1474 figures is due to the fact that scoring was at a higher rate
in 1951 than in 1969:
4.57 runs per 9 innings per team in 1951, 4.09 runs per 9 innings per team in 1969.
With more scoring, a two-run deficit was easier to overcome, so the Giants' chances
before the home run were already somewhat higher in the 1951 setting.
Thus it was slightly less impressive for Thomson to have hit that home run in 1951
than it would have been in 1969, given the situation he was batting in.
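
The simulation idea mentioned earlier (have the computer play out a situation many times and count how often the home team wins) can be sketched in Python as below. This is a toy version and not my actual program. The event probabilities are hypothetical and are not real 1951 statistics, runners never advance on outs, every hit moves every runner exactly as many bases as the batter, and a tie after nine innings is simply counted as a 50-50 chance. Because of all that, it will not reproduce the .27759 figure; it only shows the technique. All function and variable names are made up for the example.

```python
import random

# Hypothetical per-plate-appearance probabilities, NOT real 1951 statistics.
EVENTS = {"out": 0.68, "walk": 0.09, "single": 0.16,
          "double": 0.04, "triple": 0.005, "homer": 0.025}
BASES_FOR_HIT = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def play_half_inning(rng, outs, runners, events):
    """Play out the rest of a half-inning; return the runs scored in it."""
    names, weights = list(events), list(events.values())
    runners = set(runners)          # occupied bases, e.g. {2, 3}
    runs = 0
    while outs < 3:
        event = rng.choices(names, weights=weights)[0]
        if event == "out":
            outs += 1               # simplification: nobody advances on an out
        elif event == "walk":
            empty = [b for b in (1, 2, 3) if b not in runners]
            if empty:
                runners.add(empty[0])   # forced runners fill the first gap
            else:
                runs += 1               # bases loaded: a run is forced home
        else:
            n = BASES_FOR_HIT[event]
            moved = {b + n for b in runners} | {n}   # batter ends on base n
            runs += sum(1 for b in moved if b >= 4)
            runners = {b for b in moved if b <= 3}
    return runs


def estimate_home_win_chance(margin, outs, runners, trials=20000, seed=1951,
                             events=EVENTS, tie_chance=0.5):
    """Estimate the home team's win chance when batting in the bottom of the 9th."""
    rng = random.Random(seed)       # fixed seed so a run is repeatable
    total = 0.0
    for _ in range(trials):
        final_margin = margin + play_half_inning(rng, outs, runners, events)
        if final_margin > 0:
            total += 1.0            # home team wins
        elif final_margin == 0:
            total += tie_chance     # assumption: extra innings are a coin flip
    return total / trials


# Bottom of the 9th, home team down by 2, one out, runners on 2nd and 3rd.
chance = estimate_home_win_chance(margin=-2, outs=1, runners={2, 3})
print(f"estimated HOME WIN CHANCE = {chance:.5f}")
```

Raising the hit probabilities in EVENTS (a higher-scoring year) would be expected to raise the trailing team's estimate, which is the direction of the 1951 versus 1969 difference described above.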

And to quote from pages 25-26 of the Mills' book regarding the PWA: "Here's something to keep in mind, and it also explains why we think this measurement system is equitable for the players. The players are not measured against any arbitrary standard. They are measured against their own teammates and opponents on how they performed this year. Over the year, using our new scorecard, we tabulate every play of every game. We know what actually happened - how many times each situation moved to each next situation. This gives us an average of what will happen on each next play, as actually performed by the players. So when we score each player against that average, we are really scoring him against his fellow players and opponents. The player who conforms to the average will have exactly the same number of Win and Loss Points (Net Points=0, js note), for a .500 Player Win Average. Those who are better than average will be above .500, and those who are less than average will be below .500, no matter what their batting average or earned run average may be. To illustrate, if it were a common, every-day occurrence for a player to hit a game-winning home run in the ninth, then those who did not would be below average. Since this is not the case, those who do not are not necessarily below average. Also, in a year when the hitters are big, and ten runs a game are commonplace, a player had better be up there getting his share, or he'll be below average. On the other hand, in a year like 1968, an average hitter needn't have done so much, since low scoring games were the rule. In other words, we do not measure players from one era against players from another. We measure them against their own teammates and opponents. But the statistic itself - Player Win Average - can be used to compare players of any era. That's because, in any era, whether the ball be dead or rabbit-like, a .500 ball player will be average, and a .570 player will be much better than average."

Directory of 1957-2006 Seasons
For the seasons of 1957-1973, there are some games missing.  But for 1974-2006 the data should be complete.
Just scroll to the end of a given file and you'll see the total number of games in the database.

You'll find my nomenclature in the files below to be quite similar to the Mills book.  That is my way
of honoring Harlan Mills and Eldon Mills for their evolutionary and revolutionary book.  They were
way ahead of their time and it's about time they got their due.  They were my inspiration to pursue
this topic. - Jeff Sagarin
  • 1957 American League
  • 1957 National League
  • 1958 American League
  • 1958 National League
  • 1959 American League
  • 1959 National League
  • 1960 American League
  • 1960 National League
  • 1961 American League
  • 1961 National League
  • 1962 American League
  • 1962 National League
  • 1963 American League
  • 1963 National League
  • 1964 American League
  • 1964 National League
  • 1965 American League
  • 1965 National League
  • 1966 American League
  • 1966 National League
  • 1967 American League
  • 1967 National League
  • 1968 American League
  • 1968 National League
  • 1969 American League
  • 1969 National League
  • 1970 American League
  • 1970 National League
  • 1971 American League
  • 1971 National League
  • 1972 American League
  • 1972 National League
  • 1973 American League
  • 1973 National League
  • 1974 American League
  • 1974 National League
  • 1975 American League
  • 1975 National League
  • 1976 American League
  • 1976 National League
  • 1977 American League
  • 1977 National League
  • 1978 American League
  • 1978 National League
  • 1979 American League
  • 1979 National League
  • 1980 American League
  • 1980 National League
  • 1981 American League
  • 1981 National League
  • 1982 American League
  • 1982 National League
  • 1983 American League
  • 1983 National League
  • 1984 American League
  • 1984 National League
  • 1985 American League
  • 1985 National League
  • 1986 American League
  • 1986 National League
  • 1987 American League
  • 1987 National League
  • 1988 American League
  • 1988 National League
  • 1989 American League
  • 1989 National League
  • 1990 American League
  • 1990 National League
  • 1991 American League
  • 1991 National League
  • 1992 American League
  • 1992 National League
  • 1993 American League
  • 1993 National League
  • 1994 American League
  • 1994 National League
  • 1995 American League
  • 1995 National League
  • 1996 American League
  • 1996 National League
  • 1997 Major Leagues
  • 1998 Major Leagues
  • 1999 Major Leagues
  • 2000 Major Leagues
  • 2001 Major Leagues
  • 2002 Major Leagues
  • 2003 Major Leagues
  • 2004 Major Leagues
  • 2005 Major Leagues
  • 2006 Major Leagues
