By: Mack Ade
Can you write a book about baseball and not devote a
chapter to sabermetrics?
Bill James is known as the father of sabermetrics, which he started to
formulate in the 1970s through his series of Baseball
Abstracts[i]
essays. So much of what we today call sabermetrics is credited to people
like James, but did you know that the two people who invented the calculation
for OBP were Branch Rickey and Allan Roth, who worked for
Rickey and the Brooklyn Dodgers in the 1950s?
Also, some purists credit Johns Hopkins
engineering professor Earnshaw Cook with
being the saber-daddy. In 1964, Cook wrote Percentage
Baseball[ii],
which began as a thesis to prove that Ty Cobb was
a better player than Babe Ruth. It was also the first baseball book
to integrate advanced mathematics into baseball theory. Most of baseball
thought Cook was bonkers, and even James later wrote that, in his opinion,
everything Cook had written was flawed.
If you want, we can go back to 1916, when someone
named F.C. Lane invented a formula for determining a player's run value:
total run value = (.30*1B) + (.60*2B) + (.90*3B) + (1.15*HR)
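For anyone who wants to play with the formula above, here is a minimal Python sketch of it. The weights are the ones quoted in this post; the function names and the sample season line are made up for illustration and do not come from Lane or any real player. The second function computes total bases so the two numbers can be compared side by side.

```python
# Weights as quoted in this post for F.C. Lane's 1916 formula.
LANE_WEIGHTS = {"1B": 0.30, "2B": 0.60, "3B": 0.90, "HR": 1.15}


def lane_run_value(singles, doubles, triples, home_runs):
    """Total run value using the four weights quoted above."""
    return (
        LANE_WEIGHTS["1B"] * singles
        + LANE_WEIGHTS["2B"] * doubles
        + LANE_WEIGHTS["3B"] * triples
        + LANE_WEIGHTS["HR"] * home_runs
    )


def total_bases(singles, doubles, triples, home_runs):
    """Standard total bases: 1 per single, 2 per double, 3 per triple, 4 per homer."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


if __name__ == "__main__":
    # Hypothetical season line, invented purely for illustration.
    line = dict(singles=100, doubles=30, triples=5, home_runs=20)
    print(f"Lane run value:     {lane_run_value(**line):.2f}")
    print(f"0.30 x total bases: {0.30 * total_bases(**line):.2f}")
```

Because the single, double, and triple weights are exactly 0.30 per base, the two printed numbers differ only by 0.05 for every home run.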
Many feel this was the foundation on which people
like Henry
Chambers, Cook, Roth, and James built over the years. Others feel
they were just random math-heads who were looking for additional ways to
evaluate the value of an individual baseball player.
James defined sabermetrics
as "the search for objective knowledge about baseball." His first book, Baseball Abstract: Featuring 18 Categories
of Statistical Information That You Just Can't Find Anywhere Else (1977),
was copied on a mimeograph machine and hand-stapled before mailing. You can
find some of the early 1980s printings on eBay, but the 70s printings are
extremely rare. He wrote it while working as a night watchman.
The 1977 printing was dominated by monthly
statistics on the 1976 pros. Everyday
player stats were games, at-bats, runs, hits, doubles, triples, home runs,
stolen bases, and batting average. Pitchers were broken out monthly for wins,
losses.
Other subjects covered in that classic publication were the separation of errors by
throwing and catching, the number of home runs a pitcher throws per 100
innings, and which pitchers draw the most crowds.
It took a while, but by the early 1980s Bill James was becoming a household name
in the offices of baseball executives. He was eventually hired (2003) by the
Boston Red Sox and continues to sit upon the sabermetrics throne.
I have mixed thoughts on all this.
Don’t get me wrong. I love statistics and I believe they are the only way you can
separate the best from the rest. I just wonder if we’re all starting to go a
little too far in this area.
My favorite is OPS. You add a player's on-base percentage to their slugging
percentage. What a simple stroke of genius. But do I really care how many
walks a player had after having two strikes on them with the wind blowing at
least 20 mph while the food vendor named Maury was taking a shit when the pitch
was made?
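Since OPS really is just one addition, it makes for a short sketch in Python. The post only describes the final step (on-base percentage plus slugging percentage); the two helper formulas below are the standard modern definitions of OBP and SLG, added here as an assumption so the example is complete. The function names and the sample stat line are hypothetical.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: times on base divided by the plate appearances that count."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Standard SLG: total bases divided by at-bats."""
    return (singles + 2 * doubles + 3 * triples + 4 * home_runs) / at_bats


def ops(obp, slg):
    """OPS as described in the post: on-base percentage plus slugging percentage."""
    return obp + slg


if __name__ == "__main__":
    # Hypothetical season line, invented purely for illustration.
    obp = on_base_percentage(hits=150, walks=50, hit_by_pitch=5, at_bats=500, sac_flies=5)
    slg = slugging_percentage(singles=95, doubles=30, triples=5, home_runs=20, at_bats=500)
    print(f"OBP {obp:.3f} + SLG {slg:.3f} = OPS {ops(obp, slg):.3f}")
```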
There has to still be a place in this game for gut feeling. I hated the way the
movie Moneyball made all the
Oakland Athletics’ scouts look like some ancient mariners of the game. I know
many of these scouts. They work very hard and are experts of the game. In
addition, many are ex-professional baseball players who have played the game
that most (or all) of the sabermetric
geeks have only watched from afar.
I’m not writing this to tell stories, but I want to tell you about a ballplayer named Maikel Cleto. He pitched for the Savannah Sand
Gnats in 2008. A quiet kid, Cleto really didn’t do anything that special that
year: 25 games, 22 starts, a 5-11 record, a 4.25 ERA, 135.2 innings pitched, and 81 strikeouts. He also pitched for one of the
worst minor league teams ever assembled on the same field; however, if you were
assigned to scout the team when he pitched, you learned very quickly that:
1. He would sit 94-95 throughout his entire outing.
2. He had serious drop on the fastball as well.
3. He was an “innings eater” (averaged 6.4-IP per outing in A-ball).
4. He was doing all this at 19 years old.
These simply weren’t things you would learn reading a stat-sheet somewhere in front
of some computer. There was more.
I attended every Savannah home game that year and sat behind home plate with the
scouts. As the season wore on, it became quite apparent that Cleto had a scout
following. By the time he pitched his last game for Savannah (before being
promoted to St. Lucie), there were well over double the number of scouts running their
radar guns while Maikel pitched.
You see, even if you couldn’t figure out on your own that this kid had a big
ceiling, the traffic around him made you look twice.
Cleto wound up being packaged that December in the deal that brought J.J. Putz, Sean
Green, and Jeremy Reed to the Mets. Trust me. He never would have
been in that deal without a first-hand report by a scout who sat with me. And
yes, he made it to the majors in September 2011.
You’re just not going to see stuff like this on a stat sheet. Also, we always talk
about recognizing a player for having talent. In truth, the job of a scout is
also to recognize those who don’t have what it takes to make it all the way.
This is important. The percentage of minor league players who make it big in the
pros is well below 10%. So, the major job of every scout is to identify the players
who have no chance of making it. It is one thing to get credit for someone
who plays big, but being the guy who was responsible for a team trading for
a bust is a resume builder.
We’ll spend more time on this subject later.
Much of sabermetrics reminds me of astrology: a lot of sophisticated mathematics to identify a situation or pattern, which, though interesting in itself (sometimes very interesting), is then applied subjectively as a predictive criterion.
The 1916 “total run value” is a case in point. It’s basically 3/10ths of total bases except the HR coefficient is 1.15 instead of 1.20. Why is the “run value” of a double twice (and a triple three times) the value of a single but a HR just shy of four? And why 3/10ths? Perhaps these coefficients applied to all the hits in 1915 yielded all the runs scored that year in MLB (congrats on that observation), so it just IS.
And if broken down to team-by-team stats, we find the coefficients need some tweaking, so we find a “park factor” to more closely correlate our newfound metric with outcome. It just seems to be effect bringing about cause.