2/28/12

Sabermetrics


By:  Mack Ade 

Can you write a book about baseball and not devote a chapter to sabermetrics?

Bill James is known as the father of sabermetrics, which he started to formulate in the 1970s through his series of Baseball Abstracts[i] essays. So much of what we today call sabermetrics is credited to people like James, but did you know that the two people who invented the calculation for OBP were Branch Rickey and Allan Roth, who worked for Rickey and the Brooklyn Dodgers in the 1950s?

Also, some purists credit Johns Hopkins engineering professor Earnshaw Cook as being the saber-daddy. In 1964, Cook wrote Percentage Baseball[ii], which began as a thesis to prove that Ty Cobb was a better player than Babe Ruth. It was also the first baseball book to integrate advanced mathematics into baseball theory. Most of baseball thought Cook was bonkers, and even James later wrote that, in his opinion, everything Cook had written was flawed.

If you want, we can go back to 1916, when someone named F.C. Lane invented a formula for determining a player's run value:

                   total run value = (.30*1B) + (.60*2B) + (.90*3B) + (1.15*HR)
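
For anyone who wants to play with the formula above, here is a minimal sketch in Python. The function name and the sample season line are made up for illustration; only the four coefficients come from the formula as quoted.

```python
def lane_run_value(singles, doubles, triples, home_runs):
    """Total run value using the 1916 coefficients quoted above."""
    return (0.30 * singles
            + 0.60 * doubles
            + 0.90 * triples
            + 1.15 * home_runs)


# Hypothetical season line, invented purely for illustration:
# 100 singles, 30 doubles, 5 triples, 20 home runs.
example = lane_run_value(100, 30, 5, 20)
print(round(example, 2))
```

Plug in any hitter's singles, doubles, triples, and home runs to get his number.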

Many feel this was the foundation for what people like Henry Chambers, Cook, Roth, and James developed over the years. Others feel they were just random math-heads who were looking for additional ways to evaluate the value of an individual baseball player.

James defined sabermetrics as "the search for objective knowledge about baseball." His first book, Baseball Abstract: Featuring 18 Categories of Statistical Information That You Just Can't Find Anywhere Else (1977), was copied on a mimeograph machine and hand-stapled before mailing. You can find some of the early 1980s printings on eBay, but the 70s printings are extremely rare. He wrote it while working as a night watchman.

The 1977 printing was dominated by monthly statistics on the 1976 pros. Everyday player stats were games, at-bats, runs, hits, doubles, triples, home runs, stolen bases, and batting average. Pitchers were broken out monthly by wins and losses.

Other subjects covered in that classic publication were the separation of errors by throwing and catching, the number of home runs a pitcher throws per 100 innings, and which pitchers draw the most crowds.



It took a while, but by the early 1980s Bill James was becoming a household name in the offices of baseball executives. He was eventually hired (2003) by the Boston Red Sox and continues to sit upon the sabermetrics throne.



I have mixed thoughts on all this.



Don’t get me wrong. I love statistics and I believe they are the only way you can separate the best from the rest. I just wonder if we’re all starting to go a little too far in this area.



My favorite is OPS. You add a player's on-base percentage to their slugging percentage. What a simple stroke of genius. But do I really care how many walks a player had after having two strikes on them with the wind blowing at least 20 MPH and the food vendor named Maury was taking a shit when the pitch was made?
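
If you want to see how little math OPS takes, here is a rough sketch in Python. It assumes the standard definitions of on-base percentage (times on base divided by at-bats plus walks, hit-by-pitches, and sacrifice flies) and slugging percentage (total bases divided by at-bats). The function names and the sample stat line are invented for illustration.

```python
def on_base_pct(hits, walks, hbp, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


def slugging_pct(singles, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def ops(singles, doubles, triples, home_runs, walks, hbp, at_bats, sac_flies):
    """OPS is simply OBP plus SLG."""
    hits = singles + doubles + triples + home_runs
    return (on_base_pct(hits, walks, hbp, at_bats, sac_flies)
            + slugging_pct(singles, doubles, triples, home_runs, at_bats))


# Hypothetical hitter, made up for illustration: 500 AB, 100 singles,
# 30 doubles, 5 triples, 20 HR, 60 BB, 5 HBP, 5 SF.
print(round(ops(100, 30, 5, 20, 60, 5, 500, 5), 3))  # prints 0.896
```

Two fractions and one addition, and no wind speed required.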



There still has to be a place in this game for gut feeling. I hated the way the movie Moneyball made all the Oakland Athletics’ scouts look like some ancient mariners of the game. I know many of these scouts. They work very hard and are experts of the game. In addition, many are ex-professional baseball players who have played the game that most (or all) of the sabermetric geeks have only watched from afar.



I’m not writing this to tell stories, but I want to tell you about a ballplayer named Maikel Cleto. He pitched for the Savannah Sand Gnats in 2008. A quiet kid, Cleto really didn’t do anything that special that year: 25-G, 22-ST, 5-11, 4.25, 135.2-IP, 81-K. He also pitched for one of the worst minor league teams ever assembled on the same field; however, if you were assigned to scout the team when he pitched, you learned very quickly that:



1.     He would sit 94-95 throughout his entire outing

2.     He had serious drop on the fastball as well

3.     He was an “innings eater” (averaged 6.4-IP per outing in A-ball)

4.     He was doing all this at 19 years old.



These simply weren’t things you would learn reading a stat-sheet somewhere in front of some computer. There was more.



I attended every Savannah home game that year and sat behind home plate with the scouts. As the season wore on, it became quite apparent that Cleto had a scout following. By the time he pitched his last game for Savannah (before being promoted to St. Lucie), there were well over double the number of scouts running their guns while Maikel pitched.



You see, even if you couldn’t figure out on your own that this kid had a big ceiling, the traffic around him made you look twice.



Cleto wound up being packaged that December in the deal that brought J.J. Putz, Sean Green, and Jeremy Reed to the Mets. Trust me. He never would have been in that deal without a first-hand report by a scout who sat with me. And yes, he made it to the majors in September 2011.



You’re just not going to see stuff like this on a stat sheet. Also, we always talk about recognizing a player for having talent. In truth, the job of a scout is to also recognize those that don’t have what it takes to make it all the way.



This is important. The percentage of minor league players who make it big in the pros is well below 10%. So, the major job of every scout is to identify the players who have no chance of making it. It is one thing to get credit for someone who plays big, but being the guy who was responsible for a team trading for a bust is a resume builder.



We’ll spend more time on this subject later.

1 comment:

Hobie said...

Much of sabermetrics reminds me of astrology: a lot of sophisticated mathematics to identify a situation or pattern, which, though interesting in itself (sometimes very interesting), is then applied subjectively as a predictive criterion.

The 1916 “total run value” is a case in point. It’s basically 3/10ths of total bases except the HR coefficient is 1.15 instead of 1.20. Why is the “run value” of a double twice (and a triple three times) the value of a single but a HR just shy of four? And why 3/10ths? Perhaps these coefficients applied to all the hits in 1915 yielded all the runs scored that year in MLB (congrats on that observation), so it just IS.

And if broken down to team-by-team stats, we find the coefficients need some tweaking, so we find a “park factor” to more closely correlate our newfound metric with outcome. It just seems to be effect bringing about cause.