Pitcher Workloads

Robert Dvorchak wrote about the sharp rise in teenage Tommy John-type ligament surgeries for the March 6, 2005 Pittsburgh Post-Gazette. He describes a scene at the office of Dr. James Andrews of Birmingham, Alabama, the famous practitioner of this procedure.

When the parents of a young pitcher with a damaged arm visit his clinic, Andrews has them write on a chalkboard when the child started pitching, how many teams he's played for, what baseball camps he attended, how much extra throwing he does in the back yard and other items pertinent to his pitching background. In extreme cases, he found that youngsters are taking off only two weeks a year -- Thanksgiving and Christmas – while concentrating exclusively on baseball.

"I point to that blackboard and tell them why their child is seeing me for surgery. That's when the light goes on for them," Andrews said. "No 13-year-old should have that kind of wear and tear on his arm."

Among his laundry list of recommendations to young pitchers are don't throw a breaking pitch until you shave and don't try to push the speed readings on the radar gun. "I think the radar gun ought to be eliminated in high school," Andrews said.

Can we know what is a safe annual workload for a pitcher? There is still no definitive answer. Experts in baseball statistics continue to come up with what they believe are usable rules of thumb, and we think we are on to something. However, exceptions are common. Some of the conclusions just don’t make complete sense. Other baseball statisticians point to the flaws in the studies. Can we even agree that there is such a thing as too heavy a workload for some pitchers, or that younger pitchers are more vulnerable to overuse than those in their prime baseball years?

The work of Baseball Prospectus writers Rany Jazayerli and Keith Woolner on Pitcher Abuse Points (PAP, a measure of the wear and tear caused by high pitch counts) cannot be dismissed just because Bill James, the godfather of baseball stats, conducted 8 studies in 2004’s The Neyer/James Guide to Pitchers which blasted away at their theory. James used 7 different control groups. Each control group was strongly outperformed by the pitchers identified as most abused! However, James used an old version of PAP that Jazayerli and Woolner admitted was wrong. They feel they have run much better tests on their newest PAP formula. They also claim, and I see their point, that there were fallacies in James’ control groups despite his many attempts to perfect them. They make another point, which I had forgotten but was going to make here anyway: based on the way pitchers are used nowadays, the MLB community appears to have heeded their warning. However, high pitch count abuse was dropping steadily before Jazayerli reported his study in BP around 1999.

Baseball Prospectus has PAP data going back to 1988. Twenty years ago, there were 60 pitchers with over 100,000 Pitcher Abuse Points for the season. In 2007, that number had dwindled to two. The seasons with the steepest drops in PAP-measured abuse among the 30 most abused pitchers in Major League Baseball were 1990 and 2001, although the last three years have seen about as large a three-year decline in such abuse as any measured. Have pitching injuries been reduced over that time? If not, there could be explanations other than that PAP doesn’t matter.

Incidentally, those two pitchers with over 100K PAP in 2007 were Daisuke Matsuzaka and Carlos Zambrano. This season it was just Tim Lincecum and C.C. Sabathia. Number three wasn’t even close. Roy Halladay had only 78,018 PAP.

Tom Verducci of Sports Illustrated came up with the Year-After Effect, which states that any pitcher 25 years old or younger who exceeds any of his previous years’ workloads by more than 40 innings can be expected to be headed for a downturn in his career. His March 4, 2003 article for SI.com stated:

The Year-After Effect is based on a rule of thumb, not exact science. Body type, pitch counts, physical maturity, run support and other elements are important components of the growth and evaluation of young pitchers. But as a rule of thumb, the Year-After Effect should grab the attention of Kansas City and all organizations. Why? Over the previous three years I identified 16 pitchers as high risks. Of those 16 pitchers:

 15 wound up with worse ERAs

 13 won fewer games

 12 threw fewer innings

 seven were put on the disabled list

 their major league win total dropped an average of 30 percent

Verducci doesn’t mention any control group in the study. Why wouldn’t a group of pitchers who suddenly improved enough to increase their workload so heavily be expected to regress the following year? Regressing this overwhelmingly does seem more than what you would expect.

Bill James gives us a clue of how much all pitchers normally declined over the course of their career in that Neyer/James Guide to Pitchers article that tested Jazayerli & Woolner’s PAP system. In terms of Win Shares, he found these average declines in pitchers of four different Win Share levels at four different ages:

                 Age 24   Age 27   Age 30   Age 33
20 Win Shares     11%      41%      43%      14%
16 Win Shares     35%      40%      24%      23%
12 Win Shares     23%      22%       5%      32%
 8 Win Shares      6%      18%      24%      30%

Verducci has since reduced the red-flag threshold to 30 innings. Is that a meaningful number? Phil Birnbaum, in Sabermetric Research, observes, “…if the starter set a new high, he's probably pitching well, which means he's giving up fewer hits, which means that he's not throwing as many more pitches as the 30 inning figure would suggest.”

Without saying so explicitly that I could find, Verducci seems to have moved away from comparing the new output with the previous pro high towards just the previous season.
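As a rough illustration, the rule of thumb discussed above can be written as a simple check. This is only a sketch: the function name, its parameters and the sample innings totals are hypothetical, and it assumes "previous workloads" means each earlier season's total innings, with the 40-inning and 30-inning thresholds passed in as arguments.

```python
def year_after_flag(age, innings_by_season, threshold=40, max_age=25):
    """Flag a pitcher under the Year-After rule of thumb.

    innings_by_season lists season totals, oldest first; the last entry
    is the season being checked. All names here are illustrative.
    """
    if age > max_age or len(innings_by_season) < 2:
        return False
    *previous, latest = innings_by_season
    # Compare the new total with the highest earlier season total.
    return latest - max(previous) > threshold


# Hypothetical 23-year-old whose previous high was 135 innings.
seasons = [120, 135, 170]
print(year_after_flag(23, seasons))                # False: a 35-inning jump is under 40
print(year_after_flag(23, seasons, threshold=30))  # True: the same jump exceeds 30
```

The same jump is flagged or not depending on the threshold alone, which is why the choice between 40 and 30 innings matters.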

Will Carroll of Baseball Prospectus has refined Verducci’s rule of thumb even further (and renamed it “the Verducci Rule”). One of the refinements he makes is taking the minor league innings out of the equation. In a guest column of the LoHud Yankees Blog by Peter Abraham in January of 2008, Carroll proposes an explanation, but practically admits it might be hogwash.

Why are minor league innings any different than major league innings? There are only theories, but the best and most testable center around a selection bias. A pitcher good enough to go over 100 innings in the major leagues is almost by definition a quality pitcher. We know that major league hitters are harder to get out than minor league hitters, not to mention the stress of pitching in front of big crowds. The type of pitcher that can get over 100 innings in the majors is likely to be coasting through the minors on less than his best effort. He’s seldom taxed. He’s seldom forced to bear down or throw long innings. Granted, we don’t know this is the reason why and mathematically and physiologically, it shouldn’t be the case, but until someone can develop a working model for translation, we have to simply ignore those minor league innings.

If you ignore Minor League innings, how does any pitcher work his way up to 200 innings without passing the Verducci threshold? That would take seven years. He would be a free agent before he ever gave his drafting team a 200-inning season. There must be a qualification to Carroll’s method that I am not aware of. Nor do I know the route Carroll took to arrive at the conclusion that we are better off ignoring those Minor League innings. I’m sure it was logical and methodical. Yet it is so counter-intuitive that it needs to be confirmed by an independent study.
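The seven-year figure can be checked with a little arithmetic. The sketch below assumes the pitcher starts from zero counted Major League innings and may add at most about 30 innings per season without raising the red flag. The function name and parameters are hypothetical.

```python
import math


def seasons_to_reach(target_ip, max_jump, start_ip=0):
    """Fewest seasons needed to reach target_ip when each season's total
    may exceed the previous one by at most max_jump innings."""
    remaining = max(0, target_ip - start_ip)
    return math.ceil(remaining / max_jump)


print(seasons_to_reach(200, 30))  # 7 (30, 60, 90, 120, 150, 180, 210)
print(seasons_to_reach(200, 29))  # 7 as well if the flag is raised at exactly 30
```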

Another controversial aspect of the Verducci Rule is the age at which a pitcher suddenly becomes less vulnerable to a large increase in workload. Is it 25, as originally proposed by Verducci, or is it 23? A Baseball Prospectus study by Nate Silver and Will Carroll shows that pitchers suddenly become far less prone to injury at 23 years old.

What we really want to know is the optimal way of building a productive starting pitcher. The best way to solve these mysteries would be to find similar pitchers who were handled differently and see which paths had more success. Unfortunately, identifying such a controlled grouping of pitchers is probably impossible. No two pitchers are alike and there are significant details about each pitcher that I have little or no idea about – such as the smoothness and efficiency of their delivery, the amount of pitching they did before they went professional, how they exercised, and what they ate and drank. Those things likely matter more than whether they increased their Major League pitching load by 35 innings instead of 25 innings one year. Just the fact that two nearly identical pitchers were pushed at different paces could indicate that one was deemed better suited to handle the extra workload due to some of these circumstances unknowable to us.

Undaunted, I set out to see what I could see, equipped with an Excel spreadsheet and Windows Explorer tabs on Baseball Reference, Baseball Cube, FanGraphs, and a Body Mass Index Calculator from the National Heart Lung and Blood Institute. At my side were the Neyer/James Guide to Pitchers and a library of STATS Notebooks and Sickels’ Baseball Prospect Books, just to check each pitcher’s favourite weapons. Although Will Carroll himself kindly advised me to narrow my search, there were many burning inter-related issues to explore. I could not resist approaching them together.

One contribution I set out to provide is an improvement in the data set selection. Most studies start out with the pitchers who appeared on a most abused list. Others studied pitchers selected from many decades ago when pitchers were used differently and our training knowledge was more primitive. Another BP study began with active pitchers listed in a 1997 almanac who had accumulated 175 innings by the age of 22. That is mixing in almost all the pitchers who started their major league careers at a young age in the mid 90s with only the most successful pitchers who started their careers in earlier times. Furthermore, 175 innings by 22 would rule out the university guys and the late bloomers. We want to know if we can identify any pitcher who has been overworked. I looked at all the pitchers who pitched their first 162 inning season in any year from 1990-2005. 162 innings was chosen because I wanted to include all the pitchers who were given a real shot at making it as a starter. 162 innings happens to be the ERA qualifying amount since expansion began (except for strike years). I started in 1990, because too much about pitching has changed since the 80s. I ended in 2005, because we need some year-after results to measure.

Among other things, I looked at three different ways of measuring a pitcher’s largest jumps in innings pitched and the gain or decline in usage after those jumps. The first is the original Verducci method of combining total innings from both the Minor and Major Leagues and comparing them to any previous season. The second is the more refined Carroll method of comparing only Major League innings. The third is my own rough rule of combining MLB and Minor League innings, but only counting the largest jumps in seasons spent predominantly at the Major League level. I further refined Carroll’s method and my own by comparing one year’s innings total to the previous year’s innings total if that was the pitcher’s previous high of the last three years. If it wasn’t, I used the average of that previous year’s innings total and the higher innings total of the previous two seasons. You may feel that it is unfair to take this liberty with Carroll’s method, but a) I don’t know the precise details of his method and b) I can’t imagine how it could be a significantly objectionable adjustment.
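The baseline adjustment described above can be sketched in a few lines. This is one reading of the rule rather than the exact spreadsheet logic: it assumes "the previous two seasons" means the two seasons before the previous year, and the function names are hypothetical. The sample totals are the Bronson Arroyo figures quoted later in this article, from his first 162 IP season onward, with "almost 241" rounded to 241.

```python
def baseline(prior):
    """Baseline workload from up to three prior seasons (oldest first).

    Uses last season's total if it was the high of the window; otherwise
    averages last season with the higher of the two seasons before it.
    """
    recent = prior[-3:]
    last, earlier = recent[-1], recent[:-1]
    if not earlier or last >= max(earlier):
        return float(last)
    return (last + max(earlier)) / 2


def largest_jump(innings):
    """Return (season index, jump) for the biggest increase over baseline.

    Needs at least two seasons of data.
    """
    jumps = [(innings[i] - baseline(innings[:i]), i) for i in range(1, len(innings))]
    jump, index = max(jumps)
    return index, jump


arroyo = [179, 205, 241, 211, 200]  # combined innings, first 162 IP season onward
print(baseline([205, 241, 211]))    # 226.0: last year was not the three-year high
print(largest_jump(arroyo))         # (2, 36.0): the jump from 205 to 241
```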

Note that I am comparing the same set of pitchers – not just the pitchers who had a particular level of abuse by one measuring method or the other. Hence, neither method is getting more talented or more durable pitchers than the other. Obviously, though, where the results differ will be with pitchers at different stages of their careers.

So far, I have only scrounged up the time to gather data for qualifying pitchers whose last names begin with “A”, “B”, or “C”. What can these 53 pitchers tell us so far? Just entering the numbers by hand, I noticed that the Carter method (my own third rule above) of measuring jumps in usage correlated much more strongly with a precipitous decline in subsequent usage than the Carroll method did. Verducci’s original Year-After method falls somewhere in between. Summarizing the data for these pitchers bears this out.

162+ IP in any season ‘90-‘05

Abuse Measuring Method                                Avg. Age   Avg. Max Inning   Yr. +1 Inning   Yr. +2 Inning
                                                      MaxAb      Increase          Increase        Increase
Carter (ignore max >3 yrs. old; MLB level)            25.6        51               -42.4           -66.7
Carroll (ignore max >3 yrs. old & MLB innings only)   24.5       123               -12.7           -30.0
Verducci (compared to prev. career max)               22.8        51               -33.4           -20.3

I made all sorts of conjectures from this data. I refined the comparisons, separating the more highly prized pitchers from the fellows who were happy to crack a rotation (based on draft round, Baseball America Top 100 ranking and K:BB – HR9 in their first 162 inning season). I separated the pitchers into age groups. I separated them into degrees of abuse. However, all the comparisons involving these three variations of the Verducci Rule have an underlying fault. The data selected has a natural bias against the Verducci method and in favour of the Carter method. That is because the selection of pitchers for this study is based on having a 162 inning season in the Majors, and the Carter method requires a predominantly Major League season, while the Carroll method requires Major League innings.

As you can see from the average age of max abuse, the Carter-measured jump generally comes a year after pitchers have pitched enough in the Majors to be tallied for the Carroll method. That is mostly because starters are frequently first called up at some point in the middle of a season – if not broken in as relievers. They may pitch somewhere between 80 and 150 MLB innings that first season. They might even pitch 60, go back to the minors before returning to pitch 110 in the majors the following year, then 160+ in year three. In any case, after doing all the studies, thinking about them, and writing about them, I realized the reason the Yr. +1 decline with the Carroll method is so small is that the Carroll jump will so frequently be followed by the 162 inning season that qualified the pitcher for this study.

The Verducci-measured jump frequently comes in a minor league season. Hence, that first 162 IP season is often even further into the future, and it is the one method where the decline in the second year is less pronounced. Let’s call a pitcher’s first 162 IP season Y1. In fact, only one of the 53 pitchers had a maximum increase after his Y1. Bronson Arroyo logged an up-till-then career high 179 IP in his Y1 – his first full season in the majors. The next year he jumped to 205 innings, then almost 241 innings in Y1 + 2. Previous to Y1, Arroyo’s highest inning total in the Majors was 88, so Y1 is his max increase by Carroll’s method of ignoring minor league innings. His usage greatly increased the two seasons which followed. However, his max increase when you include minor league innings was two years later. That season was followed by a decrease in usage (and performance) to 211 and 200 innings. Of course, a sample of one does not make a study.

By eliminating pitchers who had their largest Carroll jump before their Y1, we can at least compare the Carroll and Carter methods. There were nine pitchers who had their largest Carroll jump on or after their Y1 and in a different year than the one measured using the Carter method. All nine did worse the year after their Carter-measured jump than after the Carroll-measured jump. Eight of the nine did even worse relative to Carroll’s method two years later. Combining those nine with all the others who had their max jumps after Y1 we get:

pitchers w/ max jump > Y1

Abuse Measuring Method                                Avg. Age   Avg. Max Inning   Yr. +1 Inning   Yr. +2 Inning
                                                      MaxAb      Increase          Increase        Increase
Carter (ignore max >3 yrs. old; MLB level)            25.4        50               -37.8           -64.7
Carroll (ignore max >3 yrs. old & MLB innings only)   24.5       132               -20.0           -38.7

At first glance these numbers are roughly in line with James’ chart of normal Win Share declines that I showed earlier. This puts the entire year-after effect into question.

                 Age 24   Age 27   Age 30   Age 33
20 Win Shares     11%      41%      43%      14%
16 Win Shares     35%      40%      24%      23%
12 Win Shares     23%      22%       5%      32%
 8 Win Shares      6%      18%      24%      30%

The average number of innings pitched in Y1 of my test group is 194. Typically, a 194 inning pitcher will rack up about 9 or 10 Win Shares. However, many of those 194 innings include minor league innings. It is much more reasonable to estimate that these pitchers averaged about 8 Win Shares in their year of greatest abuse. Interpolating James’ chart at the ages of their highest abuse according to our methods (24-26), we should expect about a 6-14% decline amongst such a group of pitchers. Converting my data to align with this chart, the Carter method of measuring abuse showed a 19½% decrease in these pitchers during the first year after their max abuse, then a 33% decrease in the second year after. The Carroll method showed just what you would expect from any 8 Win Share pitcher the year after his age 24 or 25 season (not just abused ones) – a 10% decline, then a 20% decline in the second year. There you have my evidence that minor league innings do matter in measuring abuse by yearly IP increases.
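The arithmetic behind those figures can be sketched as follows. Two things here are assumptions: that the 6-14% range comes from linear interpolation along the 8 Win Share row of James' chart, and that the inning changes were converted to percentages by dividing by the 194-inning Y1 average. Both appear to reproduce the numbers quoted above. The names are hypothetical.

```python
# 8 Win Share row of James' chart: age -> typical percent decline.
JAMES_8_WS = {24: 6.0, 27: 18.0, 30: 24.0, 33: 30.0}


def expected_decline(age, row=JAMES_8_WS):
    """Linearly interpolate the typical percent decline for a given age."""
    ages = sorted(row)
    if age <= ages[0]:
        return row[ages[0]]
    if age >= ages[-1]:
        return row[ages[-1]]
    for lo, hi in zip(ages, ages[1:]):
        if lo <= age <= hi:
            return row[lo] + (age - lo) * (row[hi] - row[lo]) / (hi - lo)


def pct_change(inning_change, base_ip=194):
    """Express a change in innings as a percentage of the baseline workload."""
    return 100 * inning_change / base_ip


print([expected_decline(age) for age in (24, 25, 26)])  # [6.0, 10.0, 14.0]

# Yr. +1 and Yr. +2 inning changes for pitchers with a max jump after Y1.
for method, changes in {"Carter": (-37.8, -64.7), "Carroll": (-20.0, -38.7)}.items():
    print(method, [round(pct_change(c), 1) for c in changes])
```

Under these assumptions the Carter changes come out near 19.5% and 33%, and the Carroll changes near 10% and 20%.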

What about the age nexus? Carroll and Verducci only look at younger pitchers, while I don’t look at any cut-off age. Most of the maximum seasonal inning increases in this study occurred when the pitcher was 23 years or older. As mentioned before, Silver & Carroll suggest that a pitcher’s injury nexus occurs before the age of 23. Tom Verducci only red flags pitchers 25 and under. If I use the well researched 22 and under cut off, my list, so far, would include only four pitchers (Rick Ankiel, Kevin Appier, Steve Avery and Mark Buehrle) who had career high jumps by all three measures before the age of 23. Extending the cut-off age to 23 allows us to add 7 more arms – hopefully enough to give the data some significance: Wilson Alvarez, Andy Benes, Jeremy Bonderman, Bartolo Colon, Steve Cooke, and Nate Cornejo.

13 youngest pitchers in study

Abuse Measuring Method                                Avg. Age   Avg. Max Inning   Yr. +1 Inning   Yr. +2 Inning
                                                      MaxAb      Increase          Increase        Increase
Carter (ignore max >3 yrs. old; MLB level)            22.4        48               -32.5           -65.0
Carroll (ignore max >3 yrs. old & MLB innings only)   22.0       142               -23.0           -44.0

As we expected, the year-after effects are stronger across the board. However, the increases in decline are so minuscule that I am not convinced they are significant – certainly not with these few pitchers. Rationalizing this result, injuries and decline from abuse are likely offset by the increases in productivity that healthy pitchers this age normally have. Perhaps younger pitchers recover from their injuries more strongly as well.

What if we go back to the larger test group, then narrow the list down to pitchers who have not had a large jump in innings? None of the pitchers in the “young group” was spared a jump of at least 40 innings at some point in his career – and that’s not even considering Carroll’s MLB IP only method. In Verducci/Carter jumps, Appier (40/37) and Bonderman (32/45) were the young pitchers closest to having no large jumps. Verducci and Carroll both raise the red flag at jumps of 30 or more innings. The pitchers in the original larger group who never experienced a Verducci/Carter jump of more than 30 innings are:

Rene Arocha – converted to relief

Joe Blanton – held to a max increase of 29 innings as both a 23 year old to no ill effect and as a 26 year old after which he pitched 15 fewer innings. That was this past season where he struggled mid season with Oakland, but revived himself in Philadelphia pitching effectively all the way through the World Series.

Ricky Bones – appears to have had a relatively healthy career. Except for one lucky season, which actually included a decrease in innings from his previously established level, he just wasn’t a very good pitcher.

That’s it! Depending on how far off my estimates are of various pitchers’ amateur workloads, these three might also qualify:

Rick Ankiel – after 7 Midwest League starts was promoted to the high A Carolina League for 21 more starts logging 161 innings as a tender young 18 year old super prospect. Sure, he is Floridian and may well have pitched more than the estimated 125 innings in a previous year, but 18 is young to push a pitcher, isn’t it? You know his story, in the Majors by the end of the following season, 175 innings of 3.50 ERA in his first full season, then three appearances in the play-offs on top of that. Voila: severe arm troubles, Tommy John surgery, etc., and eventually a new life as a starting MLB outfielder!

Jason Bere was a year and three months older than Ankiel when he pitched his first year in pro ball – which consisted of a similar 27 starts and 163 innings. He attended a Community College after high school in his native state of Massachusetts. He may not have established himself at the 125 inning level as a teenager. Early in Bere’s third pro season he established himself as a fine starter and continued that success into his first full MLB season. After that, he struggled to stay in the majors for the rest of his career.

Rheal Cormier pitched 170 innings his first professional season. Considering Cormier is from New Brunswick (eastern Canada) and came south to pitch for Community College of Rhode Island, 125 innings could be a very generous estimate of his previous annual workload. Otherwise, it is worth noting that Cormier’s highest jump in innings came at age 29 – a mere 30 innings. He missed almost all of the next season, but the left-hander came back as a reliever and lasted in the Majors until he was 40.

max increase < 30 IP

Abuse Measuring Method                                Avg. Age   Avg. Max Inning   Yr. +1 Inning   Yr. +2 Inning
                                                      MaxAb      Increase          Increase        Increase
Carter (ignore max >3 yrs. old; MLB level)            24.2        25               -55.8           -58.4

There was no point in including my interpretation of Carroll’s method in this box of the least abused pitchers. The young pitcher with the smallest max jump in Major League innings only was Brian Bohanon with a jump of 49 innings.

I’m not sure what to make of this particular test. Unexpectedly, the year-after effect increased in the first year. Let’s chalk this up to a small sample size. Rheal Cormier missed almost the entire season after his maximum jump. If you have any other suggestions besides “keep trying with more data”, please let me know.

Conclusion: My tests here indicate Minor League IP do indeed count when assessing a pitcher’s likely decline in innings over the following two seasons based on a large increase in innings. Before I continue with the numerous tests regarding optimal starting pitcher usage, I would like to settle this point. Hence, I will pause in my study here until I hear from Will Carroll, Nate Silver, or anyone in the Sabermetric community who reads this.

In the meantime, I found no strong indication of a magical 30+ inning jump or age 23 threshold, but more pitchers need to be studied. A younger pitcher’s greater fragility may be off-set by his greater natural progress and/or his greater ability to recover from injury.

This entire theory of identifying likely abused pitchers by a large jump in innings may be exaggerated. It may have been a considerable phenomenon in the past, but in the pitchers I have looked at so far since 1990, it is not shouting. Perhaps most Major League teams these days are already leery of overworking their pitchers in all manners. They do not let it happen to pitchers they feel cannot handle the additional workload.