On the second-last Sunday of the season, the Rays’ Blake Snell threw at the Rogers Centre against the Blue Jays. The Blue Jays telecast was full of praise for Snell, who’s had a breakout season in 2018. He came into his second-last start already with 20 wins, the first pitcher to reach that mark, and his ERA was a league-leading 1.97.
For the last couple of weeks, baseball observers were touting Snell as the new presumptive leader of the American League Cy Young race. Boston’s Chris Sale had been the frontrunner, but two stints on the disabled list meant he made only three starts between the end of the All-Star break and Sept. 10. Snell had usurped him in the ERA race and had those 20 wins to dazzle the eye.
Snell, though, might not be the complete candidate he appears. He had his own injury stint following the All-Star break, missing a couple of turns in the rotation. He also doesn’t pitch as deep into games as other top starters. Snell averages less than six innings per start and reached the six-inning mark in only five of his last 10 starts.
When it comes to the Cy Young Award, I have certain biases. I don’t believe relievers should win because they face significantly fewer batters than starters. I believe the stud starters who go deep into games should get the nod over those who compile a better ERA in fewer innings.
I was searching for a way to quantify my beliefs for this Cy Young season. In addition to the conundrum of the AL race, the National League race has a fascinating candidate with a deficit in a traditional stat. The Mets’ Jacob deGrom finished his season on Wednesday night with a masterful two-hit shutout over eight innings against Atlanta. He will finish with an MLB-best 1.70 ERA but only 10 wins because of his team’s weak offence and bullpen.
As an experiment, I applied Tom Tango’s Game Score 2.0 formula to the full-season stats of every pitcher in both leagues. Game Score is used to measure the single-game success of starting pitchers. But the formula favours the three true outcomes (strikeouts, walks and home runs) that form the basis of FIP calculations. The Game Score formula also credits pitchers for getting outs, limiting baserunners and runs scored.
The only alteration made to the formula for this analysis was to remove the 40-point starting base used for single-game scoring. For this analysis, we are recording points only for the pitchers’ accumulated statistics: 2 points for every out recorded, 1 point for every strikeout, -2 points for each walk and hit, -3 points for each run (earned and unearned) and -6 points for each home run allowed.
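To make the point values concrete, here is a minimal sketch of the season-long score as a standalone R function. The function name `season_gscore` and the stat line passed to it are made up for illustration and do not belong to any real pitcher.

```r
# Hypothetical helper: season-long Game Score 2.0 without the 40-point base.
# All arguments are season totals; outs = innings pitched * 3.
season_gscore <- function(outs, so, bb, h, r, hr) {
  2 * outs +  # 2 points per out recorded
    so -      # 1 point per strikeout
    2 * bb -  # -2 points per walk
    2 * h -   # -2 points per hit
    3 * r -   # -3 points per run, earned or unearned
    6 * hr    # -6 points per home run allowed
}

# Made-up stat line: 200 IP (600 outs), 200 SO, 50 BB, 150 H, 60 R, 15 HR
example_score <- season_gscore(outs = 600, so = 200, bb = 50,
                               h = 150, r = 60, hr = 15)
print(example_score)  # 730
```

Because the function uses only arithmetic operators, it also works on whole columns of a data frame at once.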
I got the 2018 individual pitching stats from Baseball-Reference.com and loaded them into R.
(Data updated through games of Monday, Oct. 1, 2018)
The altered game score formula is applied to create a new column for each pitcher called gScore, for lack of a better name.
# only Game Score components, no base score
p18$gScore <- round(((2 * p18$IP * 3) + p18$SO - (2 * p18$BB) - (2 * p18$H) - (3 * p18$R) - (6 * p18$HR)), 0)
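One caveat, assuming the IP column keeps Baseball-Reference's convention of writing thirds of an inning as .1 and .2: multiplying that value by 3 slightly undercounts the outs for pitchers who finished with a partial inning (217.1 becomes 651.3 rather than 652). The scores in this post use the line as written. The sketch below shows one possible way to convert IP to outs first; `ip_to_outs` is a hypothetical helper, not part of the original analysis.

```r
# Hypothetical helper: convert innings pitched written as 217.1 / 217.2
# (one or two outs into the next inning) to a count of outs.
ip_to_outs <- function(ip) {
  whole <- floor(ip)                  # completed innings
  extra <- round((ip - whole) * 10)   # .1 -> 1 out, .2 -> 2 outs
  whole * 3 + extra
}

# Made-up innings totals for illustration
ip <- c(217.0, 217.1, 180.2)
print(ip_to_outs(ip))  # 651 652 542
print(ip * 3)          # the direct multiplication, for comparison
```

The difference is at most a point or two per pitcher, so it is unlikely to reorder the top of the table, but it could matter for ties.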
For reference, let’s check the max, mean and median of the gScore column.
##  971
##  113.0551
##  65
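The calls that produced those three values are not shown. A minimal sketch of how such a check could be run, using a small made-up data frame in place of `p18`:

```r
# Made-up scores for illustration only; the real values live in p18$gScore
toy <- data.frame(gScore = c(120, 45, 300, -10, 65))

max(toy$gScore)     # 300
mean(toy$gScore)    # 104
median(toy$gScore)  # 65
```

A mean well above the median, as in the real data, points to a distribution pulled upward by a small group of high-volume starters.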
Here are the top 30 (and ties) among pitchers with 20 or more starts, showing the gScore against the traditional W-L, ERA and strikeout stats. (I’ll keep updating these charts through the end of the season but here’s where we stand on Thursday morning.)
Top 30 MLB starters
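# packages used for the tables in this post
library(dplyr)
library(knitr)
library(kableExtra)

cy2018 <- p18 %>%
  select(2, 4, 5, 11, 6, 7, 9, 16, 23, 36) %>%
  filter(GS >= 20) %>%  # 20 or more starts
  top_n(30, gScore) %>%
  arrange(-gScore)

kable(cy2018, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")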
*indicates left-handed pitcher
It’s probably no surprise that deGrom comes out on top. He’s among the league leaders in innings pitched and strikeouts. Despite having only 10 wins, he has a significant gScore lead over the Nationals’ Max Scherzer, the two-time defending NL Cy Young winner. Scherzer’s 300 strikeouts are offset by deGrom’s fewer runs allowed (48 for deGrom to 66 for Scherzer) and home runs allowed (only 10 all season to Scherzer’s 23).
DeGrom’s score of 971 is the third highest total in the last four seasons, bettered only by Clayton Kershaw (1,001 in 2015) and Jake Arrieta (998 in 2015).
The surprise is in the AL results. Snell finished sixth among all MLB pitchers and third in the AL. Snell passed Sale in the final week to push him to fourth. Cleveland’s Corey Kluber, who has also won 20 games now, is fifth in the AL. The runaway AL leader in gScore is the Astros’ Justin Verlander, followed by teammate Gerrit Cole.
Many observers seemed to rate Verlander as just the third or fourth best candidate in the AL race, although that appears to have changed in the final week as he’s become the solid second favourite to Snell. Verlander gets nicked for the 28 home runs he has allowed this season, the second worst mark of his career (he allowed 30 HRs in 2016). But just like 2016, Verlander leads the AL in strikeouts. And like Kluber, he’s taken the ball every five days all season and pitched deep into games with a low hits per nine innings rate.
Voters for the Cy Young Award traditionally favour power pitchers. Since the Cy Young came into being in 1959, a pitcher has won his league’s pitching triple crown (league leader in wins, ERA and strikeouts) 13 times and has been rewarded with the Cy Young every time. The last time it happened was 2011, in both leagues: Verlander with Detroit in the AL and the Dodgers’ Clayton Kershaw in the NL.
For fun, here’s a table of the top 30 relievers who have made 50 or more appearances.
Top 30 relievers
rel2018 <- p18 %>%
  select(2, 4, 5, 10, 6, 15, 9, 16, 23, 36) %>%
  filter(G >= 50) %>%  # 50 or more appearances
  top_n(30, gScore) %>%
  arrange(-gScore)

kable(rel2018, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")
How does this gScore measurement work for previous seasons? If we look at the last three seasons, the gScore matches up with four of the six winners. It missed the 2016 AL race, won by Boston’s Rick Porcello in a bit of a voting controversy over Verlander, and the 2015 NL vote, won by the Cubs’ Jake Arrieta in a tight three-way race with Kershaw and Dodgers teammate Zack Greinke. The tables below show the pitchers’ traditional stats along with their gScore and where they placed in the Cy Young voting.
cy2017AL <- p17 %>%
  select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>%
  filter(Lg == "AL") %>%
  top_n(10, gScore) %>%
  arrange(-gScore)
kable(cy2017AL, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")

cy2017NL <- p17 %>%
  select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>%
  filter(Lg == "NL") %>%
  top_n(10, gScore) %>%
  arrange(-gScore)
kable(cy2017NL, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")

cy2016AL <- p16 %>%
  select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>%
  filter(Lg == "AL") %>%
  top_n(10, gScore) %>%
  arrange(-gScore)
kable(cy2016AL, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")

cy2016NL <- p16 %>%
  select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>%
  filter(Lg == "NL") %>%
  top_n(10, gScore) %>%
  arrange(-gScore)
kable(cy2016NL, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")

cy2015AL <- p15 %>%
  select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>%
  filter(Lg == "AL") %>%
  top_n(10, gScore) %>%
  arrange(-gScore)
kable(cy2015AL, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")

cy2015NL <- p15 %>%
  select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>%
  filter(Lg == "NL") %>%
  top_n(10, gScore) %>%
  arrange(-gScore)
kable(cy2015NL, format = "html", table.attr = 'width="100%"') %>%
  kable_styling(position = "left")
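The six tables differ only in the season's data frame and the league, so the same logic could be wrapped in one function. This is a sketch rather than the code used for the post: `top_by_league` is a hypothetical helper, it assumes columns named `Lg` and `gScore`, it uses dplyr's `slice_max()` in place of `top_n()`, and it leaves out the column selection and `kable()` styling.

```r
library(dplyr)

# Hypothetical helper: top n pitchers (ties kept) in one league, best first.
# Assumes the data frame has columns named Lg and gScore.
top_by_league <- function(df, league, n = 10) {
  df %>%
    filter(Lg == league) %>%
    slice_max(gScore, n = n, with_ties = TRUE) %>%
    arrange(desc(gScore))
}

# Made-up rows for illustration only, not real pitchers or scores
toy <- data.frame(
  Name   = c("A", "B", "C", "D", "E"),
  Lg     = c("AL", "NL", "AL", "AL", "NL"),
  gScore = c(500, 640, 720, 310, 450),
  stringsAsFactors = FALSE
)

top_al <- top_by_league(toy, "AL", n = 2)
print(top_al)
```

With the real data, a call such as `top_by_league(p17, "AL")` would stand in for the filter, top-10 and sort steps of the 2017 AL block.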
It’s really amazing to consider deGrom’s 2018 gScore when compared to the last three seasons. Since Kershaw, Arrieta and Greinke in 2015, no one had come close to crossing 900 points again.
Given that deGrom only has 10 wins, let’s look at one more historical comparison. In 2010, the year that started to orient thinking away from slavishly considering wins and losses, the Mariners’ Felix Hernandez (13-12, 2.27) won the AL Cy Young vote despite his mediocre record. If we examine the gScores in the AL for that season, he compiled 860 points, 150 more than the Angels’ Jered Weaver (13-12, 3.01) and 168 more than Verlander (18-9, 3.37). David Price (19-6, 2.72), who was seventh in gScore points that season, was the Cy Young runner-up, and CC Sabathia (21-7, 3.18) was third with a gScore of only 661. Weaver, who led the league in strikeouts by one over Hernandez, finished fifth.
This application of the Game Score formula seems to do a good job of pushing forward the best candidates and correctly matched four of the last six Cy Young winners. It hews closely to my own biases so I like it, but it certainly won’t satisfy all fans. It likes power pitchers with lots of strikeouts and might not reward a groundball pitcher with fewer Ks. It completely eliminates relievers from Cy Young consideration. But that’s how I like it.
Based on the prevailing wisdom, this measurement will probably split this year’s Cy Young races, correctly predicting deGrom and missing out on Snell. You can compare this gScore with other measures like ESPN’s Cy Young predictor formula and Tango’s predictor that takes wins into account. The BBWAA votes have to be in shortly after the end of the regular season. The results will be announced in November.