A consideration of the 2018 Cy Young races

On the second-last Sunday of the season, the Rays’ Blake Snell threw at the Rogers Centre against the Blue Jays. The Blue Jays telecast was full of praise for Snell, who’s had a breakout season in 2018. He came into his second-last start already with 20 wins, the first pitcher to reach that mark, and his ERA was a league-leading 1.97.

For the last couple of weeks, baseball observers were touting Snell as the new presumptive leader of the American League Cy Young race. Boston’s Chris Sale had been the frontrunner, but two stints on the disabled list meant he made only three starts between the end of the All-Star break and Sept. 10. Snell had usurped him in the ERA race and had those 20 wins to dazzle the eye.

Snell, though, might not be the complete candidate he appears. He had his own injury stint following the All-Star break, missing a couple of turns in the rotation. He also doesn’t pitch as deep into games as other top starters. Snell averages fewer than six innings per start and reached the six-inning mark in only five of his last 10 starts.

When it comes to the Cy Young Award, I have certain biases. I don’t believe relievers should win because they face significantly fewer batters than starters. I believe the stud starters who go deep into games should get the nod over those who compile a better ERA in fewer innings.

I was searching for a way to quantify my beliefs for this Cy Young season. In addition to the conundrum of the AL race, the National League race has a fascinating candidate with a deficit in a traditional stat. The Mets’ Jacob deGrom finished his season on Wednesday night with a masterful two-hit shutout over eight innings against Atlanta. He will finish with an MLB-best 1.70 ERA but only 10 wins because of his team’s weak offence and bullpen.

As an experiment, I applied Tom Tango’s Game Score 2.0 formula to the full-season stats of every pitcher in both leagues. Game Score is normally used to measure the single-game success of starting pitchers. The formula leans on the three true outcomes (strikeouts, walks and home runs) that form the basis of FIP calculations, and it also credits pitchers for getting outs and for limiting baserunners and runs scored.

The only alteration made to the formula for this analysis was to remove the 40-point starting base used for single-game scoring. For this analysis, we are recording points only for the pitchers’ accumulated statistics: 2 points for every out recorded, 1 point for every strikeout, -2 points for each walk and hit, -3 points for each run (earned and unearned) and -6 points for each home run allowed.
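To make the scoring rule concrete, here is a minimal sketch of it as a stand-alone R function, checked against deGrom's season line. The function name `g_score` and its argument names are illustrative, not part of the original analysis. His innings, strikeouts, runs and home runs are quoted in this article; the walk and hit totals (46 and 152) are not, and are assumed from his public stat line.

```r
# Season-total Game Score with the 40-point base removed.
# `outs` is outs recorded (innings pitched x 3); `r` counts all runs,
# earned and unearned.
g_score <- function(outs, so, bb, h, r, hr) {
  2 * outs + so - 2 * bb - 2 * h - 3 * r - 6 * hr
}

# deGrom, 2018: 217 IP (651 outs), 269 SO, 48 R, 10 HR are from the article;
# 46 BB and 152 H are assumptions taken from his public stat line.
degrom <- g_score(outs = 217 * 3, so = 269, bb = 46, h = 152, r = 48, hr = 10)
print(degrom)  # 971
```

Under those assumed inputs the result agrees with the 971 shown for deGrom in the starters table.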

I got the 2018 individual pitching stats from Baseball-Reference.com and loaded them into R.
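One detail worth sketching: Baseball-Reference writes innings pitched in baseball notation, where .1 and .2 mean one and two thirds of an inning, while the tables in this post show true decimals such as 220.67. A conversion step along the following lines is presumably needed before multiplying IP by 3 to get outs. The inline text stands in for an exported CSV file, and the helper name `ip_to_decimal` is hypothetical.

```r
# Stand-in for a Baseball-Reference CSV export; a real file would be read
# with read.csv() on the downloaded file. Columns shown are illustrative.
raw <- read.csv(text = "Name,IP,SO
Jacob deGrom,217.0,269
Max Scherzer,220.2,300
Blake Snell,180.2,221", stringsAsFactors = FALSE)

# Convert baseball notation (.1 = one third, .2 = two thirds) to decimals.
ip_to_decimal <- function(ip) {
  whole  <- floor(ip)
  thirds <- round((ip - whole) * 10)  # 0, 1 or 2 outs into the next inning
  whole + thirds / 3
}

raw$IP   <- round(ip_to_decimal(raw$IP), 2)
raw$outs <- round(raw$IP * 3)  # outs recorded, the unit the formula scores
print(raw)
```

Skipping this step would treat 220.2 as 220.2 innings rather than 220 and two thirds, which shaves a point or two off any pitcher whose innings total ends in a partial inning.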

(Data updated through games of Monday, Oct. 1, 2018)

The altered game score formula is applied to create a new column for each pitcher called gScore, for lack of a better name.

# only Game Score components, no base score
p18$gScore <- round(((2 * p18$IP * 3) + p18$SO - 
  (2 * p18$BB) - (2 * p18$H) - (3 * p18$R) - (6 * p18$HR)), 0)

For reference, let’s check the max, mean and median of the gScore column.

max(p18$gScore)
## [1] 971
mean(p18$gScore)
## [1] 113.0551
median(p18$gScore)
## [1] 65

Here are the top 30 (and ties) among pitchers with more than 20 starts, showing the gScore against the traditional W-L, ERA and strikeout stats. (I’ll keep updating these charts through the end of the season but here’s where we stand on Thursday morning.)

Top 30 MLB starters

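library(dplyr)       # select(), filter(), top_n(), arrange() and %>%
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

cy2018 <- p18 %>% select(2, 4, 5, 11, 6, 7, 9, 16, 23, 36) %>% 
  filter(GS > 20) %>% 
  top_n(30, gScore) %>% 
  arrange(-gScore)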

kable(cy2018, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg GS W L ERA IP SO gScore
Jacob deGrom NYM NL 32 10 9 1.70 217.00 269 971
Max Scherzer WSN NL 33 18 7 2.53 220.67 300 886
Justin Verlander HOU AL 34 16 9 2.52 214.00 290 831
Aaron Nola PHI NL 33 17 6 2.37 212.33 224 811
Gerrit Cole HOU AL 32 15 5 2.88 200.33 276 746
Blake Snell* TBR AL 31 21 5 1.89 180.67 221 734
Chris Sale* BOS AL 27 12 4 2.11 158.00 237 730
Patrick Corbin* ARI NL 33 11 7 3.15 200.00 246 726
Corey Kluber CLE AL 33 20 7 2.89 215.00 222 711
Trevor Bauer CLE AL 27 12 6 2.21 175.33 221 684
Miles Mikolas STL NL 32 18 4 2.83 200.67 146 614
Mike Foltynewicz ATL NL 31 13 10 2.85 183.00 202 607
Mike Clevinger CLE AL 32 13 8 3.02 200.00 207 606
Zack Greinke ARI NL 33 15 11 3.21 207.67 199 598
Carlos Carrasco CLE AL 30 17 10 3.38 192.00 231 591
Kyle Freeland* COL NL 33 17 7 2.85 202.33 173 589
Luis Severino NYY AL 32 19 8 3.39 191.33 220 588
Zack Wheeler NYM NL 29 12 7 3.31 182.33 179 572
Jameson Taillon PIT NL 32 14 10 3.20 191.00 179 548
Kyle Hendricks CHC NL 33 14 11 3.44 199.00 161 521
Clayton Kershaw* LAD NL 26 9 5 2.73 161.33 155 520
German Marquez COL NL 33 14 11 3.77 196.00 230 520
Charlie Morton HOU AL 30 15 3 3.13 167.00 201 518
Jose Berrios MIN AL 32 12 11 3.84 192.33 202 517
Walker Buehler LAD NL 23 8 5 2.62 137.33 151 510
Jhoulys Chacin MIL NL 35 15 8 3.50 192.67 156 507
Noah Syndergaard NYM NL 25 13 4 3.03 154.33 155 488
James Paxton* SEA AL 28 11 6 3.76 160.33 208 479
Trevor Williams PIT NL 31 14 10 3.11 170.67 126 466
Dallas Keuchel* HOU AL 34 12 11 3.74 204.67 153 459

*indicates left-handed pitcher

It’s probably no surprise that deGrom comes out on top. He’s among the league leaders in innings pitched and strikeouts. Despite having only 10 wins, he has a significant gScore lead over the Nationals’ Max Scherzer, the two-time defending NL Cy Young winner. Scherzer’s 300 strikeouts are offset by deGrom’s fewer runs allowed (48 for deGrom to 66 for Scherzer) and home runs allowed (only 10 all season to Scherzer’s 23).
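The trade-off between the two can be itemized with the figures quoted in this post. The sketch below weighs only the components the article gives for both pitchers (outs from innings pitched, strikeouts, runs and home runs); hits and walks are left out because they are not quoted here. The variable names are illustrative.

```r
# Point values per event in the modified Game Score formula.
weights <- c(outs = 2, so = 1, r = -3, hr = -6)

# Season totals quoted in the article (outs = innings pitched x 3).
degrom   <- c(outs = 651, so = 269, r = 48, hr = 10)
scherzer <- c(outs = 662, so = 300, r = 66, hr = 23)

# Positive values favour deGrom, negative values favour Scherzer.
edge <- weights * (degrom - scherzer)
print(edge)       # outs -22, so -31, r 54, hr 78
print(sum(edge))  # 79
```

Scherzer's extra outs and strikeouts are worth 53 points, while deGrom's edge in runs and home runs is worth 132. That accounts for 79 of the 85-point gap between 971 and 886, with the remainder coming from hits and walks.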

DeGrom’s score of 971 is the third-highest total in the last four seasons, bettered only by Clayton Kershaw (1,011 in 2015) and Jake Arrieta (998 in 2015).

The surprise is in the AL results. Snell finished sixth among all MLB pitchers and third in the AL. Snell passed Sale in the final week to push him to fourth. Cleveland’s Corey Kluber, who has also won 20 games now, is fifth in the AL. The runaway AL leader in gScore is the Astros’ Justin Verlander, followed by teammate Gerrit Cole.

Many observers seemed to rate Verlander as just the third- or fourth-best candidate in the AL race, although that appears to have changed in the final week as he’s become the solid second favourite to Snell. Verlander gets nicked for the 28 home runs he has allowed this season, the second-worst mark of his career (he allowed 30 HRs in 2016). But just like 2016, Verlander leads the AL in strikeouts. And like Kluber, he’s taken the ball every five days all season and pitched deep into games with a low hits-per-nine-innings rate.

Voters for the Cy Young Award traditionally favour power pitchers. Since the Cy Young came into being in 1956, a pitcher has won his league’s pitching triple crown (league leader in wins, ERA and strikeouts) 13 times and has been rewarded with the Cy Young every time. The last time it happened was 2011, in both leagues: Verlander with Detroit in the AL and the Dodgers’ Clayton Kershaw in the NL.

For fun, here’s a table of the top 30 relievers (and ties) who have made more than 50 appearances.

Top 30 relievers

rel2018 <- p18 %>% select(2, 4, 5, 10, 6, 15, 9, 16, 23, 36) %>% 
  filter(G > 50) %>% 
  top_n(30, gScore) %>% 
  arrange(-gScore)

kable(rel2018, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg G W SV ERA IP SO gScore
Blake Treinen OAK AL 68 9 38 0.78 80.33 100 400
Josh Hader* MIL NL 55 6 12 2.43 81.33 143 376
Edwin Diaz SEA AL 73 0 57 1.96 73.33 124 367
Jeremy Jeffress MIL NL 73 8 15 1.29 76.67 89 331
Seth Lugo NYM NL 54 3 3 2.66 101.33 103 331
Adam Ottavino COL NL 75 6 6 2.43 77.67 112 319
Collin McHugh HOU AL 58 6 0 1.99 72.33 94 306
Craig Stammen SDP NL 73 8 0 2.73 79.00 88 305
Jesse Chavez TOT MLB 62 5 5 2.55 95.33 92 300
Jared Hughes CIN NL 72 4 7 1.94 78.67 59 296
Jose Leclerc TEX AL 59 2 12 1.56 57.67 85 279
Taylor Rogers* MIN AL 72 1 2 2.63 68.33 75 277
Yusmeiro Petit OAK AL 74 7 0 3.00 93.00 76 272
Kirby Yates SDP NL 65 5 12 2.14 63.00 90 271
Chad Green NYY AL 63 8 0 2.50 75.67 94 270
Ryan Pressly TOT AL 77 2 2 2.54 71.00 101 270
Richard Rodriguez PIT NL 63 4 0 2.47 69.33 88 269
Dellin Betances NYY AL 66 4 4 2.70 66.67 115 267
Steve Cishek CHC NL 80 4 4 2.18 70.33 78 267
Jose Alvarado* TBR AL 70 1 8 2.39 64.00 80 253
Tony Watson* SFG NL 72 4 0 2.59 66.00 72 251
Craig Kimbrel BOS AL 63 5 42 2.74 62.33 96 247
Brad Hand* TOT MLB 69 2 32 2.75 72.00 106 246
Jeurys Familia TOT MLB 70 8 18 3.13 72.00 83 243
Reyes Moronta SFG NL 69 5 1 2.49 65.00 79 243
Seunghwan Oh TOT MLB 73 6 3 2.63 68.33 79 240
Felipe Vazquez* PIT NL 70 4 37 2.70 70.00 89 239
Lou Trivino OAK AL 69 8 4 2.92 74.00 82 238
Aroldis Chapman* NYY AL 55 3 32 2.45 51.33 93 236
Seranthony Dominguez PHI NL 53 2 16 2.95 58.00 74 233
David Robertson NYY AL 69 8 5 3.23 69.67 91 233
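The relievers table runs to 31 rows rather than 30 because two pitchers are tied at 233 for the final spot, and `top_n()` keeps every row tied at the cutoff. A minimal sketch with invented names and scores shows the behaviour. In dplyr 1.0.0 and later `top_n()` is superseded by `slice_max()`, which makes the tie handling explicit.

```r
library(dplyr)

# Toy data with invented names and scores; C and D tie at the cutoff.
toy <- data.frame(
  Name   = c("A", "B", "C", "D", "E"),
  gScore = c(300, 280, 250, 250, 200),
  stringsAsFactors = FALSE
)

# Asking for the top 3 returns 4 rows because C and D tie for third.
top3 <- toy %>% top_n(3, gScore) %>% arrange(desc(gScore))
print(top3)

# Same selection with slice_max(); with_ties = FALSE would cut it to 3 rows.
top3_new <- toy %>% slice_max(gScore, n = 3, with_ties = TRUE)
print(nrow(top3_new))
```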

How does this gScore measurement work for previous seasons? If we look at the last three seasons, the gScore leader matches four of the six winners. It missed the 2016 AL race, won by Boston’s Rick Porcello in a bit of a voting controversy over Verlander, and the 2015 NL vote, won by the Cubs’ Jake Arrieta in a tight three-way race with Kershaw and Dodgers teammate Zack Greinke. The tables below show the pitchers’ traditional stats along with their gScore and where they placed in the Cy Young voting.

2017 AL

cy2017AL <- p17 %>% select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>% 
  filter(Lg == "AL") %>% 
  top_n(10, gScore) %>% 
  arrange(-gScore)

kable(cy2017AL, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg CyVote W L ERA SO GS gScore
Corey Kluber CLE AL 1 18 4 2.25 265 29 839
Chris Sale BOS AL 2 17 8 2.90 308 32 815
Carlos Carrasco CLE AL 4 18 6 3.29 226 32 643
Luis Severino NYY AL 3 14 6 2.98 230 31 643
Justin Verlander TOT AL 5 15 8 3.36 219 33 569
Ervin Santana MIN AL 7 16 8 3.28 167 33 518
Chris Archer TBR AL NA 10 12 4.07 249 34 484
James Paxton SEA AL NA 12 5 2.98 156 24 477
Marcus Stroman TOR AL 8 13 9 3.09 164 33 472
Brad Peacock HOU AL NA 13 2 3.00 161 21 441

2017 NL

cy2017NL <- p17 %>% select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>% 
  filter(Lg == "NL") %>% 
  top_n(10, gScore) %>% 
  arrange(-gScore)

kable(cy2017NL, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg CyVote W L ERA SO GS gScore
Max Scherzer WSN NL 1 16 6 2.51 268 31 792
Stephen Strasburg WSN NL 3 15 4 2.52 204 28 657
Clayton Kershaw LAD NL 2 18 4 2.31 202 27 635
Zack Greinke ARI NL 4 17 7 3.20 215 32 605
Gio Gonzalez WSN NL 6 15 9 2.96 188 32 587
Jacob deGrom NYM NL 8 15 10 3.53 239 31 540
Robbie Ray ARI NL 7 15 5 2.89 218 28 507
Carlos Martinez STL NL NA 12 11 3.64 217 32 506
Alex Wood LAD NL 10 16 3 2.72 151 25 503
Jimmy Nelson MIL NL 9 12 6 3.49 199 29 492

2016 AL

cy2016AL <- p16 %>% select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>% 
  filter(Lg == "AL") %>% 
  top_n(10, gScore) %>% 
  arrange(-gScore)

kable(cy2016AL, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg CyVote W L ERA SO GS gScore
Justin Verlander DET AL 2 16 9 3.04 254 34 741
Chris Sale CHW AL 5 17 10 3.34 233 32 697
Corey Kluber CLE AL 3 18 9 3.14 227 32 685
Rick Porcello BOS AL 1 22 4 3.15 189 33 684
Jose Quintana CHW AL 11 13 12 3.20 181 32 585
Masahiro Tanaka NYY AL 8 14 4 3.07 165 31 576
Aaron Sanchez TOR AL 7 15 2 3.00 161 30 568
David Price BOS AL NA 17 9 3.99 228 35 556
J.A. Happ TOR AL 6 20 4 3.18 163 32 529
Cole Hamels TEX AL NA 15 5 3.32 200 32 487

2016 NL

cy2016NL <- p16 %>% select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>% 
  filter(Lg == "NL") %>% 
  top_n(10, gScore) %>% 
  arrange(-gScore)

kable(cy2016NL, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg CyVote W L ERA SO GS gScore
Max Scherzer WSN NL 1 20 7 2.96 284 34 795
Madison Bumgarner SFG NL 4 15 9 2.74 251 34 752
Johnny Cueto SFG NL 6 18 5 2.79 198 32 733
Clayton Kershaw LAD NL 5 12 4 1.69 172 21 709
Jon Lester CHC NL 2 19 5 2.44 197 32 704
Kyle Hendricks CHC NL 3 16 8 2.13 170 30 689
Jose Fernandez MIA NL 7 16 8 2.86 253 29 672
Noah Syndergaard NYM NL 8 14 9 2.60 218 30 649
Jake Arrieta CHC NL 9 18 8 3.10 190 31 634
Tanner Roark WSN NL 10 16 10 2.83 172 33 622

2015 AL

cy2015AL <- p15 %>% select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>% 
  filter(Lg == "AL") %>% 
  top_n(10, gScore) %>% 
  arrange(-gScore)

kable(cy2015AL, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg CyVote W L ERA SO GS gScore
Dallas Keuchel HOU AL 1 20 8 2.48 216 33 830
David Price TOT AL 2 18 5 2.45 225 32 761
Corey Kluber CLE AL 9 9 16 3.49 245 32 701
Chris Archer TBR AL 5 12 13 3.23 252 34 673
Chris Sale CHW AL 4 13 11 3.41 274 31 670
Sonny Gray OAK AL 3 14 7 2.73 169 31 652
Carlos Carrasco CLE AL 13 14 12 3.63 216 30 591
Jose Quintana CHW AL NA 9 10 3.36 177 32 552
Felix Hernandez SEA AL 7 18 9 3.53 191 31 547
Danny Salazar CLE AL NA 14 10 3.45 195 30 512

2015 NL

cy2015NL <- p15 %>% select(2, 4, 5, 37, 6, 7, 9, 23, 11, 36) %>% 
  filter(Lg == "NL") %>% 
  top_n(10, gScore) %>% 
  arrange(-gScore)

kable(cy2015NL, format = "html", table.attr = 'width="100%"') %>% kable_styling(position = "left")
Name Tm Lg CyVote W L ERA SO GS gScore
Clayton Kershaw LAD NL 3 16 7 2.13 301 33 1011
Jake Arrieta CHC NL 1 22 6 1.77 236 33 998
Zack Greinke LAD NL 2 19 3 1.66 200 32 947
Max Scherzer WSN NL 5 14 12 2.79 276 33 844
Madison Bumgarner SFG NL 6 18 9 2.93 234 32 759
Gerrit Cole PIT NL 4 19 8 2.60 202 32 717
Jacob deGrom NYM NL 7 14 8 2.54 205 30 704
Matt Harvey NYM NL NA 13 8 2.71 188 29 644
Jon Lester CHC NL NA 11 12 3.34 207 32 632
John Lackey STL NL 9 13 10 2.77 175 33 616
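The hit rate across these six races can be tallied directly from the top row of each table. This is a small sketch, with the leaders and their voting finishes copied by hand from the tables above rather than pulled from the `p15`, `p16` and `p17` data frames.

```r
# gScore leader in each race and his finish in the actual Cy Young voting,
# copied from the first row of each table.
leaders <- data.frame(
  race   = c("2017 AL", "2017 NL", "2016 AL", "2016 NL", "2015 AL", "2015 NL"),
  leader = c("Corey Kluber", "Max Scherzer", "Justin Verlander",
             "Max Scherzer", "Dallas Keuchel", "Clayton Kershaw"),
  CyVote = c(1, 1, 2, 1, 1, 3),
  stringsAsFactors = FALSE
)

hits   <- sum(leaders$CyVote == 1)            # races where the leader won
misses <- leaders$race[leaders$CyVote != 1]   # races where he did not

cat(hits, "of", nrow(leaders), "races matched\n")
print(misses)
```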

It’s really amazing to consider deGrom’s 2018 gScore when compared to the last three seasons. Since Kershaw, Arrieta and Greinke in 2015, no one had come close to crossing 900 points again.

Given that deGrom only has 10 wins, let’s look at one more historical comparison. In 2010, the year that started to orient thinking away from slavishly considering wins and losses, the Mariners’ Felix Hernandez (13-12, 2.27) won the AL Cy Young vote despite his mediocre record. If we examine the gScores in the AL for that season, he compiled 860 points, 150 more than the Angels’ Jered Weaver (13-12, 3.01) and 168 more than Verlander (18-9, 3.37). David Price (19-6, 2.72), who was seventh in gScore points that season, was the Cy Young runner-up, and CC Sabathia (21-7, 3.18) was third with a gScore of only 661. Weaver, who led the league in strikeouts by one over Hernandez, finished fifth.

This application of the Game Score formula seems to do a good job of pushing forward the best candidates and correctly matched four of the last six Cy Young winners. It hews closely to my own biases so I like it, but it certainly won’t satisfy all fans. It likes power pitchers with lots of strikeouts and might not reward a groundball pitcher with fewer Ks. It completely eliminates relievers from Cy Young consideration. But that’s how I like it.

Based on the prevailing wisdom, this measurement will probably split this year’s Cy Young races, correctly predicting deGrom and missing out on Snell. You can compare this gScore with other measures like ESPN’s Cy Young predictor formula and Tango’s predictor that takes wins into account. The BBWAA votes have to be in shortly after the end of the regular season. The results will be announced in November.