Can the Phillies really turn it around? A statistical defense of the run differential argument
If you’ve been following our podcasts for the last few months, you’ve heard our spring optimism fade into depressed summer resignation: The Phillies aren’t good, won’t get better, and should start to sacrifice their immediate assets in hopes of building a brighter (long-term) future.
I routinely cited their dismal run differential as evidence that their true talent level was far below their middling record, and suggested that it was nearly inconceivable that they could continue to hold on to their mediocrity, much less make a legitimate run at winning the division.
But last week I read an article by Joecatz of The Good Phight that piqued my interest and had me challenging my assumptions. If you didn’t click over there, here are some of the money quotes:
- At the 90 game mark, the 2012 Phillies had a run differential of -23. Over the remainder of the season, without Hunter Pence and Shane Victorino, and later Joe Blanton, They went 42-30 with a run differential of +29. Must be an anomaly, right?
- Well, of the remaining 8 teams that at the 90 game point had – run differentials, 4 of those teams (5 total) went on to have positive run differentials the rest of the way. The Phillies +29 was the lowest total of any of those teams.
- There were 7 teams with positive run differentials in the first 90 games of 2012. 3 of those teams showed a negative run differential the rest of the way
The reason run differential swings so dramatically at the mid point of the season, league wide, is because of the trade deadline. Teams change. Rosters change, people change.
— Joecatz, TheGoodPhight.com
Joe makes some valid points here. But while Joe was mostly arguing that using run differential to predict 2nd-half results with certainty was ill-advised (and I’m never, ever a proponent of certainty), I didn’t really believe that you could just write off run differential as a midseason predictor of 2nd-half results either. And so, the following…
In order to test what level of predictability 1st-half run differential had for 2nd-half performance, I collected isolated before-and-after standings for July 4th of the last ten full seasons (’03-’12). I chose ten seasons because it gave me a nice round sample of 300 team-seasons to work with and arbitrarily chose to make the split at July 4th because the holiday tends to fall right around the midseason mark. Also, the Phillies have played 11 games since the 4th this season, so it allowed me a small sample to work with for this year.
For the plot above, I calculated the run-differential per game ((Runs Scored – Runs Against)/Games Played) for each team in the first half and plotted it against the same statistic for the second half. I then added a line of best fit to help illustrate the relationship between the two. I’ve written in the correlation coefficient, R-squared value, and p-values for stat nerds like me, but I’ll try to keep this in accessible terms for the most part.
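For readers who want to reproduce this kind of plot, here is a minimal sketch of the two steps described above: computing run differential per game and fitting a least-squares line. This is not the code behind the figure. The function names and the four sample team-halves are made up for illustration, and the p-value is left out to keep the sketch dependency-free.

```python
from math import sqrt

def run_diff_per_game(runs_scored, runs_allowed, games):
    """(Runs Scored - Runs Against) / Games Played."""
    return (runs_scored - runs_allowed) / games

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical team-halves, NOT real data: (runs scored, runs allowed, games)
first_half = [(450, 380, 85), (400, 410, 84), (350, 420, 86), (390, 385, 85)]
second_half = [(380, 350, 77), (370, 365, 78), (330, 370, 76), (360, 362, 77)]

xs = [run_diff_per_game(*team) for team in first_half]
ys = [run_diff_per_game(*team) for team in second_half]
slope, intercept, r = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f} r^2={r * r:.3f}")
```

With real data, `xs` and `ys` would each hold 300 values, one per team-season.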
The results here are pretty intuitive. Most of the teams are grouped in either the top-right or lower-left quadrants of the plot. The top-right are teams that had a positive run differential in the 1st half, and followed it up with a positive run differential in the 2nd half as well. The lower-left had negative run-differentials all around.
This makes sense. Most good teams stay good and most bad teams stay bad. Nothing revolutionary here. It fits logically that the two halves of the season are positively correlated.
The line of best fit (the line that most closely approximates the relationship between every x-y coordinate) goes through the point (0,0). This makes sense because the league average run differential every year is always zero. League-wide, runs are zero-sum. But it’s interesting that the slope of the line of best fit is shallow. In other words, if the last ten seasons are any indication, on average a team can be expected to move towards league average (0) from their 1st-half run differential. The equation for the line of best fit is y = .608x, so a theoretical team that posts a -1 run differential per game in the first half could historically be expected to record a -.608 run differential per game in the second half.
This is good news for the Phillies, because it means on average they are historically unlikely to record a run differential as poor as the -.5 runs per game that they recorded before July 4 of this season.
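To make the arithmetic concrete, here is a small sketch that plugs numbers into the y = .608x line quoted above. The function name is hypothetical; the slope and the two inputs come straight from the article. Keep in mind this is the historical average outcome, and individual teams scatter widely around it.

```python
SLOPE = 0.608  # slope of the article's line of best fit, y = .608x

def expected_second_half_rd(first_half_rd_per_game, slope=SLOPE):
    """Average 2nd-half run differential per game implied by the fitted line."""
    return slope * first_half_rd_per_game

theoretical = expected_second_half_rd(-1.0)  # the theoretical team above
phillies = expected_second_half_rd(-0.5)     # 2013 Phillies before July 4
print(round(theoretical, 3), round(phillies, 3))
```

For the Phillies, the line works out to roughly -.3 runs per game: still negative, but noticeably better than their -.5 first half.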
But how likely is it that they will switch to a positive differential for the whole second half?
| Run Diff/Game | 1st Half + | Stay + to + | Flip + to – |
|---|---|---|---|
| Avg 1st Half RD/G | 0.49 | 0.55 | 0.34 |
| Avg 2nd Half RD/G | 0.30 | 0.69 | -0.57 |

| Run Diff/Game | 1st Half – | Stay – to – | Flip – to + |
|---|---|---|---|
| Avg 1st Half RD/G | -0.59 | -0.70 | -0.28 |
| Avg 2nd Half RD/G | -0.36 | -0.74 | 0.60 |
The tables above split the 300 team-seasons into positive or negative 1st-half run differentials. From there, each group is broken down by its 2nd-half result.
Over the course of the ten seasons I studied, just under 30% of MLB teams switched from positive to negative or vice versa in the second half. Interestingly, 31.1% of positive 1st-half teams flipped, while just 27.9% of negative 1st-half teams went positive.
The teams that stayed on the same side tended to get just a little bit better if they were positive or a little bit worse if they were negative, but teams that flipped had much more dramatic results. Those teams recorded, on average, a swing of about .9 runs/game.
It is important to note, however, that teams who stayed positive in run differential for both halves also tended to have higher 1st-half averages than those that ended up flipping, and the same was true in the negative for teams that remained sub-par. Those that flipped were more likely to be hovering nearer to 0 at the halfway point.
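Here is a rough sketch of how the stay/flip grouping could be done, assuming each team-season is a pair of run differentials per game (1st half, 2nd half). The function name, the labels, and the five sample pairs are made up for illustration. Treating a differential of exactly 0 as "not positive" is also an assumption, since the article doesn’t say how ties with zero were handled.

```python
from collections import Counter

def classify(first_rd, second_rd):
    """Label a team-season by the sign of its run differential in each half.

    Assumption: a differential of exactly 0 counts as not positive.
    """
    first_pos = first_rd > 0
    second_pos = second_rd > 0
    if first_pos:
        return "stay +" if second_pos else "flip + to -"
    return "flip - to +" if second_pos else "stay -"

# Hypothetical (1st-half RD/G, 2nd-half RD/G) pairs, NOT the real data set
sample = [(0.55, 0.69), (0.34, -0.57), (-0.70, -0.74), (-0.20, 0.60), (0.80, 0.40)]
counts = Counter(classify(first, second) for first, second in sample)

positive_starts = counts["stay +"] + counts["flip + to -"]
flip_share = counts["flip + to -"] / positive_starts
print(counts, f"share of + teams that flipped: {flip_share:.1%}")

# Swing for the "Flip + to -" group, using the group averages from the table
swing = abs(-0.57 - 0.34)
print(f"average swing: {swing:.2f} runs/game")
```

The last two lines show where the "about .9 runs/game" figure comes from: the positive-to-negative flippers went from an average of 0.34 to -0.57.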
But why is it that a greater proportion of positive differential teams swap than negative?
The graphs above represent the distributions of average run differential per game before and after July 4th from 2003-2012.
It’s noticeable that both plots skew slightly negative. That is, the worst teams scored further below zero than the best teams scored above zero. This is true for both plots, but is more pronounced in the 1st-half distribution, Figure 2.
This might help explain why a smaller proportion of teams with negative 1st halves made the jump to the other side: some teams on the negative side were completely and utterly hopeless. While there is no such thing as a sure winner, it appears that MLB has definitely had some teams that were destined to lose. With a built-in group of teams that were almost certain to fail in both the first and second halves (early-decade Nats, Padres, pre-McCutchen Pirates, Royals, etc. come to mind), this would certainly reduce the proportion of teams that could potentially flip from negative to positive.
Likewise, Figure 1 shows a number of teams that are hovering just barely above average. It’s possible that these teams are sitting at a slightly positive mark but are in fact league average in talent… they just happened to have schedules that featured some of the aforementioned terrible, awful teams.
This is all well and good, but we still haven’t quite tackled which teams are likely to switch and which are likely to stay on the same side of average. So in order to break things down a little further, and knowing that more teams tended to start the year with a plus differential because some really awful teams skewed the distribution of average run differential per game, I ranked each team within its given half-season in terms of run differential per game.
|RunDiff/Game Rank||1st Half||2nd Half||2nd Half|
|Rank||#1-10||Stay +||Flip –|
|Rank||#11-15||Stay +||Flip –|
|Rank||#16-20||Stay –||Flip +|
|Rank||#21-30||Stay –||Flip +|
This chart is slightly different than the first one, in that I am not marking whether or not teams remained within their rank set during the second half, but simply whether or not they remained in the same half of the chart. For example, if a team that ranked #2 in run differential per game at the halfway point finished at #13 for the second half, they are still counted as a +.
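The ranking approach can be sketched like this, assuming a 30-team league where ranks 1-15 count as the top half. The function names are hypothetical, and breaking ties by input order is an assumption, since the article doesn’t describe how ties were ranked.

```python
def rank_teams(rd_by_team):
    """Return {team: rank}, where rank 1 is the best run differential per game.

    Assumption: ties keep their input order (Python's sort is stable).
    """
    ordered = sorted(rd_by_team, key=rd_by_team.get, reverse=True)
    return {team: position + 1 for position, team in enumerate(ordered)}

def stayed_in_same_half(first_rank, second_rank, n_teams=30):
    """True if a team finished both halves on the same side of the midpoint."""
    cutoff = n_teams // 2  # ranks 1..15 are the top half in a 30-team league
    return (first_rank <= cutoff) == (second_rank <= cutoff)

# Hypothetical three-team league, just to show the ranking step
ranks = rank_teams({"A": 0.5, "B": -0.2, "C": 0.1})
print(ranks)

# The example from the text: #2 in the 1st half, #13 in the 2nd still counts as a +
print(stayed_in_same_half(2, 13))
# A team ranked 23rd that finishes 15th has flipped into the top half
print(stayed_in_same_half(23, 15))
```

Ranking within each half-season sidesteps the skew problem: exactly half the league is in the top half by definition, no matter how bad the worst teams are.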
This is where this whole breakdown really comes to a head for the Phillies. Before July 4th of the 2013 season, the Phillies were ranked 23rd in the MLB in run differential per game with -.500 runs per game. That puts them in the bottom 3rd of the rankings, and historically over the last ten seasons, only 18% of teams in the bottom 3rd have been able to rebound with a second half that was ranked 15th or better. This does not bode well for the remainder of the season, although it shows that a big rebound is far from impossible.
Oddly enough, teams in the middle 3rd were more likely to switch from bottom half to top or vice versa. 57% of total teams in the middle 3rd at the season’s halfway mark switched. This makes a bit of sense, really, because these are teams whose true talent lies somewhere around the league average. These teams are liable to have placed slightly above or below average due to the random chance inherent in baseball (slumps and hot streaks, which sometimes do but generally do not last for a full 162 games), and thus are excellent candidates for regression.
The Phillies beat their Pythagorean Win Expectation by a very significant margin in the 1st half this year. Some have argued that because they beat their Pythagorean Win% in the first half, they are likely to regress and lose more games than their PythagW% in the second half.
Over the course of my ten-season data set, teams that bested their PythagW% in the first half bested it again in the 2nd half 49% of the time. Likewise, teams that did worse than their PythagW% in the first half also did worse in the 2nd half 55% of the time. In total, 53% of teams stayed on the same side of luck, and 47% switched. So there is essentially no effect. Regardless of their favorable 1st-half result, the Phillies are about as likely to finish the 2nd half as a lucky team as an unlucky one.
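For anyone unfamiliar with Pythagorean Win Expectation, here is a minimal sketch. It assumes the classic Bill James form with an exponent of 2; some sites use an exponent closer to 1.83, and the article doesn’t say which version was used for the data set. The function names are hypothetical. The inputs are the Phillies’ 11 games since July 4: 50 runs scored, 47 allowed, and an 8-3 record.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win expectation: RS^e / (RS^e + RA^e).

    Assumption: exponent defaults to 2, the classic Bill James value.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def luck(wins, games, runs_scored, runs_allowed, exponent=2.0):
    """Actual wins minus Pythagorean expected wins (positive = 'lucky')."""
    expected_wins = games * pythag_win_pct(runs_scored, runs_allowed, exponent)
    return wins - expected_wins

# Phillies since July 4: 8-3, 50 runs scored, 47 allowed
expected_pct = pythag_win_pct(50, 47)
extra_wins = luck(wins=8, games=11, runs_scored=50, runs_allowed=47)
print(f"PythagW% = {expected_pct:.3f}, wins above expectation = {extra_wins:.2f}")
```

A team "beats" its PythagW% when the second number is positive, which is what the 49% and 55% figures above are counting.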
What you take from all this really depends on your level of optimism.
So far after July 4th this season, the Phillies are 8-3 in 11 games, scoring 50 runs to 47 against. Their 13-8 win against the Mets on July 19 put them above a .500 record for the 2nd time all season, and the first time since June 6th. At this point, it seems entirely possible that they are one of the lucky 18% of teams that historically find their way out of the bottom 3rd of the league in 1st-half run differential and into the top half. The scatterplot in Figure 1 indicates that there are several examples of teams with a similar run differential per game in the first half rebounding to have highly positive results for the 2nd. But there are also far more teams in that range that continued to play poorly.
I, for one, am using this analysis as one piece of my argument that the Phillies should just trade everybody. If they were getting this result with a young, up-and-coming team, I’d be far more likely to stay patient and pray that the early 2nd half returns are indicative of a turnaround. But this team is still very old, still overpaid, and I will continue to argue that trading away their aging, expensive pieces while they still have value is the best course of action.
Then again, that’s assuming that the Phillies’ ownership group isn’t willing to spend infinity money on this roster (or Xfinity money, for that matter). If they plan to just buy a whole new group of talent in a year or two, then perhaps allowing the players currently on the roster to remain at their current salaries is a smarter option.
But if history is any indication, this team has about a 1 in 5 chance of rebounding in the scoring department over the 2nd-half, and on top of that they still have to jump Atlanta just to get to the playoffs.
Here’s one last tidbit before I sign off: Over the past ten years, no World Series winner has ranked lower than 14th in 1st-half run differential per game.
Does that mean that if the Phillies make the playoffs they can’t win? No. But it would make them a pretty incredible outlier. All things considered, with all the probabilities against them, I’m still advocating to sell.
If you think I’ve come to some crappy conclusions, here is my data set. Have at it! Feel free to write your own article about how wrong I am. If there’s ever been a topic I’d like to be wrong on, it’s certainly this one.