
How Did Negro League Teams Fare Against Major League Teams?

Run-Scoring Across Decades
Were more runs scored per game in the Negro Leagues in the 1920s than in the 1930s? What about the 1940s? Did the pattern differ from the MLB?
One-Way ANOVA
Click here to download the data set used for the following research.
​
To answer this question, I performed a One-Way ANOVA. The factor was the decade, which had three levels: the 1920s, 1930s, and 1940s. The response variable was runs per game. Every team that played at least 40 games in a single year between 1920 and 1948 was eligible for inclusion in the data set. The same team in different years was treated as separate data points. For example, the 1941 Kansas City Monarchs and the 1946 Kansas City Monarchs were considered separately. The eligible team-seasons were separated into the three levels, and 51 from each level were randomly selected. An ANOVA was then performed. Somewhat surprisingly, there was no significant difference among the three decades.
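The selection procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used for the original analysis: the function names, the tuple layout, and the fixed random seed are all assumptions, and the demo rows are synthetic placeholders rather than real statistics.

```python
import random
from collections import defaultdict


def decade_of(year):
    """Map a season year to its decade label, e.g. 1941 -> '1940s'."""
    return f"{year // 10 * 10}s"


def sample_team_seasons(team_seasons, per_decade=51, min_games=40, seed=0):
    """Draw an equal-sized random sample of eligible team-seasons per decade.

    team_seasons: iterable of (team, year, games, runs_per_game) tuples
    (a hypothetical layout). Raises ValueError if a decade has fewer than
    `per_decade` eligible rows.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_decade = defaultdict(list)
    for team, year, games, rpg in team_seasons:
        # Eligibility rule from the text: at least 40 games, 1920 to 1948.
        if 1920 <= year <= 1948 and games >= min_games:
            by_decade[decade_of(year)].append((team, year, rpg))
    # Sampling without replacement keeps each team-season unique.
    return {d: rng.sample(rows, per_decade) for d, rows in by_decade.items()}


# Synthetic placeholder rows (NOT real data): 10 made-up teams per season.
demo_rng = random.Random(1)
demo_rows = [
    (f"Team {i}", year, 50, round(demo_rng.uniform(4.0, 6.5), 2))
    for year in range(1920, 1949)
    for i in range(10)
]
demo_sample = sample_team_seasons(demo_rows)
print({decade: len(rows) for decade, rows in sorted(demo_sample.items())})
```

Treating each (team, year) pair as its own row is what lets the 1941 and 1946 Monarchs enter the pool as two separate observations.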
​
The assumptions of ANOVA are met. Independence holds, as each team-season appears exactly once, in exactly one of the groups. A Shapiro-Wilk test on the dependent variable, runs per game, yields a p value of 0.2744, so the data do not differ significantly from a normal distribution. The Scale-Location plot shows a flat line, indicating that the variance of the error is the same no matter which group you are looking at.
​
The ANOVA yielded an F value of 0.6156 and a p value of 0.5417.
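As a sanity check, the reported p value can be recovered from the reported F value. The sketch below assumes the balanced design described earlier (3 decades with 51 team-seasons each), which gives 2 and 150 degrees of freedom. It relies on the fact that the upper-tail probability of an F distribution has a simple closed form when the numerator has exactly 2 degrees of freedom. The function names are my own, and `one_way_anova` is a from-scratch illustration of how the F statistic is built, not the software used for the original analysis.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def p_value_two_numerator_df(f_stat, df_within):
    """Upper-tail F probability; this closed form is valid only when df_between == 2."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# Degrees of freedom implied by 3 decades x 51 team-seasons (assumed balanced).
df_between = 3 - 1
df_within = 3 * 51 - 3
p = p_value_two_numerator_df(0.6156, df_within)
print(df_between, df_within, round(p, 4))  # 2 150 0.5417
```

The result agrees with the reported p value to four decimal places, which is consistent with 51 team-seasons per decade having been used.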
​
The mean runs scored per game came out to 5.3700 for the 1920s, 5.4116 for the 1930s, and 5.2153 for the 1940s. It is unnecessary to perform inference on any of the pairwise differences between groups, as none of them are significant, or even close to significant.
​
While this is interesting on its own, it can also be put in the context of what was happening in the MLB at the time. The 1920s were a very high-scoring period in MLB history, yet MLB teams still averaged fewer runs per game than Negro League teams did. Over at baseballpastandpresent.com, the average number of runs scored by a team per game was calculated for each season from 1876 to 2014. Averaging the 10 values for the 1920s comes out to 4.809 runs per game. The 1930s yield an average of 4.933 runs per game, and 1940 to 1948 saw an average of 4.2550 runs per game.
​
The Negro Leagues appear to have had a higher run-scoring environment than the Major Leagues, although further research could perform a Two-Way ANOVA to verify this. The factors in such an ANOVA would be decade and league (MLB vs Negro Leagues). It also appears that there was a substantial drop-off in runs scored in the MLB during the 1940s. A One-Way ANOVA for the MLB could be used to verify this. According to The New Bill James Historical Baseball Abstract, the quality of baseballs used in the MLB was worse during World War II, which is a possible reason why offense was down in the 1940s.
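For reference, the Two-Way ANOVA suggested above could be written as the following model. This is a sketch in standard textbook notation. The symbols are my own and do not come from any analysis that was actually run.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}
```

Here y_{ijk} is runs per game for team-season k in decade i and league j, \mu is the overall mean, \alpha_i is the decade effect (i = 1, 2, 3), \beta_j is the league effect (j = 1, 2), (\alpha\beta)_{ij} is the decade-by-league interaction, and \varepsilon_{ijk} is the error term. A significant league effect would support the claim that the Negro Leagues were the higher-scoring environment, while a significant interaction would indicate that the decade-to-decade pattern, such as the MLB's 1940s drop-off, differed between the two leagues.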
​
​​​
​

