Blog Post 6: The Ground Game

First, I created the variable for turnout, dividing the combined votes for Democrats and Republicans by the citizen voting age population (CVAP). The CVAP data available to us is from 2012-2020. Comparing turnout to vote share, there is almost a null relationship to note. For Democratic vote share, the correlation is 0.002, indicating that turnout has an extremely weak positive effect on votes. The adjusted R^2 is -0.003, demonstrating the model has such low predictive power that the average is more predictive. For Republican vote share, the correlation was the exact inverse of the Democratic model, at -0.002. The adjusted R^2 was also -0.003. This is interesting given the inverse relationship, although that makes sense as the model is just using two party vote share as the independent variable and the turnout is the same for each party.

When I subset the data to only include midterm years - 2014 and 2018 - the relationship between vote share and turnout becomes slightly stronger but not by much. The correlation for Democratic vote share is 0.092, so for Democrats, an increase in turnout by about 1% is associated with an increase in vote share by about 0.1%. The R^2 also did increase but only to 0.025. The correlation for Republican vote share is -0.092, showing the same inverse relationship as seen when using data from all election years.

## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                       dem_votes_major_percent  
## -----------------------------------------------
## turnout                        0.002           
##                               (0.025)          
##                                                
## Constant                     48.635***         
##                               (1.436)          
##                                                
## -----------------------------------------------
## Observations                    395            
## R2                            0.00001          
## Adjusted R2                   -0.003           
## Residual Std. Error      5.967 (df = 393)      
## F Statistic             0.005 (df = 1; 393)    
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                       dem_votes_major_percent  
## -----------------------------------------------
## turnout                       0.092**          
##                               (0.040)          
##                                                
## Constant                     44.603***         
##                               (1.995)          
##                                                
## -----------------------------------------------
## Observations                    168            
## R2                             0.031           
## Adjusted R2                    0.025           
## Residual Std. Error      5.293 (df = 166)      
## F Statistic            5.251** (df = 1; 166)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                       rep_votes_major_percent  
## -----------------------------------------------
## turnout                       -0.002           
##                               (0.025)          
##                                                
## Constant                     51.365***         
##                               (1.436)          
##                                                
## -----------------------------------------------
## Observations                    395            
## R2                            0.00001          
## Adjusted R2                   -0.003           
## Residual Std. Error      5.967 (df = 393)      
## F Statistic             0.005 (df = 1; 393)    
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                       rep_votes_major_percent  
## -----------------------------------------------
## turnout                      -0.092**          
##                               (0.040)          
##                                                
## Constant                     55.397***         
##                               (1.995)          
##                                                
## -----------------------------------------------
## Observations                    168            
## R2                             0.031           
## Adjusted R2                    0.025           
## Residual Std. Error      5.293 (df = 166)      
## F Statistic            5.251** (df = 1; 166)   
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

To continue making my predictive model, I added incumbency and expert predictions to my data. Interestingly, incumbency did not hold as a strong predictor for Democratic (correlation of 0.108) or Republican vote share, even having a weak negative correlation with Republican vote share (correlation -0.108). This may be a marker that incumbents and those of the president’s party are punished at the ballot box during midterm years.

## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                       dem_votes_major_percent  
## -----------------------------------------------
## incumbency                     0.108           
##                               (0.458)          
##                                                
## avg_rating                   -2.662***         
##                               (0.112)          
##                                                
## turnout                      0.105***          
##                               (0.019)          
##                                                
## Constant                     55.544***         
##                               (1.118)          
##                                                
## -----------------------------------------------
## Observations                    168            
## R2                             0.782           
## Adjusted R2                    0.778           
## Residual Std. Error      2.526 (df = 164)      
## F Statistic          195.932*** (df = 3; 164)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
## [1] 48.61066
## [1] 56.14127
## [1] 51.97428
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                       rep_votes_major_percent  
## -----------------------------------------------
## incumbency                    -0.108           
##                               (0.458)          
##                                                
## avg_rating                   2.662***          
##                               (0.112)          
##                                                
## turnout                      -0.105***         
##                               (0.019)          
##                                                
## Constant                     44.456***         
##                               (1.118)          
##                                                
## -----------------------------------------------
## Observations                    168            
## R2                             0.782           
## Adjusted R2                    0.778           
## Residual Std. Error      2.526 (df = 164)      
## F Statistic          195.932*** (df = 3; 164)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
## [1] 48.13327

For the 2022 data, I averaged the turnout from the 2014 and 2018 midterms, which was 48.61%. Just to compare, the average turnout rate for all the data was 56.14%, demonstrating the significant drop in turnout that occurs during midterm years. For the expert rating, I pulled the Generic Ballot from 538 on October 18, 2022, which was 45.4% for Democrats and 44.9% for Republicans. For incumbency, I used the president’s party as a proxy, so 1 for Democrats and 0 for Republicans.

With this data, my model predicts Democrats will win 51.97% of the vote share and Republicans will win 48.13%.

I also created a binomial logical prediction for my model, using the same variables: incumbency, turnout, and expert rating. The p values for the variables in both models are all less than 0.05, indicating that they are statistically significant predictor variables. McFadden’s R^2 values (0.939 for Republican vote share and 0.917 for Democratic vote share) were both very high which would indicate that the model fits the data pretty well. However, the coefficient estimates are very low, indicating small and weak relationships. Creating a prediction, the Democratic probability was 0.248 and the Republican probability was 0.23, meaning that there is some likelihood of Democrats winning the two party vote share.

## 
## Call:
## glm(formula = cbind(rep_votes, cvap - rep_votes) ~ avg_rating + 
##     incumbency + turnout, family = binomial, data = combined1)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -170.884   -14.698     0.269    16.738   121.941  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -2.817e+00  9.582e-04 -2939.9   <2e-16 ***
## avg_rating   8.406e-02  8.866e-05   948.2   <2e-16 ***
## incumbency  -1.370e-02  3.715e-04   -36.9   <2e-16 ***
## turnout      2.739e-02  1.379e-05  1985.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 5045262  on 394  degrees of freedom
## Residual deviance:  305106  on 391  degrees of freedom
## AIC: 310406
## 
## Number of Fisher Scoring iterations: 3
## [1] 0.9395263
## [1] 0.2300117
## 
## Call:
## glm(formula = cbind(dem_votes, cvap - dem_votes) ~ avg_rating + 
##     incumbency + turnout, family = binomial, data = combined1)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -140.415   -20.022    -0.674    20.496   153.812  
## 
## Coefficients:
##               Estimate Std. Error   z value Pr(>|z|)    
## (Intercept) -2.020e+00  9.382e-04 -2153.554   <2e-16 ***
## avg_rating  -8.030e-02  8.884e-05  -903.909   <2e-16 ***
## incumbency   8.830e-04  3.780e-04     2.336   0.0195 *  
## turnout      2.424e-02  1.375e-05  1763.036   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 4847834  on 394  degrees of freedom
## Residual deviance:  402277  on 391  degrees of freedom
## AIC: 407565
## 
## Number of Fisher Scoring iterations: 3
## [1] 0.9170192
## [1] 0.2486253

Overall, these models and correlations demonstrate that there does not appear to be a clear relationship between turnout and vote share, for either Democrats or Republicans. However, given the sample size of just two years of midterms, and some lack of data for different districts within those years, this may be in part due to the data errors. This also may be because both campaigns run strong canvassing operations, almost nullifying the effect across one another. Given the significant increase in turnout in the 2018 midterm as a response to Trump’s 2020 election, that may also have skewed the data, so it will be interesting to see what occurs this cycle.

References: Enos, R., & Fowler, A. (2018). Aggregate Effects of Large-Scale Campaigns on Voter Turnout. Political Science Research and Methods, 6(4), 733-751. doi:10.1017/psrm.2016.21