College Forecast: 2020 US Election Forecast

Estimating voting intention and predicting Electoral College votes for the Presidential Election

by Patrick English [@pme_politics]

The final headline forecast

Welcome to my forecasting model website! Here you can find my final forecast for the US 2020 Presidential Election, and see how it has changed over time. It's been a remarkably stable race, but while Biden is the overwhelming favourite, Trump still has a slight path to victory.

According to the forecasting model, the candidates have the following chances of winning the election on November 3rd:

Change over time

How has the forecast changed since it was launched on 28th September?

Voting intention

Below are the latest estimated voting intention figures from the polling model. This regression-based approach statistically models, rather than simply aggregates, polling data from all companies since the start of May in order to produce daily estimates. The figures below therefore account for polling house effects, sample size, and composition. See the bottom of the page for full methodology.

A forecast fit purely on this national polling average (using a Uniform National Swing calculation) generates the following chances of each candidate winning:

Electoral College Forecast

The Electoral College Forecast model combines the voting intention figures above with a separate model for swing states, using state polls, and works out vote intention across each of the 50 US states (plus the District of Columbia). The map below shows the central prediction of what the model believes will happen on Election Day (3rd November, 2020). There will be a lot of uncertainty around some of these predictions - particularly in the swing states. Check out the state-level forecasts below to see detailed information (including uncertainty) on each state race.

According to this forecast, the candidates are projected to win the following number of states:

State win figures are modal counts from the simulation (who wins the state most often across 8000 simulations) with 95% Bayesian credible intervals.

The full predicted range of Electoral College votes is as follows:

The central estimate of Electoral College vote wins is worked out slightly differently from most other forecasts. In this forecast, the central projection is drawn from the modal estimate of state wins across the 8000 simulations that the model runs, rather than a mean or median number of EC votes from the simulations. This means that the forecasted Electoral College votes won't change unless a state winner changes in the model, and the total is likely to move by around 20 votes (or more) in one go as a swing state flips between the two candidates.
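As an illustration only (this is not the R code behind the forecast), the difference between a modal-winner total and a mean total can be sketched in Python. The state names, Electoral College vote counts, and simulated winners below are all hypothetical toy values, and only 10 simulations are used instead of 8000.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: three made-up states and their Electoral College votes.
EC_VOTES = {"State A": 29, "State B": 20, "State C": 10}

# Hypothetical simulated winners, one entry per simulation
# (10 simulations here; the forecast itself uses 8000).
SIM_WINNERS = {
    "State A": ["Biden"] * 6 + ["Trump"] * 4,
    "State B": ["Trump"] * 7 + ["Biden"] * 3,
    "State C": ["Biden"] * 10,
}


def modal_ec_totals(sim_winners, ec_votes):
    """Give each state to its most frequent simulated winner, then sum EC votes."""
    totals = Counter()
    for state, winners in sim_winners.items():
        modal_winner = Counter(winners).most_common(1)[0][0]
        totals[modal_winner] += ec_votes[state]
    return dict(totals)


def mean_ec_total(sim_winners, ec_votes, candidate):
    """Average a candidate's EC votes over the individual simulations."""
    n_sims = len(next(iter(sim_winners.values())))
    per_sim = [
        sum(ec_votes[s] for s, w in sim_winners.items() if w[i] == candidate)
        for i in range(n_sims)
    ]
    return mean(per_sim)


modal_totals = modal_ec_totals(SIM_WINNERS, EC_VOTES)
biden_mean = mean_ec_total(SIM_WINNERS, EC_VOTES, "Biden")
```

In this toy case the modal total only changes if a state's most frequent winner changes, at which point it jumps by that state's full vote count, whereas the mean total drifts gradually as individual simulations change.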

Implied state voting intention forecast

Select a state from the 50 states (plus District of Columbia) to view the vote intention forecast for that specific location. The graph will show the central (median) predicted vote share for the candidates, and the 95% credible intervals surrounding these estimates (basically, a plausible range of values around the median where the vote shares could pretty reasonably also be).

According to the model forecast, the probabilities of each candidate winning the state are:

How has the forecast in your chosen state changed over time?

Keep in mind that any jumps in a state-level forecast could be to do with major model changes. See the notes below for dates and details on when these occurred.

Acknowledgements and thanks

I would like to credit Jack Bailey (University of Manchester) for authoring significant parts of this code - particularly the polling model. This model is very similar to the forecast we produced for the 2019 UK General Election. If this model does turn out to be any good, it's probably mostly his doing. I would also like to thank G. Elliott Morris (The Economist) for a whole bunch of sanity checking, advice giving, and idea bouncing. He was tremendously helpful in the process of putting all of this together, and I would not have been able to do it without him.

This application is hosted on a server machine at the Q-Step Centre, University of Exeter. I would like to thank Robert O'Neale, Travis Coan, and the Politics Department at the University of Exeter for their support in building and maintaining it.


Polling model - modelling voting intention

The polling model is estimated using only polling data and weakly informative priors. It produces 8000 simulations of potential voting intention, which are then averaged. The range figures presented are 95% credible intervals, and probabilities are drawn directly from the posterior distribution of vote share and state result estimations.

The model collects the latest available polling data from all US polling houses, as published on the national and state-level Wikipedia pages. Survey Monkey polls are not included.

The modelling process is more advanced than a simple aggregation. It uses a Bayesian Dirichlet regression model which fits a spline through time to model change in candidate support (reducing overall sensitivity to random variation in headline voting intention) and adds random effects for survey house (pollster). It then runs a separate model for each swing state, using state polling, to get more accurate predicted shares in these key battleground areas.

The model is fit in R, making heavy use of the brms, rstan, and Rcpp packages. The weakly informative priors cover all possible values of voting intention. The k-value of the spline is set high, and the prior for the correlation between group-level effects (i.e. pollster and time) is set low, to generate a credible amount of uncertainty.
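To make the summary figures concrete, here is a minimal Python sketch of how a median, a 95% credible interval, and a probability can be read off a set of simulated vote shares. It is not the brms model itself: the simulations are simply drawn from a fixed Dirichlet distribution, and the concentration parameters (`ALPHAS`), the seed, and the candidate labels are all assumptions made up for illustration.

```python
import random
from statistics import median, quantiles


def dirichlet_draw(alphas, rng):
    """Draw one set of vote shares (summing to 1) via normalised gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def summarise(draws):
    """Median and 95% credible interval (2.5% and 97.5% quantiles) of the draws."""
    cuts = quantiles(draws, n=40)  # 39 cut points in steps of 2.5%
    return {"median": median(draws), "lower": cuts[0], "upper": cuts[-1]}


# Hypothetical concentration parameters for (candidate A, candidate B, other).
# These are illustrative values only, not estimates from the forecast.
ALPHAS = [520.0, 440.0, 40.0]

rng = random.Random(2020)
sims = [dirichlet_draw(ALPHAS, rng) for _ in range(8000)]

cand_a = [s[0] for s in sims]
cand_b = [s[1] for s in sims]

summary_a = summarise(cand_a)
# Probability that A is ahead: the proportion of simulations in which A's share is larger.
p_a_leads = sum(a > b for a, b in zip(cand_a, cand_b)) / len(sims)
```

Because the probability is just a count over the same simulations that produce the interval, the point estimate, the uncertainty, and the win chance all stay consistent with one another.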

State forecast model - applying credible candidate vote shares

The state forecasting model picks up the national implied voting intention and calculates an implied 'swing' in each candidate's vote share from their (or their respective party's) recorded nationwide share in 2016, and applies it uniformly to all states to create new estimated voting intention. Then, for states where polling data are regularly and reliably available, these estimates are replaced with the actual modelled vote shares for the candidates (estimates are unique to each state, but correlated across election simulations). This gives a range of plausible vote shares for the candidates in each of the 50 states plus the District of Columbia. Then, using the 8000 estimated vote shares for the candidates in each state, winning probabilities are calculated as the proportion of simulated elections in which one candidate's share is greater than the other's. Point estimates, probability, and uncertainty and error within the model are therefore consistent and carried through from the initial estimation to the final output.
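The uniform swing step and the win probability count can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the forecast's own code: all vote shares (in percentage points) are hypothetical, the function names are invented for this example, and only four simulations are shown instead of 8000.

```python
def apply_uniform_swing(state_share_2016, national_share_2016, national_share_now):
    """Shift a state's 2016 share by the change in the national share since 2016."""
    swing = national_share_now - national_share_2016
    return state_share_2016 + swing


def win_probability(cand_sims, opp_sims):
    """Proportion of simulations in which the candidate's share beats the opponent's."""
    wins = sum(c > o for c, o in zip(cand_sims, opp_sims))
    return wins / len(cand_sims)


# Hypothetical 2016 results (percentage points), national and for one made-up state.
NATIONAL_2016 = {"dem": 48.0, "rep": 46.0}
STATE_2016 = {"dem": 47.0, "rep": 48.0}

# Hypothetical simulated national shares now, one pair per simulation.
NATIONAL_SIMS = {
    "dem": [52.0, 51.0, 49.0, 50.0],
    "rep": [43.0, 44.0, 47.0, 45.0],
}

# Apply the swing separately within each simulation, so that the state-level
# uncertainty is inherited from the national simulations.
state_sims = {
    party: [
        apply_uniform_swing(STATE_2016[party], NATIONAL_2016[party], share)
        for share in NATIONAL_SIMS[party]
    ]
    for party in ("dem", "rep")
}

p_dem_wins_state = win_probability(state_sims["dem"], state_sims["rep"])
```

For a state with its own polling model, the swung shares in `state_sims` would simply be replaced by that state's modelled simulations before the win probability is counted.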

Changes to the model - 6th October

On October 6th some significant changes were made to the forecast modelling process. The overall impact on the forecast was fairly small, but important. Namely, enough polling data had been gathered across the majority of swing states so that voting intention could be estimated in each state separately (with their own unique models). Furthermore, I added a simulated random walk of each candidate's median vote share in each state between the current estimate and Election Day. This gives more uncertainty to the Election Day estimate, and assumes that the race may narrow slightly by some amount (drawn from a normal distribution) as we move toward polling day, while also allowing some probability that it widens. Lastly, an adjustment was made to the collating of simulations to produce more effective correlation between candidate performances in the national model and swing state models. The overall impact of the changes was a slight improvement in Donald Trump's election chances. To account for the varying amount of polling information available across swing states, three tiers of state poll models are used, with varying amounts of uncertainty specified in the arguments. State polling is being constantly monitored to check whether further states could be estimated with their own model (beginning in the group with the least uncertainty specified in the model).
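The random walk idea can be illustrated with a short Python sketch. It is a simplification under stated assumptions rather than the forecast's implementation: it walks a candidate's lead (in percentage points) rather than each candidate's median vote share, and the daily step size, the narrowing parameters, the number of days, and the seed are all hypothetical.

```python
import random
from statistics import mean, pstdev


def simulate_final_leads(lead_today, days_left, daily_sd,
                         narrow_mean, narrow_sd, n_sims=8000, seed=1):
    """Simulate Election Day leads from today's lead.

    Each simulation takes one normally distributed step per remaining day
    (the random walk), then subtracts a narrowing amount drawn from a normal
    distribution. A negative narrowing draw means the race widens instead.
    """
    rng = random.Random(seed)
    direction = 1.0 if lead_today >= 0 else -1.0
    finals = []
    for _ in range(n_sims):
        lead = lead_today
        for _ in range(days_left):
            lead += rng.gauss(0.0, daily_sd)
        narrowing = rng.gauss(narrow_mean, narrow_sd)
        finals.append(lead - direction * narrowing)
    return finals


# Hypothetical inputs: a 6-point lead, 28 days to go, 0.3-point daily steps,
# and an expected narrowing of 1 point (standard deviation 1 point).
final_leads = simulate_final_leads(
    lead_today=6.0, days_left=28, daily_sd=0.3, narrow_mean=1.0, narrow_sd=1.0
)

expected_lead = mean(final_leads)
lead_spread = pstdev(final_leads)
p_still_ahead = sum(lead > 0 for lead in final_leads) / len(final_leads)
```

The more days that remain, the wider the spread of simulated Election Day leads becomes, which is how a walk of this kind adds uncertainty on top of the current estimate.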

Changes to the model - 20th October

On October 20th some further changes were made to the forecast modelling process. Namely, Nevada, Colorado, and New Hampshire were each moved out of the national swing estimation and into their own forecasting models. The overall impact was to add a little more probability to the overall forecast in Trump's favour - driven particularly by a sudden change in probabilities in Nevada.

Changes to the model - Final forecast

In the week leading up to the final forecast, the 'random walk' simulation of polling movement was tuned down (to about half its original size), and replaced by a generic 'noise' element in the final vote share estimations (to avoid the uncertainty intervals passing below a typical margin of error for polling forecasts). These changes were made to reflect the high number of early voters, and the lack of time for a dramatic polling turnaround.

Questions, Queries, Contact

For any questions or queries about the model, or if you'd like to use any of the predictions or forecasts in your own work, publications, or presentations, please just reach out to me on Twitter (@PME_Politics), or via email (p.english[AT]