2026 Midterm Elections Forecast

Step 1: Data Collection

Polling data is fetched daily from the VoteHub API. Generic ballot polls and presidential approval polls from the last 180 days are collected. Each poll is weighted by pollster quality (using FiveThirtyEight-style grades: A+ = 3.0, A = 2.7, down to C = 1.0) and recency.
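
The weighting can be sketched roughly as follows. This is a minimal illustration, not the project's code: the source names only the grade weights for A+, A, and C and does not say how recency is measured, so the exponential decay and its `HALF_LIFE_DAYS` value are assumptions, and the function names are hypothetical.

```python
from datetime import date

# Grade weights quoted in the text. Intermediate grades are not listed there,
# so they are deliberately left out of this sketch.
GRADE_WEIGHT = {"A+": 3.0, "A": 2.7, "C": 1.0}

WINDOW_DAYS = 180       # collection window from the text
HALF_LIFE_DAYS = 30.0   # ASSUMPTION: recency half-life, not given in the source


def poll_weight(grade: str, end_date: date, today: date) -> float:
    """Return quality x recency weight, or 0.0 for polls outside the window."""
    age_days = (today - end_date).days
    if age_days < 0 or age_days > WINDOW_DAYS:
        return 0.0
    recency = 0.5 ** (age_days / HALF_LIFE_DAYS)  # assumed exponential decay
    return GRADE_WEIGHT[grade] * recency


def weighted_average(polls: list[tuple[float, str, date]], today: date) -> float:
    """Weighted mean of poll margins given (margin, grade, end_date) tuples."""
    weights = [poll_weight(grade, end, today) for _, grade, end in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polls inside the collection window")
    return sum(w * margin for w, (margin, _, _) in zip(weights, polls)) / total
```

Under these assumptions, an A-rated poll that ended 30 days ago carries half of its 2.7 quality weight, and anything older than 180 days drops out entirely.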

District fundamentals include the Partisan Voter Index (PVI), calculated from 2024 presidential results by congressional district (sourced from Wikipedia), and incumbency status for all 435 House seats.
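
One plausible way to compute a single-election PVI is sketched below. The source does not give its formula, so this assumes PVI is the district's Democratic share of the two-party presidential vote minus the national two-party share. The function names are hypothetical, and the national figure is passed in as a parameter rather than hard-coded.

```python
def district_pvi(dem_votes: int, rep_votes: int,
                 national_dem_two_party_pct: float) -> float:
    """PVI in points: positive means more Democratic than the nation (D+),
    negative means more Republican (R+)."""
    two_party_total = dem_votes + rep_votes
    if two_party_total <= 0:
        raise ValueError("district has no two-party votes")
    district_pct = 100.0 * dem_votes / two_party_total
    return district_pct - national_dem_two_party_pct


def format_pvi(pvi: float) -> str:
    """Render a numeric PVI in the familiar D+N / R+N notation."""
    rounded = round(pvi)
    if rounded == 0:
        return "EVEN"
    return f"D+{rounded}" if rounded > 0 else f"R+{-rounded}"
```

For example, a district that voted 60-40 Democratic in a year where the national two-party share was an illustrative 49.0% would come out at `district_pvi(60_000, 40_000, 49.0) == 11.0`, which `format_pvi` renders as `D+11`.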

Cook Political Report ratings are scraped daily from cookpolitical.com. For redistricted states, these ratings are used to adjust PVI values where the 2024 presidential baseline may not reflect the new district lines.

Step 2: Infer National Environment

The national political environment is inferred from generic ballot polls using PyMC Bayesian inference. The model treats each poll as a noisy observation of the true latent national sentiment:

`y_i ~ Normal(μ + house_effect[pollster], σ_i)`	Poll observation model
`μ ~ Normal(0, 5)`	Latent national environment
`house_effect ~ Normal(0, σ_house)`	Pollster bias
`σ_house ~ HalfNormal(2)`	Magnitude of house effects
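
To make the four lines above concrete, here is the unnormalized log posterior they imply, written with the Python standard library instead of PyMC. It is a sketch for reading the model, not the project's implementation: the per-poll error `sigma_i` is taken as an input because the source does not say how it is derived, and the function names are hypothetical.

```python
import math


def normal_logpdf(x: float, mean: float, sd: float) -> float:
    """Log density of Normal(mean, sd) at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def halfnormal_logpdf(x: float, scale: float) -> float:
    """Log density of HalfNormal(scale) at x (zero density for x < 0)."""
    if x < 0:
        return -math.inf
    return 0.5 * math.log(2 / math.pi) - math.log(scale) - 0.5 * (x / scale) ** 2


def log_posterior(mu, house_effects, sigma_house, polls):
    """Unnormalized log posterior of the poll model.

    house_effects: dict mapping pollster name -> house effect
    polls: list of (margin, pollster, sigma_i) tuples
    """
    if sigma_house <= 0:
        return -math.inf
    lp = normal_logpdf(mu, 0.0, 5.0)                    # mu ~ Normal(0, 5)
    lp += halfnormal_logpdf(sigma_house, 2.0)           # sigma_house ~ HalfNormal(2)
    for effect in house_effects.values():               # house_effect ~ Normal(0, sigma_house)
        lp += normal_logpdf(effect, 0.0, sigma_house)
    for margin, pollster, sigma_i in polls:             # y_i ~ Normal(mu + house_effect, sigma_i)
        lp += normal_logpdf(margin, mu + house_effects[pollster], sigma_i)
    return lp
```

An MCMC sampler such as the one PyMC provides explores this surface. Values of μ near the house-effect-corrected poll average score highest, and the prior only matters where polls are sparse.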

The prior μ ~ Normal(0, 5) is a weakly informative prior centered at a neutral national environment. The standard deviation of 5 points assigns roughly 95% probability to environments between D+10 and R+10, covering essentially all plausible national environments in modern U.S. elections while letting the polling data drive the posterior.
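
The 95% figure can be checked directly: ±10 points is ±2 standard deviations under a Normal(0, 5) prior. A quick check with the standard library:

```python
from statistics import NormalDist

prior = NormalDist(mu=0.0, sigma=5.0)

# Prior mass between R+10 (-10) and D+10 (+10)
mass_within_10 = prior.cdf(10.0) - prior.cdf(-10.0)
print(f"P(-10 <= mu <= 10) = {mass_within_10:.4f}")  # prints 0.9545
```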

The posterior is sampled via MCMC with 4 chains × 500 draws each. After sampling, a deterministic adjustment is applied for presidential approval:

μ_adjusted = μ + (-0.3 × 0.3 × net_approval)

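This reflects the historical relationship between presidential approval and the generic ballot. With 30% weight given to approval data, the combined coefficient is -0.3 × 0.3 = -0.09, so each point of net presidential approval shifts the estimate by roughly 0.1 points in the opposite direction.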
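
As a worked example of the adjustment formula, the sketch below applies it to illustrative inputs. The constant and function names are hypothetical, and the input values are made up for the arithmetic only.

```python
APPROVAL_COEF = -0.3    # first factor in the formula above
APPROVAL_WEIGHT = 0.3   # 30% weight given to approval data


def adjust_for_approval(mu: float, net_approval: float) -> float:
    """Apply the deterministic approval adjustment to the national environment."""
    return mu + APPROVAL_COEF * APPROVAL_WEIGHT * net_approval


# Illustrative inputs: mu = 4.0 and net approval of -10
# 4.0 + (-0.3 * 0.3 * -10) = 4.0 + 0.9 = 4.9
adjusted = adjust_for_approval(4.0, -10.0)
```

A net approval of -10 therefore moves the estimate by +0.9 points, and a net approval of +10 moves it by -0.9 points.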

Step 3: Parameter Training

Model parameters were fitted using PyMC on 2018 and 2022 midterm election results (797 total district-races). The training process learned:

β_pvi = 0.48: Each point of PVI translates to ~0.48 points of Dem vote share
β_inc = 2.2: Incumbency advantage of ~2.2 points
β_nat = 0.66: National environment coefficient
σ_regional = 0.54: Standard deviation of regional effects
σ_district = 3.7: Base district-level uncertainty

The fitted model achieves R² = 0.94 and RMSE = 3.9 points on historical data. Parameters are stored in data/processed/learned_params.json.
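
For readers unfamiliar with the two fit metrics, they can be computed as below. This uses the standard definitions and is a sketch only; whether the project computes them in exactly this way is not stated in the source.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the vote shares (points)."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Share of variance in actual results explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

An RMSE of 3.9 means the typical historical district miss was about four points of vote share, which is why seats within a few points of 50% remain genuinely uncertain.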

Step 4: District Vote Share Model

For each district, the expected Democratic vote share is:

vote_share = 50 + β_pvi × PVI + β_inc × Inc + μ_region + β_nat × μ_national + ε

Where:

PVI: Partisan Voter Index (D+10 = +10, R+10 = -10)
Inc: Incumbency (+1 = Dem incumbent, -1 = Rep incumbent, 0 = open seat)
μ_region: Regional effect for one of 10 political regions (see below)
μ_national: National environment from Step 2
ε: District-level noise, scaled by competitiveness
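
The formula translates directly into code. The sketch below plugs in the fitted coefficients from Step 3; the class and function names are hypothetical, and the example inputs are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Point estimates reported in Step 3."""
    beta_pvi: float = 0.48
    beta_inc: float = 2.2
    beta_nat: float = 0.66


def expected_vote_share(pvi: float, inc: int, mu_region: float,
                        mu_national: float, params: Params = Params(),
                        eps: float = 0.0) -> float:
    """Democratic vote share (%) for one district.

    pvi: +10 for D+10, -10 for R+10
    inc: +1 Dem incumbent, -1 Rep incumbent, 0 open seat
    """
    return (50.0
            + params.beta_pvi * pvi
            + params.beta_inc * inc
            + mu_region
            + params.beta_nat * mu_national
            + eps)


# D+3 district, Democratic incumbent, no regional effect, national environment of +4:
# 50 + 0.48*3 + 2.2*1 + 0 + 0.66*4 = 56.28
share = expected_vote_share(pvi=3, inc=1, mu_region=0.0, mu_national=4.0)
```

With the noise term set to zero this gives the expected share. The simulation in Step 5 adds ε and the regional draw on top.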

District uncertainty (σ_d) is scaled using a logistic function on |PVI|: safe districts (|PVI| > 15) have less uncertainty (~2 points), while competitive districts have full uncertainty (~4.5 points).
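
A logistic scaling with these endpoints might look like the following. Only the two endpoint values (~2 and ~4.5 points) come from the text. The midpoint and steepness are assumptions chosen so the curve is close to 2 points beyond |PVI| = 15, and the names are hypothetical.

```python
import math

SIGMA_SAFE = 2.0          # from the text: safe districts
SIGMA_COMPETITIVE = 4.5   # from the text: competitive districts
MIDPOINT = 8.0            # ASSUMPTION: |PVI| where uncertainty is halfway between
STEEPNESS = 0.5           # ASSUMPTION: how quickly uncertainty falls off


def district_sigma(pvi: float) -> float:
    """District-level noise SD as a decreasing logistic function of |PVI|."""
    logistic = 1.0 / (1.0 + math.exp(STEEPNESS * (abs(pvi) - MIDPOINT)))
    return SIGMA_SAFE + (SIGMA_COMPETITIVE - SIGMA_SAFE) * logistic
```

With these assumed settings an even district gets about 4.46 points of noise and a D+20 or R+20 district about 2.01 points. The function depends only on |PVI|, so it treats both parties' safe seats identically.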

Step 5: Monte Carlo Simulation

10,000 simulations propagate uncertainty through the entire model. For each simulation:

Sample μ_national from the PyMC posterior distribution
Sample β_pvi, β_inc, β_nat from their fitted posterior distributions
Sample regional effects: μ_region ~ Normal(0, 0.54) for each region
Sample district noise: ε ~ Normal(0, σ_d) for each district
Compute vote share for all 435 districts
Determine winner in each district (>50% = Dem win)
Count total Democratic seats

This produces a distribution of 10,000 possible seat outcomes. The probability of a Democratic majority is the fraction of simulations in which Democrats win ≥218 seats.
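
The simulation loop can be sketched in plain Python as follows. It is a simplified illustration rather than the project's code. The coefficients are held at their Step 3 point estimates instead of being sampled from their posteriors, `mu_draws` stands in for the PyMC posterior draws, and the district tuples and function names are hypothetical.

```python
import random

BETA_PVI, BETA_INC, BETA_NAT = 0.48, 2.2, 0.66  # point estimates from Step 3
SIGMA_REGIONAL = 0.54
MAJORITY = 218


def simulate_seats(districts, mu_draws, n_sims=10_000, seed=0):
    """Return the Democratic seat count from each simulation.

    districts: list of (pvi, inc, region, sigma_d) tuples
    mu_draws:  posterior draws of the national environment
    """
    rng = random.Random(seed)
    regions = sorted({region for _, _, region, _ in districts})
    seat_counts = []
    for _ in range(n_sims):
        mu_national = rng.choice(mu_draws)
        # One shared draw per region, so districts in a region move together
        regional = {r: rng.gauss(0.0, SIGMA_REGIONAL) for r in regions}
        seats = 0
        for pvi, inc, region, sigma_d in districts:
            share = (50.0 + BETA_PVI * pvi + BETA_INC * inc
                     + regional[region] + BETA_NAT * mu_national
                     + rng.gauss(0.0, sigma_d))  # district noise
            if share > 50.0:
                seats += 1
        seat_counts.append(seats)
    return seat_counts


def majority_probability(seat_counts, threshold=MAJORITY):
    """Fraction of simulations with at least `threshold` Democratic seats."""
    return sum(count >= threshold for count in seat_counts) / len(seat_counts)
```

Because the national and regional draws are shared across districts within a simulation, district errors are correlated. That correlation is what widens the seat distribution beyond what independent district coin flips would produce.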

Regional Structure

The model uses FiveThirtyEight's 10 political regions to capture geographic correlation in election outcomes. During each simulation, a regional effect is sampled for each region, and all districts within that region share the same effect.

New England	ME, NH, VT, MA
Mid-Atlantic	NY, NJ, DE, MD, RI, CT
Rust Belt	IL, IN, OH, MI, WI, PA, MN, IA
Southeast	FL, GA, NC, VA
Deep South	SC, AL, MS, AR, TN, KY, WV, MO
Texas Region	TX, OK, LA
Plains	ND, SD, NE, KS
Mountain	ID, MT, WY, UT, AK
Southwest	AZ, NV, NM, CO
Pacific	CA, OR, WA, HI
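
In code, the table above reduces to a state-to-region lookup. The sketch below copies the assignments from the table. The `region_of` helper and its `"TX-07"`-style district id format are assumptions for illustration.

```python
REGIONS = {
    "New England": ["ME", "NH", "VT", "MA"],
    "Mid-Atlantic": ["NY", "NJ", "DE", "MD", "RI", "CT"],
    "Rust Belt": ["IL", "IN", "OH", "MI", "WI", "PA", "MN", "IA"],
    "Southeast": ["FL", "GA", "NC", "VA"],
    "Deep South": ["SC", "AL", "MS", "AR", "TN", "KY", "WV", "MO"],
    "Texas Region": ["TX", "OK", "LA"],
    "Plains": ["ND", "SD", "NE", "KS"],
    "Mountain": ["ID", "MT", "WY", "UT", "AK"],
    "Southwest": ["AZ", "NV", "NM", "CO"],
    "Pacific": ["CA", "OR", "WA", "HI"],
}

# Invert to a flat lookup: state abbreviation -> region name
STATE_TO_REGION = {
    state: region for region, states in REGIONS.items() for state in states
}


def region_of(district_id: str) -> str:
    """Region for a district id such as 'TX-07' (id format is an assumption)."""
    state = district_id.split("-")[0]
    return STATE_TO_REGION[state]
```

Every state appears in exactly one region, so all 435 districts receive exactly one regional draw per simulation.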

Step 6: Daily Updates

A GitHub Actions workflow runs daily at 9am ET:

Fetch latest polls from VoteHub API
Scrape current Cook Political ratings
Re-run PyMC inference for national environment
Run 10,000 Monte Carlo simulations
Update forecast JSON and website
Commit and push changes

Data Sources

Generic Ballot Polls	VoteHub API
Presidential Approval	VoteHub API
District PVI	Wikipedia (2024 results)
Race Ratings	Cook Political Report
Congressional Maps	U.S. Census TIGER/Line

Limitations

This model has several known limitations:

No race-level polling: All predictions are driven by fundamentals (PVI, incumbency) and the national environment. District-level and state-level polls are not yet incorporated.
No candidate quality: The model treats all candidates as generic party representatives. It does not account for candidate quality, fundraising, scandals, or other race-specific factors.
No special election data: Recent special election results, which can be leading indicators, are not systematically incorporated.
Limited historical data: Parameters are fitted on only 2018 and 2022 midterms, which may not capture the full range of electoral dynamics.
Staged inference: Parameters, national environment, and forecasts are estimated separately rather than jointly, which may underestimate total uncertainty.

Source Code

The complete model implementation is open source:

github.com/grantbw4/2026-midterms-forecast

Key files: models/national_environment.py (PyMC inference), models/hierarchical_model.py (Monte Carlo simulation), models/parameter_fitting.py (historical training).
