MLB home run prediction model

The specification follows. The leaderboard section is optional; append ?leaderboard=1 to compute the table.

Goal & implementation

Goal: Estimate the pregame probability that a hitter will hit at least one home run in a given game, without using sportsbook odds.

The implementation lives in _inc/mlb_homerun_model.php (used by total-bases tools and HR history snapshots). Optional park/weather MySQL tables: mysql_schema_homerun.sql.

Notation note: HitterBaseRaw is a weighted sum of six regressed HR/PA rates (weights 0.18, 0.42, 0.10, 0.22, 0.05, 0.03). It is not the product of those six rates. HitterSkill then multiplies that sum by FBIndex and BarrelIndex.

Final formula

Step 1 — Adjusted HR rate per plate appearance

AdjHRRate_PA = HitterSkill × PitchingFactor × GameEnv × ConfirmedLineupFactor

HitterSkill = HitterBaseRaw × FBIndex × BarrelIndex
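As a minimal sketch of Step 1, the Python below multiplies the factors in the order given above. The function name and the sample inputs are illustrative assumptions, not values taken from _inc/mlb_homerun_model.php.

```python
def adj_hr_rate_pa(hitter_base_raw: float, fb_index: float, barrel_index: float,
                   pitching_factor: float, game_env: float,
                   lineup_factor: float) -> float:
    """Adjusted HR probability for a single plate appearance (Step 1)."""
    hitter_skill = hitter_base_raw * fb_index * barrel_index
    return hitter_skill * pitching_factor * game_env * lineup_factor


# Hypothetical hitter: 4.0% base rate, slightly above-average power indices,
# a homer-prone starter, a hitter-friendly park, and a confirmed lineup spot.
rate = adj_hr_rate_pa(0.040, 1.05, 1.10, 1.04, 1.08, 1.00)
print(f"{rate:.4f}")  # about 0.0519
```

Because every term is a multiplier, a ConfirmedLineupFactor of 0.00 zeroes the whole rate regardless of the other inputs.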

Step 2 — Game probability (at least one HR)

HRProb_Game = 1 − (1 − AdjHRRate_PA) ^ ExpectedPA

Definitions: AdjHRRate_PA — adjusted probability of a HR in one PA. HRProb_Game — probability of ≥1 HR in the game. ExpectedPA — expected plate appearances (primarily from lineup spot).
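Step 2 treats each plate appearance as an independent trial with the same HR probability, so the chance of at least one HR is one minus the chance of none. A small Python sketch follows; the function name and the clamp to [0, 1] are assumptions added for safety, not part of the stated formula.

```python
def hr_prob_game(adj_hr_rate_pa: float, expected_pa: float) -> float:
    """Probability of at least one HR over expected_pa plate appearances."""
    # Clamp is an added safeguard (assumption): stacked multipliers could
    # in principle push the per-PA rate outside the valid probability range.
    rate = min(max(adj_hr_rate_pa, 0.0), 1.0)
    return 1.0 - (1.0 - rate) ** expected_pa


# Hypothetical cleanup hitter: 4% per-PA rate, 4.45 expected PA.
print(f"{hr_prob_game(0.04, 4.45):.3f}")  # about 0.166
```

ExpectedPA may be fractional; the exponent form handles that directly.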

Regressed rates & HitterBaseRaw

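All component rates are HR per plate appearance, regressed toward the league rate LgHRRate = LeagueHR / LeaguePA. The constant added in each formula (150, 80, 50, 100, 100, 150) is the number of league-average plate appearances blended into the sample, so smaller samples are pulled harder toward LgHRRate.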

CareerR = (CareerHR + 150 × LgHRRate) / (CareerPA + 150)

Y1R     = (HR_365 + 80 × LgHRRate) / (PA_365 + 80)

L10R    = (HR_10 + 50 × LgHRRate) / (PA_10 + 50)

SplitR  = (HR_vsHand + 100 × LgHRRate) / (PA_vsHand + 100)
          — vs same SP handedness (vr / vl)

TimeR   = (HR_daynight + 100 × LgHRRate) / (PA_daynight + 100)
          — day vs night for today’s game

BvPR    = (HR_vsSP + 150 × LgHRRate) / (PA_vsSP + 150)

HitterBaseRaw =
    0.18 × CareerR
  + 0.42 × Y1R
  + 0.10 × L10R
  + 0.22 × SplitR
  + 0.05 × TimeR
  + 0.03 × BvPR
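The regression and the weighted sum can be sketched in Python as below. The dictionary keys, function names, and sample counts are hypothetical; only the constants and weights come from the formulas above.

```python
# Pseudo-PA constants and blend weights, copied from the formulas above.
PRIOR_PA = {"career": 150, "y1": 80, "l10": 50, "split": 100, "time": 100, "bvp": 150}
WEIGHTS = {"career": 0.18, "y1": 0.42, "l10": 0.10, "split": 0.22, "time": 0.05, "bvp": 0.03}


def regress(hr: float, pa: float, prior_pa: float, lg_hr_rate: float) -> float:
    """Shrink an observed HR/PA rate toward the league rate."""
    return (hr + prior_pa * lg_hr_rate) / (pa + prior_pa)


def hitter_base_raw(samples: dict, lg_hr_rate: float) -> float:
    """Weighted SUM (not product) of the six regressed rates.

    samples maps each key in WEIGHTS to an (HR, PA) tuple.
    """
    return sum(
        WEIGHTS[k] * regress(*samples[k], PRIOR_PA[k], lg_hr_rate)
        for k in WEIGHTS
    )


# Hypothetical hitter; every count here is made up for illustration.
samples = {
    "career": (120, 2800), "y1": (30, 600), "l10": (3, 42),
    "split": (20, 410), "time": (18, 380), "bvp": (1, 12),
}
print(f"{hitter_base_raw(samples, lg_hr_rate=0.030):.4f}")
```

The weights sum to 1.00, so a hitter with no data in any bucket comes out at exactly the league rate.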

Hitter quality multipliers
FBIndex    = (HitterFBpct / LeagueFBpct) ^ 0.4

BarrelIndex = (HitterBarrelPct / LeagueBarrelPct) ^ 0.7

On this site, Statcast barrel% is not wired through the public API for every player; BarrelIndex uses a SLG vs league SLG proxy with the same exponent. FBIndex uses advanced hitter fly-ball share when available.
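A sketch of the two multipliers in Python, including the SLG proxy described above. The function names, the neutral 1.0 fallback for missing inputs, and the rule "use barrel% when present, otherwise SLG" are assumptions for illustration; the text does not say exactly when the site switches to the proxy.

```python
from typing import Optional


def ratio_index(player: Optional[float], league: float, exponent: float) -> float:
    """(player / league) ^ exponent, or a neutral 1.0 when data is missing."""
    if player is None or league <= 0:
        return 1.0  # assumed neutral default
    return (player / league) ** exponent


def fb_index(hitter_fb_pct: Optional[float], league_fb_pct: float) -> float:
    return ratio_index(hitter_fb_pct, league_fb_pct, 0.4)


def barrel_index(hitter_barrel_pct: Optional[float], league_barrel_pct: float,
                 hitter_slg: Optional[float] = None,
                 league_slg: float = 0.0) -> float:
    # Prefer barrel% when supplied; otherwise fall back to the SLG proxy
    # with the same 0.7 exponent (fallback order is an assumption).
    if hitter_barrel_pct is not None:
        return ratio_index(hitter_barrel_pct, league_barrel_pct, 0.7)
    return ratio_index(hitter_slg, league_slg, 0.7)


print(f"{fb_index(0.42, 0.36):.3f}")
print(f"{barrel_index(None, 0.08, hitter_slg=0.500, league_slg=0.400):.3f}")
```

Exponents below 1 dampen the ratio, so a hitter with twice the league barrel rate gets well under twice the multiplier.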

Pitching factor
PitchingFactor =
    (StarterSplitHRFactor ^ 0.7)
  × (PitcherFBFactor ^ 0.3)
  × (BullpenFactor ^ 0.2)

StarterSplitHRFactor = (StarterHRAllowedRate_split / LeagueHRAllowedRate) ^ 0.5
                       — cap ~0.85–1.20

PitcherFBFactor = (PitcherFBAllowedPct / LeagueFBAllowedPct) ^ 0.30
                  — cap ~0.92–1.10

BullpenFactor = (BullpenHRAllowedRate / LeagueHRAllowedRate) ^ 0.25

The starter split uses HR/BF (or an equivalent rate) allowed against the batter’s effective side. On this site, BullpenFactor falls back to the opposing team’s season HR rate allowed vs. league when reliever-only splits are unavailable.
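The pitching side can be sketched as follows. Treating the approximate caps ("~0.85–1.20", "~0.92–1.10") as hard clamps is an assumption, as are the function and parameter names; no cap is listed for BullpenFactor, so none is applied here.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def pitching_factor(starter_hr_rate_split: float, lg_hr_allowed_rate: float,
                    pitcher_fb_pct: float, lg_fb_allowed_pct: float,
                    bullpen_hr_rate: float) -> float:
    # Component factors, each a dampened ratio to league average.
    starter = clamp((starter_hr_rate_split / lg_hr_allowed_rate) ** 0.5, 0.85, 1.20)
    fb = clamp((pitcher_fb_pct / lg_fb_allowed_pct) ** 0.30, 0.92, 1.10)
    bullpen = (bullpen_hr_rate / lg_hr_allowed_rate) ** 0.25  # no cap in the spec
    # Outer exponents from the PitchingFactor formula above.
    return (starter ** 0.7) * (fb ** 0.3) * (bullpen ** 0.2)


# Hypothetical matchup: starter allows HR at 1.3x league rate to this side.
print(f"{pitching_factor(0.039, 0.030, 0.40, 0.37, 0.032):.3f}")
```

Note that each ratio is raised to a power twice (inner and outer exponent), so the combined factor moves slowly; a league-average staff yields exactly 1.0.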

Alternative (advanced)

StaffHRRate = 0.65 × StarterHRAllowedRate_split + 0.35 × BullpenHRAllowedRate

StaffFactor = (StaffHRRate / LeagueHRAllowedRate) ^ 0.5

Not used as the primary path on this site; listed for reference.
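For comparison, the alternative blend reduces to a few lines of Python. The function name and sample rates are hypothetical.

```python
def staff_factor(starter_hr_rate_split: float, bullpen_hr_rate: float,
                 lg_hr_allowed_rate: float) -> float:
    """Single staff multiplier from a 65/35 starter/bullpen blend."""
    staff_hr_rate = 0.65 * starter_hr_rate_split + 0.35 * bullpen_hr_rate
    return (staff_hr_rate / lg_hr_allowed_rate) ** 0.5


# Hypothetical: starter at 4.0% HR/BF, bullpen and league at 3.0%.
print(f"{staff_factor(0.040, 0.030, 0.030):.3f}")  # about 1.103
```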

Game environment
GameEnv = ParkFactor × WindFactor × TempFactor × HumidityFactor × RoofFactor

WindFactor      = 1 + (0.012 × WindOutMPH) − (0.010 × WindInMPH)   — cap ~0.85–1.15; 1.0 if roof closed

TempFactor      = 1 + 0.003 × (TempF − 70)   — cap ~0.90–1.08; 1.0 if roof closed

HumidityFactor  = 1 + 0.001 × (HumidityPct − 50)   — cap ~0.97–1.03; 1.0 if roof closed

ParkFactor      — L/R handed park HR factor; typical range ~0.85–1.20

Roof closed: set WindFactor, TempFactor, HumidityFactor to 1.00 and apply RoofFactor as the indoor adjustment. Park/weather can be populated from MySQL (hr_park_factor, hr_game_context) when configured; otherwise defaults are neutral.
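The environment rules above, including the roof-closed override, might look like this in Python. Parameter names and neutral defaults are assumptions, and the approximate caps are applied as hard clamps; RoofFactor is left at 1.0 unless the caller supplies an indoor adjustment.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def game_env(park_factor: float = 1.0, wind_out_mph: float = 0.0,
             wind_in_mph: float = 0.0, temp_f: float = 70.0,
             humidity_pct: float = 50.0, roof_closed: bool = False,
             roof_factor: float = 1.0) -> float:
    if roof_closed:
        # Outdoor weather does not apply under a closed roof.
        wind = temp = humidity = 1.0
    else:
        wind = clamp(1 + 0.012 * wind_out_mph - 0.010 * wind_in_mph, 0.85, 1.15)
        temp = clamp(1 + 0.003 * (temp_f - 70), 0.90, 1.08)
        humidity = clamp(1 + 0.001 * (humidity_pct - 50), 0.97, 1.03)
    return park_factor * wind * temp * humidity * roof_factor


# Hypothetical warm night, 10 mph blowing out, in a hitter-friendly park.
print(f"{game_env(1.10, wind_out_mph=10, temp_f=80, humidity_pct=60):.3f}")  # about 1.282
# Same park with the roof closed and a hypothetical indoor adjustment.
print(f"{game_env(1.10, wind_out_mph=10, roof_closed=True, roof_factor=0.98):.3f}")  # about 1.078
```

With all defaults the function returns 1.0, matching the neutral behavior described when the MySQL tables are not configured.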

Lineup & opportunity
ConfirmedLineupFactor:
  Confirmed starter  = 1.00
  Projected, unconfirmed = 0.85
  Not in lineup      = 0.00

ExpectedPA by lineup spot:
  1 → 4.75   2 → 4.65   3 → 4.55   4 → 4.45   5 → 4.30
  6 → 4.15   7 → 4.00   8 → 3.90   9 → 3.80

When the model is run in code, rows follow the lineups from baseball.php: hitters in the posted nine use the ExpectedPA (xPA) for their batting-order spot, while other active hitters get a lower xPA and a factor of 0.85.
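A lookup-table sketch of the opportunity inputs in Python. The status labels and function name are hypothetical, and because the text does not give the "lower xPA" used for hitters outside the posted nine, the caller must supply it rather than have the sketch guess a value.

```python
from typing import Optional, Tuple

# ExpectedPA by lineup spot, copied from the table above.
EXPECTED_PA = {1: 4.75, 2: 4.65, 3: 4.55, 4: 4.45, 5: 4.30,
               6: 4.15, 7: 4.00, 8: 3.90, 9: 3.80}

# ConfirmedLineupFactor values; the string keys are illustrative labels.
LINEUP_FACTOR = {"confirmed": 1.00, "projected": 0.85, "out": 0.00}


def opportunity(status: str, lineup_spot: Optional[int] = None,
                fallback_xpa: Optional[float] = None) -> Tuple[float, float]:
    """Return (ConfirmedLineupFactor, ExpectedPA) for one hitter."""
    factor = LINEUP_FACTOR[status]
    if lineup_spot is not None:
        return factor, EXPECTED_PA[lineup_spot]
    if fallback_xpa is None:
        raise ValueError("fallback_xpa is required when no lineup spot is known")
    return factor, fallback_xpa


print(opportunity("confirmed", 4))  # (1.0, 4.45)
```

The two returned values feed Step 1 (the factor) and Step 2 (the exponent) respectively.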

Complete summary & notes
HRProb_Game = 1 − (1 − AdjHRRate_PA) ^ ExpectedPA

AdjHRRate_PA = HitterSkill × PitchingFactor × GameEnv × ConfirmedLineupFactor

HitterSkill = HitterBaseRaw × FBIndex × BarrelIndex

Notes

  • Use HR per plate appearance, not raw totals alone.
  • Regress small samples (L10, BvP, day/night, splits) toward LgHRRate.
  • Highest leverage: recent HR/PA, platoon split, barrel/power proxy, starter HR allowed by side, park, lineup spot, confirmation.
  • Lower weight: BvP history, day/night, humidity.
  • Retractable roof: roof status mainly controls whether outdoor weather applies.

Neutral leaderboard (all active roster hitters)

The full-league table is not computed on a normal page view: it loads MySQL, calls the MLB Stats API hundreds of times per load, and can hit time or memory limits. To compute it (slow), append ?leaderboard=1 for the current season (2026), or ?leaderboard=1&season=YYYY for another year.