Flow-field Mortality Forecaster

Modern mortality forecasting follows two main traditions descended from Lee & Carter's seminal 1992 paper [1]. One decomposes age-by-time mortality into a few summary patterns and projects those patterns forward. The other models how fast life expectancy improves at each level of life expectancy. We argue that both can be viewed as restricted versions of a single underlying object: a smooth trajectory through a low-dimensional summary of mortality derived from a multi-way decomposition of the sex × age × country × year mortality array. The unrestricted version of this object is the flow-field forecaster. We compare it against the leading existing methods on two high-quality datasets totaling roughly nine million age-cell forecasts. The early results are encouraging.


What we're seeing so far

We tested four methods on two large mortality datasets: the flow-field; the original Lee–Carter method [1]; Hyndman & Ullah's functional-data extension of Lee–Carter [2]; and the bayesLife/MortCast pipeline [3], [4] that the United Nations uses to produce its mortality forecasts [5]. The Human Mortality Database (HMD) [6] supplies 48 high-quality national populations and yielded 1.6 million held-out forecast cells in our cross-validation; the United Nations World Population Prospects 2024 (WPP) [7] supplies 236 countries and yielded 7.4 million held-out cells. On both datasets, the flow-field had the lowest forecast error of the four methods on every aggregate measure we examined, that is, averaged over all forecast horizons. These are working results from a manuscript in preparation, and we expect the picture to refine as we finish the writeup and the comparisons receive external scrutiny.

[Figure: headline log-MAE and sex-gap MAE for four methods on HMD and WPP]
Figure 1. Average forecast error, four methods, two datasets, lower is better. Left: error in age-specific log-mortality. Right: error in the male–female mortality gap. Solid bars are HMD; hatched bars are WPP. The flow-field has the smallest bars in every grouping. Its log-mortality error is roughly a quarter to a third below that of Lee–Carter and Hyndman–Ullah, and the margin over bayesLife/MortCast is larger on HMD. We treat these as preliminary indications pending the full manuscript.
Method              Cells      Average error   Average bias          Average error   Correlation
                               log-mortality   log-mortality         sex gap         sex gap
                               (lower better)  (closer to 0 better)  (lower better)  (higher better)

Human Mortality Database: 48 populations
Flow-field          1,589,538  0.257           +0.041                0.181           +0.398
Lee–Carter          1,589,538  0.337           +0.096                0.259           +0.334
Hyndman–Ullah       1,589,538  0.354           +0.095                0.298           +0.289
bayesLife/MortCast  1,589,538  0.454           −0.069                0.290           +0.263

UN WPP 2024: 236 atomic countries
Flow-field          7,432,590  0.211           +0.012                0.142           +0.709
Lee–Carter          7,432,590  0.290           −0.014                0.210           +0.608
Hyndman–Ullah       7,432,590  0.305           +0.054                0.277           +0.413
bayesLife/MortCast  7,432,590  0.303           −0.083                0.194           +0.479

Each row counts only the cells where all four methods produced a forecast and where we have a held-out observation to compare against, so the comparison is as close to like-for-like as we could arrange. The gaps between methods are broadly similar across the two datasets: the flow-field's average log-mortality error is roughly 24% below Lee–Carter's on HMD and 27% below on WPP, and roughly 27% and 31% below Hyndman–Ullah's. The margin over bayesLife/MortCast varies more, at roughly 43% on HMD and 30% on WPP. The flow-field's advantage therefore appears on both datasets, which suggests it is not an artifact of a single data source.
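The percentage gaps quoted above follow directly from the average log-mortality errors in the table. A minimal Python sketch of that arithmetic is given here; the dictionary layout and function names are chosen for illustration only, and method names use plain hyphens as dictionary keys.

```python
# Average log-mortality error (log-MAE), copied from the results table.
LOG_MAE = {
    "HMD": {"Flow-field": 0.257, "Lee-Carter": 0.337,
            "Hyndman-Ullah": 0.354, "bayesLife/MortCast": 0.454},
    "WPP": {"Flow-field": 0.211, "Lee-Carter": 0.290,
            "Hyndman-Ullah": 0.305, "bayesLife/MortCast": 0.303},
}


def relative_reduction(flow_error: float, other_error: float) -> float:
    """Percentage by which flow_error lies below other_error."""
    return 100.0 * (1.0 - flow_error / other_error)


def reduction_table(errors=LOG_MAE):
    """Map dataset -> {comparison method -> rounded percentage reduction}."""
    table = {}
    for dataset, by_method in errors.items():
        flow = by_method["Flow-field"]
        table[dataset] = {
            method: round(relative_reduction(flow, err))
            for method, err in by_method.items()
            if method != "Flow-field"
        }
    return table


if __name__ == "__main__":
    for dataset, row in reduction_table().items():
        print(dataset, row)
```

With the tabulated values this rounds to 24, 27 and 43 percent on HMD and 27, 31 and 30 percent on WPP, matching the figures in the text.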

Behaviour at long horizons

At short horizons (one to a few years out) the four methods are reasonably comparable. The biggest differences appear at long horizons, where the flow-field's error grows slowly while the others grow more steeply:

[Figure: per-horizon log-mortality MAE for four methods on HMD and WPP, 1 to 50 years]
Figure 2. Average forecast error as a function of forecast horizon, four methods, two datasets, lower is better. From horizon 1 to horizon 50, the flow-field's error grows by about 1.6×, while the others grow by several-fold. At short horizons the methods are close: Hyndman–Ullah is ahead of the flow-field at one to five years out on HMD, where its smoothing helps with year-to-year noise. The pattern is broadly similar on the two datasets.
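One way to tabulate error by horizon on a like-for-like basis is sketched below. It is an illustration, not the evaluation code behind Figure 2: the dictionary layout, the `(cell_id, horizon)` key, and the function names are all assumptions made for the example. Only cells that have a held-out observation and a forecast from every method enter the averages.

```python
from collections import defaultdict


def common_cells(forecasts, observed):
    """Keys present in the held-out data and in every method's forecasts."""
    keys = set(observed)
    for by_cell in forecasts.values():
        keys &= set(by_cell)
    return keys


def mae_by_horizon(forecasts, observed):
    """Mean absolute log-mortality error per method and horizon.

    forecasts: {method: {(cell_id, horizon): forecast log-rate}}
    observed:  {(cell_id, horizon): held-out log-rate}
    """
    keys = common_cells(forecasts, observed)
    result = {}
    for method, by_cell in forecasts.items():
        total = defaultdict(float)
        count = defaultdict(int)
        for key in keys:
            horizon = key[1]
            total[horizon] += abs(by_cell[key] - observed[key])
            count[horizon] += 1
        result[method] = {h: total[h] / count[h] for h in sorted(total)}
    return result


if __name__ == "__main__":
    # Toy numbers for illustration only, not real mortality data.
    observed = {("a", 1): -5.0, ("b", 1): -4.0, ("a", 2): -5.2, ("c", 2): -3.0}
    forecasts = {
        "method_1": {("a", 1): -5.1, ("b", 1): -4.2, ("a", 2): -5.0, ("c", 2): -3.0},
        "method_2": {("a", 1): -5.0, ("b", 1): -4.0, ("a", 2): -5.6},  # no ("c", 2)
    }
    print(mae_by_horizon(forecasts, observed))
```

In the toy data, cell `("c", 2)` is dropped for both methods because `method_2` has no forecast for it, which mirrors the common-cell rule used for the results table.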

An interesting structural tradeoff

The bayesLife/MortCast pipeline forecasts life expectancy as a scalar, then reconstructs the age-specific schedule from the projected scalar — an architecture chosen to maintain demographic interpretability and computational tractability at global scale. A consequence we observe in our cross-validation is that information about the age pattern that has been compressed through the scalar bottleneck appears difficult to recover at long horizons. For example, at horizon-band 26–50 years for infant ages on HMD, the pipeline's mean log-MAE is about 1.17 with a positive bias of roughly 0.7 log-units. The flow-field, in which life expectancy is a derived quantity of the score-space trajectory rather than its forecasting target, does not encounter this bottleneck. Whether and where this tradeoff matters in practice is one of the questions we hope to address in the full manuscript.

Sex coherence

Another pattern we observe is in sex coherence — the degree to which forecast female and male mortality surfaces move in synchrony as time advances. In the flow-field method, the joint Tucker decomposition [8] shares the age basis across sexes by construction, while Lee–Carter and Hyndman–Ullah are typically fitted separately by sex (coherent multi-population variants and rotation extensions do exist [9], [10]), and bayesLife/MortCast handles the sex differential through a dedicated joint female-male gap model [11] added on top of the level dynamics. Compared to the other methods, the flow-field's sex-gap correlation declines more slowly with horizon.
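To make the shared age basis concrete, the sketch below fits a Tucker model to a synthetic sex × age × country × year array using a truncated higher-order SVD (HOSVD). HOSVD is a simple stand-in chosen for this example; the manuscript's estimator, ranks, and preprocessing are not specified in this preview and may differ. The array sizes, ranks, and function names are likewise assumptions.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def hosvd(tensor, ranks):
    """Truncated higher-order SVD, a simple non-iterative Tucker fit."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])  # leading left singular vectors
    core = tensor
    for mode, factor in enumerate(factors):
        # Project mode `mode` onto the column space of its factor.
        core = np.moveaxis(np.tensordot(factor.T, core, axes=(1, mode)), 0, mode)
    return core, factors


def reconstruct(core, factors):
    """Multiply the core by each factor matrix along its mode."""
    out = core
    for mode, factor in enumerate(factors):
        out = np.moveaxis(np.tensordot(factor, out, axes=(1, mode)), 0, mode)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for log-mortality: sex x age x country x year.
    log_rates = rng.normal(size=(2, 20, 5, 30))
    core, factors = hosvd(log_rates, ranks=(2, 3, 4, 5))
    age_basis = factors[1]
    print(age_basis.shape)  # (20, 3)
    print(core.shape)       # (2, 3, 4, 5)
```

The point of the sketch is structural: there is one `age_basis` matrix for the whole array, so female and male surfaces are built from the same age patterns, and sex differences enter only through the sex factor and the core.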

[Figure: sex-gap correlation between observed and forecast across horizons for four methods on HMD and WPP]
Figure 3. Correlation between observed and forecast sex gap (male minus female log-mortality), by horizon, four methods, two datasets. All four methods show declining sex-gap correlation as horizon grows, as one would expect, but the flow-field's decline is more gradual in our tests. Differences are modest at short horizons and grow with horizon. The pattern is similar across HMD and WPP, though the absolute correlations are higher on WPP across all methods, reflecting the smoother input estimates.
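The two sex-gap measures can be sketched as follows, taking the gap as male minus female log-mortality as in the caption. Treating the sex-gap error as the mean absolute difference between forecast and observed gaps is an assumption made here, as are the function names and the flat-list input format; in practice the scores would be computed separately within each horizon.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sex_gap(male_log_rates, female_log_rates):
    """Cell-by-cell gap: male minus female log-mortality."""
    return [m - f for m, f in zip(male_log_rates, female_log_rates)]


def sex_gap_scores(obs_male, obs_female, fc_male, fc_female):
    """Return (sex-gap MAE, correlation of observed vs forecast gap)."""
    obs_gap = sex_gap(obs_male, obs_female)
    fc_gap = sex_gap(fc_male, fc_female)
    mae = sum(abs(f - o) for f, o in zip(fc_gap, obs_gap)) / len(obs_gap)
    return mae, pearson(obs_gap, fc_gap)


if __name__ == "__main__":
    # Toy log-rates for illustration only, not real mortality data.
    obs_m, obs_f = [-4.0, -3.0, -2.0], [-4.5, -3.8, -2.3]
    fc_m, fc_f = [-3.9, -3.1, -2.0], [-4.5, -3.7, -2.4]
    print(sex_gap_scores(obs_m, obs_f, fc_m, fc_f))
```

A forecast that shifts every gap by the same constant keeps a correlation of 1 while still incurring a nonzero sex-gap MAE, which is why the two measures are reported side by side.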

What this looks like as a paper

The full manuscript introduces, from first principles, the array-normal noise model [12] and the Tucker decomposition [8] as its canonical estimator, and develops the score-representation isometry that motivates a flow-on-subspace forecaster. It then offers Lee–Carter, Hyndman–Ullah, and bayesLife/MortCast readings as constrained or marginal cases of the same framework, which helps to relate these three established and successful approaches to one another (for a recent review of Lee–Carter and its extensions, see [13]). Finally, it reports the cross-validation comparisons sketched above. A draft is in preparation; a v1 of the empirical half is at arXiv:2603.24299 [14], and the underlying tensor-decomposition framework at arXiv:2603.20518 [15]. A companion Shiny app, MDMx, implements and demonstrates the methods.

NB: The numerical results presented here are preliminary, drawn from a manuscript in preparation. Cross-validation numbers can shift as evaluation protocols are refined and as comparisons receive external scrutiny; we expect to learn a great deal from the review process. Treat what's reported here as a research preview rather than a settled comparison.

References & further reading

  1. Lee, R. D. & Carter, L. R. (1992). Modeling and forecasting U.S. mortality. Journal of the American Statistical Association 87(419): 659–671. doi:10.1080/01621459.1992.10475265
  2. Hyndman, R. J. & Ullah, M. S. (2007). Robust forecasting of mortality and fertility rates: A functional data approach. Computational Statistics & Data Analysis 51(10): 4942–4956. doi:10.1016/j.csda.2006.07.028
  3. Raftery, A. E., Chunn, J. L., Gerland, P. & Ševčíková, H. (2013). Bayesian probabilistic projections of life expectancy for all countries. Demography 50(3): 777–801. doi:10.1007/s13524-012-0193-x
  4. Ševčíková, H., Li, N., Kantorová, V., Gerland, P. & Raftery, A. E. (2016). Age-specific mortality and fertility rates for probabilistic population projections. In R. Schoen (Ed.), Dynamic Demographic Analysis, pp. 285–310. Springer. doi:10.1007/978-3-319-26603-9_15
  5. United Nations, Department of Economic and Social Affairs, Population Division (2024). World Population Prospects 2024: Methodology of the United Nations population estimates and projections. UN DESA/POP/2024/DC/NO.10. population.un.org/wpp/.../WPP2024_Methodology-Report_Final.pdf
  6. Human Mortality Database (2026). Max Planck Institute for Demographic Research (Germany), University of California, Berkeley (USA), and French Institute for Demographic Studies (France). www.mortality.org
  7. United Nations, Department of Economic and Social Affairs, Population Division (2024). World Population Prospects 2024, online edition. population.un.org/wpp/
  8. Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika 31(3): 279–311. doi:10.1007/BF02289464
  9. Li, N. & Lee, R. (2005). Coherent mortality forecasts for a group of populations: An extension of the Lee–Carter method. Demography 42(3): 575–594. doi:10.1353/dem.2005.0021
  10. Li, N., Lee, R. & Gerland, P. (2013). Extending the Lee–Carter method to model the rotation of age patterns of mortality decline for long-term projections. Demography 50(6): 2037–2051. doi:10.1007/s13524-013-0232-2
  11. Raftery, A. E., Lalic, N. & Gerland, P. (2014). Joint probabilistic projection of female and male life expectancy. Demographic Research 30(27): 795–822. doi:10.4054/DemRes.2014.30.27
  12. Fosdick, B. K. & Hoff, P. D. (2014). Separable factor analysis with applications to mortality data. Annals of Applied Statistics 8(1): 120–147. doi:10.1214/13-AOAS694
  13. Basellini, U., Camarda, C. G. & Booth, H. (2023). Thirty years on: A review of the Lee–Carter method for forecasting mortality. International Journal of Forecasting 39(3): 1033–1049. doi:10.1016/j.ijforecast.2022.11.002
  14. Clark, S. J. (2026). Mortality forecasting as a flow field in Tucker decomposition space. arXiv:2603.24299. arxiv.org/abs/2603.24299
  15. Clark, S. J. (2026). Multi-dimensional mortality. arXiv:2603.20518. arxiv.org/abs/2603.20518

Code, the underlying Quarto computational book, and reproducibility materials will be released with the published paper.
Updates and contact: work@samclark.net. Updated 2026-04-27.