Aligning Incentives for Forecast Accuracy, Relevance, and Efficacy: A New Paradigm for Metaculus Tournaments

Introducing Fortified Essays & Incentive-Compatible Kelly Strategy for Metaculus Tournaments

Baseline Forecast Accuracy

Increasing Relevance and Efficacy: A Reimagined Metaculus Tournament Structure

Updating Metaculus Tournaments: The Calibration Set, the Long-Term Set, the Fortified Essay, and Incentive-Compatible Kelly Criterion Rules

  1. Calibration. The Calibration Set contains forecasts that resolve within the timeframe of the tournament (typically one to a few years), thus providing ground-truth calibration for the larger dataset of forecasts. The Calibration Set is made up of questions and forecasts essentially identical to those featured in Metaculus tournaments thus far. Forecasts are scored empirically, and the best-performing forecasters in the tournament receive tournament prizes.
  2. Long-Term. Long-Term Forecasts do not resolve within the timescale of the tournament; their forecast horizons may be a decade out or more. Such forecasts are included in tournaments because of their utility in shaping near-term decisions. However, waiting potentially 10+ years to distribute tournament prizes is usually impractical, so Long-Term Forecasts are considered out-of-sample for the purpose of awarding tournament prizes. Long-Term Forecasts will still be empirically scored on the Metaculus platform, however: they will be part of forecasters’ track records, appropriately weighted in individuals’ Metaculus Scores. We also have plans in the works for additional ways of recognizing Long-Term forecasters.
  3. Fortified Essays. While an essay is “just an opinion,” a fortified essay is an opinion with testable predictions fortifying its claims. These are persuasive essays written in response to tournament prompts with Metaculus forecasts natively embedded within them. Fortified Essays within Metaculus Tournaments will have a separate prize structure and judging process — typically judges will be a selection of respected academic experts and practitioners in the relevant field. The embedded predictions are likely to include a selection from both Calibration and Long-Term Forecasts. For forecasters, these represent an opportunity to shape policy and decision-making. Stay tuned for more to come on Fortified Essays.
  4. Incentive-Compatible Kelly Criterion Rules. As we host more tournaments, incentive compatibility within the tournament framework becomes increasingly important for protecting long-term forecast accuracy. With this aim in mind, we are very excited to introduce a novel approach to using the Kelly Criterion in forecasting tournament scoring. The method is designed to give proper incentives while respecting the core Metaculus modus operandi of making forecasts with the greatest possible accuracy, rather than placing bets.

Deep Dive: Incentive-Compatible Kelly Betting Rules

  1. The denominator of the softmax function means that it’s not truly proper, but the discrepancy from properness only matters once you’ve already dominated the tournament (in which case you should be somewhat loss-averse); and
  2. It’s approximately proper in log prize, not total prize.
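To make the two caveats above concrete, here is a minimal Python sketch. It is an illustration, not the actual Metaculus scoring rule. It assumes that each forecaster's prize share is a softmax of their cumulative log score, that there is a single binary question, and that there is one rival whose score is fixed; the function names (`softmax_prize_shares`, `expected_log_share`, `best_report`) are hypothetical. Under those assumptions, the log of the prize share equals the forecaster's log score minus the log of the softmax denominator, so the report that maximizes expected log prize stays close to the true belief unless the forecaster's own term dominates the denominator.

```python
import math


def softmax_prize_shares(total_log_scores):
    """Split a prize pool in proportion to exp(cumulative log score).

    Assumption for illustration only: shares are a plain softmax of log scores.
    """
    top = max(total_log_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in total_log_scores]
    normalizer = sum(weights)  # the softmax denominator
    return [w / normalizer for w in weights]


def expected_log_share(report, belief, rival_score):
    """Expected log prize share on one binary question.

    `report` is the probability the forecaster submits, `belief` is the
    probability they actually hold, and `rival_score` is the (hypothetical)
    fixed cumulative log score of a single rival.
    """
    expected = 0.0
    # Two outcomes: the event happens (scored on `report`)
    # or it does not (scored on `1 - report`).
    for outcome_prob, reported_prob in ((belief, report), (1 - belief, 1 - report)):
        my_score = math.log(reported_prob)
        my_share = softmax_prize_shares([my_score, rival_score])[0]
        expected += outcome_prob * math.log(my_share)
    return expected


def best_report(belief, rival_score, grid_size=99):
    """Grid-search the report that maximizes expected log prize share."""
    candidates = [i / (grid_size + 1) for i in range(1, grid_size + 1)]
    return max(candidates, key=lambda r: expected_log_share(r, belief, rival_score))


if __name__ == "__main__":
    belief = 0.7
    # Rival far ahead: the forecaster's term barely affects the denominator.
    print("trailing: ", best_report(belief, rival_score=10.0))
    # Rival far behind: the forecaster dominates the denominator.
    print("dominating:", best_report(belief, rival_score=-10.0))
```

In this toy setup, a forecaster who is far behind does best by reporting roughly their true belief, while one who already dominates the pool is pulled toward 0.5, which is the loss-averse hedging the first caveat describes. The objective throughout is expected log prize rather than expected prize, matching the second caveat.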


