Calibration Pipeline

Platt scaling and Aeterna Engine V8.1 CDF for well-calibrated bracket probabilities

Why Calibration Matters

Raw model outputs are not calibrated probabilities. When a calibrated model assigns 70% to an outcome, that outcome occurs about 70% of the time, and Kelly sizing relies on this property. Without calibration, the Kelly criterion systematically over- or under-sizes positions, which destroys expected value even when the model has a genuine edge.
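To make the sizing argument concrete, here is a minimal sketch that assumes a binary contract bought at a price between 0 and 1 that pays 1 if the event occurs. The contract structure, the function names, and the numbers (price 0.55, model probability 0.70, true frequency 0.60) are illustrative assumptions, not taken from the pipeline.

```python
import math


def kelly_fraction(prob: float, price: float) -> float:
    """Kelly fraction of bankroll for a contract costing `price` that pays 1 on success."""
    # Net odds are (1 - price) / price, which simplifies Kelly to (prob - price) / (1 - price).
    return max(0.0, (prob - price) / (1.0 - price))


def expected_log_growth(fraction: float, true_prob: float, price: float) -> float:
    """Expected log growth per bet when staking `fraction` of the bankroll."""
    net_odds = (1.0 - price) / price
    win = math.log(1.0 + fraction * net_odds)
    loss = math.log(1.0 - fraction)
    return true_prob * win + (1.0 - true_prob) * loss


# Hypothetical numbers: the model is overconfident but still has real edge (0.60 > 0.55).
price = 0.55
model_prob = 0.70
true_prob = 0.60

f_model = kelly_fraction(model_prob, price)  # stake implied by the uncalibrated model
f_true = kelly_fraction(true_prob, price)    # stake implied by the true frequency

g_model = expected_log_growth(f_model, true_prob, price)
g_true = expected_log_growth(f_true, true_prob, price)

print(f"stake from uncalibrated model: {f_model:.3f}, growth: {g_model:+.4f}")
print(f"stake from calibrated model:   {f_true:.3f}, growth: {g_true:+.4f}")
```

Under these assumptions the overconfident model stakes three times the correctly sized fraction, and the expected log growth turns negative even though the bet has positive edge.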

Aeterna Engine V8.1 CDF Bracket Mapping

Given the predicted mean (mu) and standard deviation (sigma) from the Aeterna Engine V8.1 dual model, we compute the probability that the temperature falls in each bracket using the Aeterna Engine V8.1 CDF, which treats the temperature as normally distributed with mean mu and standard deviation sigma. For a bracket [a, b], the probability is Phi((b - mu) / sigma) - Phi((a - mu) / sigma), where Phi is the standard normal CDF, evaluated with scipy.stats.norm.cdf.

Uses scipy.stats.norm.cdf for precise computation

Handles edge brackets: (-inf, a] and [b, +inf)

Probabilities sum to 1.0 across all brackets

Typical markets have 9-11 temperature brackets
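The bracket mapping described above can be sketched as follows. This is an illustration under stated assumptions rather than the engine's actual code: the function name `bracket_probabilities`, the example forecast (mu = 68.3, sigma = 2.1), and the bracket edges are hypothetical. Only `scipy.stats.norm.cdf` comes from the description.

```python
import numpy as np
from scipy.stats import norm


def bracket_probabilities(mu: float, sigma: float, edges) -> np.ndarray:
    """Probability mass of N(mu, sigma^2) in each bracket defined by sorted interior edges.

    With k interior edges this returns k + 1 probabilities: the lower tail
    (-inf, edges[0]], the k - 1 interior brackets, and the upper tail [edges[-1], +inf).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Pad with infinities so the two edge brackets are handled by the same formula.
    bounds = np.concatenate(([-np.inf], np.asarray(edges, dtype=float), [np.inf]))
    cdf = norm.cdf(bounds, loc=mu, scale=sigma)  # Phi((x - mu) / sigma) at every bound
    return np.diff(cdf)                          # Phi(upper) - Phi(lower) per bracket


# Hypothetical market: 9 interior edges give 10 brackets.
edges = np.arange(60.0, 78.0, 2.0)  # 60, 62, ..., 76
probs = bracket_probabilities(mu=68.3, sigma=2.1, edges=edges)

for lo, hi, p in zip(np.r_[-np.inf, edges], np.r_[edges, np.inf], probs):
    print(f"[{lo:6.1f}, {hi:6.1f}]  {p:.4f}")
print("total:", probs.sum())
```

Because the differences telescope from Phi(-inf) = 0 to Phi(+inf) = 1, the probabilities sum to 1.0 up to floating-point error without any renormalization.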

Platt Scaling Calibrator

After CDF mapping, a Platt scaling layer (logistic regression on log-odds) corrects for systematic biases. The calibrator is fit on a held-out validation set. This reduces the Brier score from ~0.08 (uncalibrated) to ~0.05 (calibrated), a meaningful improvement for trading.

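Logistic regression: calibrated_prob = 1 / (1 + exp(-(a * raw + b))), where raw is the uncalibrated input from the CDF mapping and a, b are the fitted slope and intercept (not the bracket bounds used earlier)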

Fit on held-out temporal validation data

Brier score improvement: 0.08 -> 0.05

Compact serialized model for fast inference
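A minimal sketch of such a calibrator, using only NumPy, is shown below. It assumes that `raw` in the formula is the log-odds of the CDF-derived probability, fits `a` and `b` by Newton's method on the log loss, and evaluates the Brier score on a later slice of the data. The helper names, the synthetic overconfident data, and the split sizes are all hypothetical, and the resulting scores are not meant to reproduce the 0.08 -> 0.05 figures quoted above.

```python
import numpy as np


def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_platt(raw_prob, outcome, n_iter=50, tol=1e-10):
    """Fit calibrated = sigmoid(a * logit(raw_prob) + b) by Newton's method on log loss."""
    x = logit(raw_prob)
    X = np.column_stack([x, np.ones_like(x)])  # columns: slope feature, intercept
    theta = np.zeros(2)
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - outcome)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        theta -= step
        if np.max(np.abs(step)) < tol:
            break
    return float(theta[0]), float(theta[1])


def apply_platt(raw_prob, a, b):
    return sigmoid(a * logit(raw_prob) + b)


def brier(prob, outcome):
    return float(np.mean((prob - outcome) ** 2))


# Synthetic data (assumption): the raw model doubles the true log-odds, so it is overconfident.
rng = np.random.default_rng(0)
true_p = rng.uniform(0.05, 0.95, size=20_000)
outcome = (rng.uniform(size=true_p.size) < true_p).astype(float)
raw = sigmoid(2.0 * logit(true_p))

split = 15_000  # fit on the earlier slice, evaluate on the later one
a, b = fit_platt(raw[:split], outcome[:split])
before = brier(raw[split:], outcome[split:])
after = brier(apply_platt(raw[split:], a, b), outcome[split:])

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"Brier before: {before:.4f}, after: {after:.4f}")
```

In this setup the fitted slope should land near 0.5, undoing the doubled log-odds. The fitted calibrator is just the two floats `a` and `b`, which is consistent with a compact serialized model.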
