Fitting parameters
This tutorial shows how to perform Bayesian parameter inference on a cosmological model by fitting it to data, and how to forecast parameter constraints from a covariance matrix that describes the uncertainties and correlations of future (not yet observed) data.
Pantheon supernova data
We load the Pantheon dataset, which contains redshifts and apparent magnitudes of 1048 Type Ia supernovae. In the documentation we use the compressed version, binned into 40 redshift bins.
using DataFrames, CSV, LinearAlgebra, PDMats, SymBoltz
binned = true # use compressed dataset in the documentation
docsdir = joinpath(pkgdir(SymBoltz), "docs")
if binned # choose compressed dataset with 40 redshift bins
data = joinpath(docsdir, "Pantheon/Binned_data/lcparam_DS17f.txt")
Csyst = joinpath(docsdir, "Pantheon/Binned_data/sys_DS17f.txt")
else # choose full dataset with 1048 supernovae
data = joinpath(docsdir, "Pantheon/lcparam_full_long.txt")
Csyst = joinpath(docsdir, "Pantheon/sys_full_long.txt")
end
# Read data table
data = CSV.read(data, DataFrame, delim = " ", silencewarnings = true)
# Read covariance matrix of apparent magnitudes (mb)
Csyst = CSV.read(Csyst, DataFrame, header = false) # long vector
Csyst = collect(reshape(Csyst[2:end, 1], (Int(Csyst[1, 1]), Int(Csyst[1, 1])))) # to matrix
Cstat = Diagonal(data.dmb)^2 # statistical covariance: squared standard errors (variances) on the diagonal
C = Csyst + Cstat
# Sort data and covariance matrix with decreasing redshift
is = sortperm(data, :zcmb, rev = true)
C = C[is, is]
C = PDMat(Symmetric(C)) # efficient sym-pos-def matrix with Cholesky factorization
data = data[is, :]
first(data, 10) # show 10 first rows
| Row | #name | zcmb | zhel | dz | mb | dmb | x1 | dx1 | color | dcolor | 3rdvar | d3rdvar | cov_m_s | cov_m_c | cov_s_c | set | ra | dec | biascor | Column20 |
|-----|-------|------|------|----|----|-----|----|----|-------|--------|--------|---------|---------|---------|---------|-----|----|----|---------|----------|
|     | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Int64 | Missing | Missing |
| 1 | 39 | 1.6123 | 1.6123 | 0.0 | 25.926 | 0.0735 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 2 | 38 | 1.2336 | 1.2336 | 0.0 | 25.304 | 0.05635 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 3 | 37 | 0.9511 | 0.9511 | 0.0 | 24.6411 | 0.0276 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 4 | 36 | 0.821 | 0.821 | 0.0 | 24.2446 | 0.0248 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 5 | 35 | 0.724 | 0.724 | 0.0 | 23.8666 | 0.027 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 6 | 34 | 0.6299 | 0.6299 | 0.0 | 23.5036 | 0.031 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 7 | 33 | 0.5742 | 0.5742 | 0.0 | 23.3004 | 0.0245 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 8 | 32 | 0.5174 | 0.5174 | 0.0 | 23.0012 | 0.02685 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 9 | 31 | 0.4738 | 0.4738 | 0.0 | 22.7974 | 0.02935 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
| 10 | 30 | 0.4355 | 0.4355 | 0.0 | 22.5579 | 0.0254 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | missing | missing |
Let us plot the apparent magnitude as a function of redshift, and the covariance matrix:
using CairoMakie
fig = Figure(size = (600, 800))
ax1 = Axis(fig[1, 1:2], xlabel = "z", ylabel = "m", title = "Apparent magnitude vs. redshift")
scatter!(ax1, data.zcmb, data.mb; markersize = 5, label = "data (Pantheon)")
errorbars!(ax1, data.zcmb, data.mb, data.dmb; linewidth = 1, whiskerwidth = 5)
ax2 = Axis(fig[2, 1]; xlabel = "z", ylabel = "z", title = "Covariance matrix", yreversed = true, aspect = 1)
hm = heatmap!(ax2, extrema(data.zcmb), extrema(data.zcmb), C; colormap = :balance, colorrange = (-0.001, +0.001))
Colorbar(fig[2, 2], hm)
fig

Predicting distances
To predict luminosity distances theoretically, we solve the w0waCDM model:
using SymBoltz
g = SymBoltz.metric()
K = SymBoltz.curvature(g)
X = SymBoltz.w0wa(g; analytical = true)
M = RMΛ(K = K, Λ = X)
M = complete(SymBoltz.background(M); flatten = false)
M = change_independent_variable(M, M.g.a; add_old_diff = true)
pars_fixed = Dict(M.τ => 0.0, M.r.T₀ => NaN, M.X.cₛ² => NaN)
pars_varying = [M.r.Ω₀, M.m.Ω₀, M.K.Ω₀, M.g.h, M.X.w0, M.X.wa]
dL = SymBoltz.distance_luminosity_function(M, pars_fixed, pars_varying, data.zcmb)
μ(p) = 5 * log10.(dL(p)[begin:end-1] / (10*SymBoltz.pc)) # distance modulus
# Show example predictions
Mb = -19.3 # absolute magnitude of Type Ia supernovae (assumed constant, since they are standard candles)
bgopts = (alg = SymBoltz.Tsit5(), reltol = 1e-5, maxiters = 1e3)
p0 = [9.3e-5, 0.3, 0.0, 0.7, -1.0, 0.0] # fiducial parameters
μs = μ(p0)
mbs = μs .+ Mb
lines!(ax1, data.zcmb, mbs; color = :black, label = "theory (ΛCDM)")
axislegend(ax1, position = :rb)
fig
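For reference, the apparent magnitude above follows from the standard distance modulus relation
\[m = M_b + \mu, \quad \mu = 5 \log_{10}\left(\frac{d_L}{10\,\mathrm{pc}}\right),\]
which is exactly what the μ function and mbs compute in the code above.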

Bayesian inference
To perform Bayesian inference, we define a probabilistic model in Turing.jl:
using Turing
@model function supernova(μ_pred, mbs, C; Mb = Mb, Ωr0 = 9.3e-5)
    # Parameter priors
    h ~ Uniform(0.1, 1.0)
    Ωm0 ~ Uniform(0.0, 1.0)
    Ωk0 ~ Uniform(-1.0, +1.0)
    w0 ~ Uniform(-2.0, 0.0)
    wa ~ Uniform(-1.0, +1.0)
    p = [Ωr0, Ωm0, Ωk0, h, w0, wa]
    μs_pred = μ_pred(p)
    if isempty(μs_pred) # solver failed, so reject this sample
        Turing.@addlogprob! -Inf
        return nothing
    end
    mbs_pred = μs_pred .+ Mb
    return mbs ~ MvNormal(mbs_pred, C) # read "measurements are sampled from a multivariate normal with predicted means and covariance matrix C"
    # equivalently:
    #Δmb = mbs .- mbs_pred
    #χ² = transpose(Δmb) * inv(C) * Δmb
    #Turing.@addlogprob! -1/2 * χ²
    #return nothing
end
# Work around https://github.com/JuliaStats/Distributions.jl/issues/1964:
# construct MvNormal directly so the pre-factorized PDMat covariance is used as-is
function MvNormal(μ::AbstractVector{<:Real}, Σ::AbstractPDMat{<:Real})
    R = Base.promote_eltype(μ, Σ)
    Distributions.MvNormal{R, typeof(Σ), typeof(μ)}(μ, Σ)
end
function MvNormal(μ, Σ)
    return Distributions.MvNormal(μ, Σ)
end
sn_w0waCDM = supernova(μ, data.mb, C);
sn_ΛCDM = fix(sn_w0waCDM, w0 = -1.0, wa = 0.0);
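Before sampling, we can sanity-check the model by evaluating its log density at one plausible parameter point. A minimal sketch, assuming the logjoint function from DynamicPPL (re-exported by Turing):
logjoint(sn_ΛCDM, (h = 0.7, Ωm0 = 0.3, Ωk0 = 0.0)) # log prior + log likelihood at this point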
We can now sample from the model to obtain an MCMC chain for the ΛCDM model:
chain = sample(sn_ΛCDM, NUTS(), 1000; initial_params = (h = 0.5, Ωm0 = 0.5, Ωk0 = 0.0))
import Plots, StatsPlots # don't collide with Makie
Plots.plot(chain)
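For quick numerical summaries of the posterior, we can use the chain statistics from MCMCChains; this sketch assumes its summarystats and quantile methods are available once Turing is loaded:
summarystats(chain) # posterior means, standard deviations and diagnostics (ESS, R̂)
quantile(chain) # posterior quantiles for each parameter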
Finally, we can visualize the high-dimensional chain with a corner plot using PairPlots.jl:
using PairPlots
layout = (
    PairPlots.Scatter(),
    PairPlots.Contourf(sigmas = 1:2),
    PairPlots.MarginHist(),
    PairPlots.MarginDensity(color = :black),
    PairPlots.MarginQuantileText(color = :black, font = :regular),
    PairPlots.MarginQuantileLines(),
)
pp = pairplot(chain => layout)

We can easily repeat this for another model:
sn_w0CDM_flat = fix(sn_w0waCDM, Ωk0 = 0.0, wa = 0.0);
chain = sample(sn_w0CDM_flat, NUTS(), 1000; initial_params = (h = 0.5, Ωm0 = 0.5, w0 = -1.0))
pp = pairplot(chain => layout)

Forecasting
Planning future cosmological surveys involves forecasting the precision of the parameter constraints they are expected to deliver. Here we show how to perform such forecasts by combining SymBoltz.jl with Turing.jl.
To start, create another probabilistic supernova model, but instead of observed apparent magnitudes, we now use simulated apparent magnitudes from the fiducial cosmology:
sn_fc_w0waCDM = supernova(μ, mbs, Diagonal(C));
sn_fc_w0CDM_flat = fix(sn_fc_w0waCDM, Ωk0 = 0.0, wa = 0.0);
MCMC-driven forecasting
A general and assumption-free, but expensive, method of forecasting is to explore the likelihood with MCMC (as before, only against simulated data):
chain_fc = sample(sn_fc_w0CDM_flat, NUTS(), 1000)
Plots.plot(chain_fc)
pars0 = Dict(pars_varying .=> p0)
truth = PairPlots.Truth((h = pars0[M.g.h], Ωm0 = pars0[M.m.Ω₀], w0 = pars0[M.X.w0]))
pp_fc = pairplot(chain_fc => layout, truth)

Fisher forecasting
A less general, but cheaper, method is to use a second-order approximation of the likelihood around its maximum (the mean of the approximating Gaussian). This technique is called Fisher forecasting, and requires calculating the Fisher (information) matrix
\[F_{ij} = -\left\langle \frac{\partial^2 \log P(\theta)}{\partial\theta_i \, \partial\theta_j} \right\rangle.\]
This effectively measures how quickly the likelihood falls off from its maximum in different directions in parameter space. Under certain assumptions, the parameter covariance matrix is the inverse of the Fisher matrix: $C = F^{-1}$.
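For a Gaussian likelihood like ours, whose data covariance $\Sigma$ does not depend on the parameters, the Fisher matrix reduces to derivatives of the model prediction:
\[F_{ij} = \frac{\partial \boldsymbol{m}^\mathsf{T}}{\partial \theta_i} \, \Sigma^{-1} \, \frac{\partial \boldsymbol{m}}{\partial \theta_j},\]
where $\boldsymbol{m}(\theta)$ is the vector of predicted apparent magnitudes. Below we instead let Turing estimate $F$ numerically at the mode.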
First, we ask Turing to estimate the maximum likelihood mode of the probabilistic model:
maxl_fc = maximum_likelihood(sn_fc_w0CDM_flat; initial_params = [0.5, 0.5, -1.0]) # with flat priors, this coincides with the MAP estimate inside the prior bounds
ModeResult with maximized lp of 101.16
[0.6999998423648631, 0.29999926589213133, -0.9999967165190045]
As expected, the maximum likelihood corresponds to our chosen fiducial parameters. Moreover, the returned mode estimate object offers convenience methods that greatly simplify the following calculations. Next, we calculate the Fisher matrix at the maximum likelihood estimate, and invert it to get the covariance matrix:
using StatsBase
F_fc = informationmatrix(maxl_fc)
C_fc = inv(F_fc)
3×3 Named Matrix{Float64}
A ╲ B │ h Ωm0 w0
──────┼─────────────────────────────────────────
h │ 1.60344e-5 0.000141453 -0.000493181
Ωm0 │ 0.000141453 0.00351769 -0.00921492
w0 │ -0.000493181 -0.00921492 0.0259976
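The marginalized 1σ forecast uncertainty on each parameter is the square root of the corresponding diagonal entry of the covariance matrix:
σ_fc = sqrt.(diag(C_fc.array)) # 1σ uncertainties for h, Ωm0 and w0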
Finally, we derive 1σ and 2σ confidence ellipses from the covariance matrix and draw them onto our corner plot:
function ellipse(C::Matrix, i, j, c = (0.0, 0.0); nstd = 1, N = 33)
    σᵢ², σⱼ², σᵢⱼ = C[i,i], C[j,j], C[i,j]
    θ = atan(2σᵢⱼ, σᵢ² - σⱼ²) / 2 # rotation angle of the major axis
    a = √((σᵢ²+σⱼ²)/2 + √((σᵢ²-σⱼ²)^2/4 + σᵢⱼ^2)) # semi-major axis
    b = √(max(0.0, (σᵢ²+σⱼ²)/2 - √((σᵢ²-σⱼ²)^2/4 + σᵢⱼ^2))) # semi-minor axis
    a *= nstd # scale semi-axes to trace the nstd-σ ellipse
    b *= nstd
    cx, cy = c
    ts = range(0, 2π, length = N)
    xs = cx .+ a*cos(θ)*cos.(ts) .- b*sin(θ)*sin.(ts)
    ys = cy .+ a*sin(θ)*cos.(ts) .+ b*cos(θ)*sin.(ts)
    return xs, ys
end
for i in eachindex(IndexCartesian(), C_fc)
    ix, iy = i[1], i[2]
    ix >= iy && continue # only draw on lower-triangular panels
    μx = maxl_fc.values[ix]
    μy = maxl_fc.values[iy]
    for nstd in 1:2
        xs, ys = ellipse(C_fc.array, ix, iy, (μx, μy); nstd)
        lines!(pp_fc[iy, ix], xs, ys; color = :red)
    end
end
pp_fc
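If the likelihood is approximately Gaussian around its maximum, the red Fisher ellipses should closely trace the corresponding MCMC contours.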
