Today we focus on two ideas that come up constantly in disease mapping: joint models for related outcomes, and nonstationary spatial fields.
You often have multiple imperfect views of the same latent risk:
Separate models can disagree and waste information.
Joint models let us:
We posit a latent spatial surface \(S(s)\) that captures residual dependence:
In model-based geostatistics (MBG) terms:
\[\eta(s) = \beta_0 + \beta^\top x(s) + S(s)\]
and the data are conditionally independent given \(S(s)\).
Suppose we have:
At each location \(s_i\) and for each group \(g \in \{c, p\}\):
\[ Y_{ig} \mid p_{ig} \sim \text{Binomial}(N_{ig}, p_{ig}) \]
\[ \text{logit}(p_{ig}) = \eta_{ig} \]
A common and interpretable joint structure:
\[ \eta_{ig} = \beta_{0g} + \beta^\top x_i + S_0(s_i) + S_g(s_i) \]
Interpretation
This structure answers practical questions:
It also improves predictions:
Potential issue: \(S_0\) and \(S_g\) can compete.
Typical solutions:
Conceptually: shared structure should capture the dominant signal.
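To make the shared-plus-specific structure concrete, here is a minimal simulation sketch of \(\eta_{ig} = \beta_{0g} + \beta^\top x_i + S_0(s_i) + S_g(s_i)\) for the two groups \(c\) and \(p\). Everything numeric in it is an assumption made for illustration: the exponential covariance (chosen only because it needs no special functions), the intercepts, the single covariate, the sample sizes, and the choice to give \(S_0\) a larger variance than \(S_g\) so that the shared field carries the dominant signal.

```python
import numpy as np

def draw_field(rng, coords, rho, sigma2):
    """Draw one zero-mean Gaussian field at `coords` (exponential covariance, an assumption)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    K = sigma2 * np.exp(-d / rho) + 1e-8 * np.eye(len(coords))  # jitter for stability
    return np.linalg.cholesky(K) @ rng.standard_normal(len(coords))

def simulate_two_groups(n=200, seed=1):
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0.0, 1.0, size=(n, 2))   # locations s_i in the unit square
    x = rng.standard_normal(n)                    # a single covariate x_i
    beta0 = {"c": -1.0, "p": -1.5}                # group intercepts beta_{0g} (made-up values)
    beta = 0.5                                    # covariate effect shared by both groups
    S0 = draw_field(rng, coords, rho=0.3, sigma2=1.0)       # shared field, dominant variance
    groups = {}
    for g in ("c", "p"):
        Sg = draw_field(rng, coords, rho=0.1, sigma2=0.2)   # small group-specific deviation
        eta = beta0[g] + beta * x + S0 + Sg
        p = 1.0 / (1.0 + np.exp(-eta))            # inverse logit
        N = rng.integers(20, 60, size=n)          # number examined per site
        groups[g] = {"eta": eta, "p": p, "N": N, "Y": rng.binomial(N, p)}
    return groups

sim = simulate_two_groups()
corr = np.corrcoef(sim["c"]["eta"], sim["p"]["eta"])[0, 1]
print(f"correlation of linear predictors across groups: {corr:.2f}")
```

Because both groups share \(S_0\) and the covariate effect, their linear predictors are strongly correlated; shrinking the variance of \(S_0\) relative to \(S_g\) in this sketch weakens that link.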
A standard choice: Matérn Gaussian random fields
\[ S(s) \sim \text{GRF}\left(0, \text{Matérn}(\rho, \sigma^2)\right) \]
Stationarity here means \(\rho,\sigma^2\) are constant over space.
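As a small illustration of what stationarity buys, the sketch below evaluates a Matérn covariance as a function of distance only. It assumes smoothness \(\nu = 3/2\), which is not stated in these notes and is used here purely because it has a closed form without Bessel functions; the range convention \(\kappa = \sqrt{8\nu}/\rho\) is applied with that \(\nu\).

```python
import numpy as np

def matern32_cov(d, rho, sigma2):
    """Matérn covariance with nu = 3/2 (assumed), parameterised by range rho and variance sigma2."""
    kappa = np.sqrt(12.0) / rho                  # sqrt(8 * nu) / rho with nu = 3/2
    kd = kappa * np.asarray(d, dtype=float)
    return sigma2 * (1.0 + kd) * np.exp(-kd)

rho, sigma2 = 2.0, 1.5

# Two pairs of points in different parts of the map, each separated by distance 1.
pair_a = np.array([[0.0, 0.0], [1.0, 0.0]])
pair_b = np.array([[7.0, 3.0], [7.0, 4.0]])
cov_a = matern32_cov(np.linalg.norm(pair_a[0] - pair_a[1]), rho, sigma2)
cov_b = matern32_cov(np.linalg.norm(pair_b[0] - pair_b[1]), rho, sigma2)

# Correlation remaining at a separation equal to the range.
corr_at_range = matern32_cov(rho, rho, sigma2) / sigma2
```

Under stationarity `cov_a` and `cov_b` are identical, since only the separation enters the formula, and the correlation at distance \(\rho\) is small by construction of the range parameter.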
\[ S(s) \approx \sum_{k=1}^K w_k \phi_k(s) \]
where \(w_k\) are GMRF weights.
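The basis expansion \(S(s) \approx \sum_k w_k \phi_k(s)\) can be sketched in one dimension, where the mesh is just a sorted set of nodes and each \(\phi_k\) is a piecewise-linear hat function. The function names, the 1-D setting, and the tridiagonal precision of the form \(\kappa^2 C + G\) (lumped mass matrix plus stiffness matrix) are illustrative assumptions standing in for a real 2-D triangulated mesh; this is not the precision matrix used in the course software.

```python
import numpy as np

def hat_basis(s, nodes):
    """Projector A with A[i, k] = phi_k(s[i]); s must lie within [nodes[0], nodes[-1]]."""
    s = np.atleast_1d(np.asarray(s, dtype=float))
    A = np.zeros((s.size, nodes.size))
    # Index j of the mesh interval [nodes[j], nodes[j+1]] that contains each s.
    j = np.clip(np.searchsorted(nodes, s, side="right") - 1, 0, nodes.size - 2)
    t = (s - nodes[j]) / (nodes[j + 1] - nodes[j])   # local coordinate in [0, 1]
    rows = np.arange(s.size)
    A[rows, j] = 1.0 - t
    A[rows, j + 1] = t
    return A

def sparse_precision_1d(nodes, kappa):
    """Illustrative tridiagonal precision kappa^2 * C + G on a 1-D mesh."""
    h = np.diff(nodes)
    K = nodes.size
    C = np.zeros(K)
    C[:-1] += h / 2.0
    C[1:] += h / 2.0                                  # lumped (diagonal) mass matrix
    G = np.zeros((K, K))
    for j, hj in enumerate(h):                        # assemble stiffness matrix per element
        G[j, j] += 1.0 / hj
        G[j + 1, j + 1] += 1.0 / hj
        G[j, j + 1] -= 1.0 / hj
        G[j + 1, j] -= 1.0 / hj
    return kappa**2 * np.diag(C) + G

nodes = np.linspace(0.0, 10.0, 21)
Q = sparse_precision_1d(nodes, kappa=1.0)
rng = np.random.default_rng(0)
L = np.linalg.cholesky(Q)
w = np.linalg.solve(L.T, rng.standard_normal(nodes.size))   # w ~ N(0, Q^{-1})

s = np.array([0.0, 0.25, 3.3, 10.0])
S_at_s = hat_basis(s, nodes) @ w                      # S(s) = sum_k w_k phi_k(s)
```

The point of the design is that `Q` is sparse (here tridiagonal), so the weights form a Markov random field, while the projector turns node weights into field values at arbitrary locations by linear interpolation.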
Joint model uses the same mesh for \(S_0\) and \(S_g\), but:
Often you have:
Each dataset is incomplete on its own.
Joint modelling combines them while respecting their data-generating mechanisms.
Prevalence
\[ Y_i \mid p_i \sim \text{Binomial}(N_i, p_i),\quad \text{logit}(p_i)=\eta_i^{(B)} \]
Counts
\[ C_i \mid \lambda_i \sim \text{Poisson}(E_i\lambda_i),\quad \log(\lambda_i)=\eta_i^{(P)} \]
Different link functions: logit for the binomial prevalence model, log for the Poisson count model.
A flexible shared-structure joint model:
\[ \eta_i^{(B)} = \beta_0^{(B)} + \beta^\top x_i + S_0(s_i) + S_B(s_i) \]
\[ \eta_i^{(P)} = \beta_0^{(P)} + \beta^\top x_i + \alpha S_0(s_i) + S_P(s_i) \]
The scaling parameter \(\alpha\) on the shared field is often crucial:
Because the two processes are not identical:
Outcome-specific fields mop up these systematic differences.
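A minimal simulation sketch of the two-likelihood model may help fix ideas: prevalence data use \(\eta^{(B)}\) with the shared field entering directly, and count data use \(\eta^{(P)}\) with the shared field scaled by \(\alpha\). All numeric values, the exponential covariance, the distributions used for \(N_i\) and \(E_i\), and the function names are assumptions for illustration only.

```python
import numpy as np

def draw_field(rng, coords, rho, sigma2):
    """Draw one zero-mean Gaussian field at `coords` (exponential covariance, an assumption)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    K = sigma2 * np.exp(-d / rho) + 1e-8 * np.eye(len(coords))  # jitter for stability
    return np.linalg.cholesky(K) @ rng.standard_normal(len(coords))

def simulate_joint(n=150, alpha=0.7, seed=2):
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0.0, 1.0, size=(n, 2))
    x = rng.standard_normal(n)                         # one covariate, shared effect beta
    beta = 0.4
    S0 = draw_field(rng, coords, rho=0.3, sigma2=1.0)  # shared field
    SB = draw_field(rng, coords, rho=0.1, sigma2=0.1)  # prevalence-specific field
    SP = draw_field(rng, coords, rho=0.1, sigma2=0.1)  # count-specific field

    eta_B = -1.0 + beta * x + S0 + SB                  # binomial linear predictor
    eta_P = -0.5 + beta * x + alpha * S0 + SP          # Poisson linear predictor

    N = rng.integers(30, 100, size=n)                  # number examined
    E = rng.uniform(5.0, 50.0, size=n)                 # exposure / expected counts E_i
    Y = rng.binomial(N, 1.0 / (1.0 + np.exp(-eta_B)))  # logit link
    C = rng.poisson(E * np.exp(eta_P))                 # log link
    return {"S0": S0, "eta_B": eta_B, "eta_P": eta_P, "N": N, "E": E, "Y": Y, "C": C}

sim = simulate_joint()
```

Setting `alpha=0.0` in this sketch decouples the counts from the shared field entirely, while `alpha=1.0` forces both outcomes to respond to it equally on their respective link scales.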
Two common practical pitfalls:
Confounding:
Mis-specified exposure \(E_i\):
Different spatial supports:
Solutions:
Stationary Matérn assumes:
But malaria transmission may be:
Nonstationarity: the covariance structure changes over space.
Both are useful; choose based on:
Let:
Define:
\[ S(s)=\omega(s)S_{\text{smooth}}(s) + (1-\omega(s))S_{\text{rough}}(s) \]
Then:
\[ \eta(s)=\beta_0+\beta^\top x(s)+S(s) \]
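Interpretation: \(\omega(s)\) is the nonstationarity driver, acting as a mixing weight. Where \(\omega(s)\) is close to 1 the smooth field dominates; where it is close to 0 the rough field dominates.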
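The covariance implied by the mixture can be written down directly if the two component fields are taken to be independent, which is an assumption of this sketch: \(\mathrm{Cov}(S(s), S(s')) = \omega(s)\omega(s')C_{\text{smooth}} + (1-\omega(s))(1-\omega(s'))C_{\text{rough}}\). The code below evaluates it on a 1-D transect with a logistic \(\omega\); the exponential covariances, the two ranges, and the form of \(\omega\) are all hypothetical choices.

```python
import numpy as np

def mixture_cov(s, omega, rho_smooth, rho_rough, sigma2=1.0):
    """Covariance of omega*S_smooth + (1-omega)*S_rough, assuming independent components."""
    s = np.asarray(s, dtype=float)
    w = omega(s)                                   # mixing weight at each site
    d = np.abs(s[:, None] - s[None, :])            # pairwise distances on the transect
    C_smooth = sigma2 * np.exp(-d / rho_smooth)    # long-range component
    C_rough = sigma2 * np.exp(-d / rho_rough)      # short-range component
    return np.outer(w, w) * C_smooth + np.outer(1.0 - w, 1.0 - w) * C_rough

def omega(s):
    """Hypothetical weight: near 0 on the left of the transect, near 1 on the right."""
    return 1.0 / (1.0 + np.exp(-(s - 5.0)))

s = np.linspace(0.0, 10.0, 11)
K = mixture_cov(s, omega, rho_smooth=5.0, rho_rough=0.5)

def corr(i, j):
    return K[i, j] / np.sqrt(K[i, i] * K[j, j])

corr_left = corr(0, 1)     # neighbours where the rough field dominates
corr_right = corr(9, 10)   # neighbours where the smooth field dominates
var_middle = K[5, 5]       # marginal variance where omega = 0.5
```

Neighbouring sites at the same separation are far more correlated on the right than on the left, which is the nonstationarity. The sketch also shows a side effect worth knowing about: with equal component variances the marginal variance dips to half its value where \(\omega = 0.5\).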
Simple choices:
In the practical session:
Advantages:
Limitations:
Recall: for Matérn SPDE models with smoothness \(\nu = 1\) (in general \(\rho = \sqrt{8\nu}/\kappa\)), the range is approximately:
\[ \rho(s)\approx \frac{\sqrt{8}}{\kappa(s)} \]
So if we model:
\[ \log \kappa(s)=\theta_0+\theta_1 z(s) \]
then
\[ \rho(s)\text{ varies smoothly with }z(s). \]
Because \(\rho(s)\propto 1/\kappa(s)\):
So you can encode statements like:
In the SPDE approximation:
Thus \(z(s)\) is evaluated at mesh nodes:
This makes the range a spatially varying parameter.
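The mapping from a covariate at mesh nodes to a node-specific range follows directly from \(\log\kappa(s) = \theta_0 + \theta_1 z(s)\) and \(\rho(s) \approx \sqrt{8}/\kappa(s)\). The sketch below just evaluates that mapping; the covariate values, the parameter values, and the function name are hypothetical, and no fitting is shown.

```python
import numpy as np

def structured_range(z_nodes, theta0, theta1):
    """Return (kappa, rho) at mesh nodes from log kappa = theta0 + theta1 * z."""
    z = np.asarray(z_nodes, dtype=float)
    kappa = np.exp(theta0 + theta1 * z)
    rho = np.sqrt(8.0) / kappa          # range convention from the notes
    return kappa, rho

# Hypothetical standardised covariate evaluated at four mesh nodes.
z_nodes = np.array([-1.0, 0.0, 1.0, 2.0])

# Choose theta0 so that the range is 50 (in map units) where z = 0.
theta0 = np.log(np.sqrt(8.0) / 50.0)
theta1 = 0.5

kappa, rho = structured_range(z_nodes, theta0, theta1)
for z, r in zip(z_nodes, rho):
    print(f"z = {z:+.1f}  ->  range = {r:.1f}")
```

With \(\theta_1 > 0\), each unit increase in \(z\) multiplies the range by \(e^{-\theta_1}\), so higher covariate values mean shorter-range dependence; a negative \(\theta_1\) reverses the direction and \(\theta_1 = 0\) recovers the stationary model.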
Stationary SPDE:
Structured-range SPDE:
Conceptual result:
In SPDE Matérn models:
So structured modelling can target:
Today we emphasise structured range.
In interpretation:
Joint models
- share spatial information across related outcomes
- can handle:
  - multiple groups (same likelihood)
  - multiple likelihoods (binomial + Poisson)

Nonstationarity
- mixture SPDE: simple and robust
- structured range SPDE: mechanistic, covariate-driven

Take-away
- choose the simplest model that answers your scientific question
- use nonstationarity when stationary assumptions are visibly violated