Notes for Missing Data in Longitudinal Studies by Mike Daniels
MTM (Marginalized Transition Model)
- $\mathrm{logit}(\mu_{ij}) = x_{ij}\beta$
- $\log(\phi) = \delta + y\gamma_{j}$
Misc
- the missingness assumption cannot be verified from the observed data
- when the posterior is not available in closed form, sample from it (MCMC)
- use diffuse priors
- the prior can be improper, but the posterior must be proper (or use "just proper" priors)
- with complete data, the variance parameter is ancillary but important for efficiency; with missing data, it is important for consistency
- for missing data, the Bayesian approach allows assumptions about nonidentified parameters and uncertainty about those assumptions, while the frequentist approach uses fixed values or constraints
Some Recommendations
- $\sigma$: truncated uniform or folded normal
- $\tau$ : uniform shrinkage
- $\Sigma$ : flat prior, reference prior, or just-proper Wishart; improper priors cannot be used in WinBUGS, so use just-proper priors there
Posterior Consistency
$p(\theta \mid y)$ converges to a point mass at the true value $\theta_{0}$ as the sample size $n \to \infty$.
Sampling
Gibbs
- each full conditional can be conjugate (see the sketch after this list)
- for non-random-walk (e.g., independence) proposals, the optimal acceptance rate can approach 100%
- a normal or Laplace approximation can be used as the proposal for a full conditional distribution
Recommendation:
- random walk: use for a single parameter when evaluating the target is not expensive
- when evaluation is not expensive, it can be applied to the full (joint) distribution
- works well on heavy-tailed targets
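A minimal sketch of the conjugate case, assuming a toy normal model with unknown mean and variance and illustrative normal / inverse-gamma hyperparameters (all names and values are made up, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.5, size=100)           # toy data
n, ybar = len(y), y.mean()

# illustrative hyperparameters: mu ~ N(m0, t0^2), sigma^2 ~ InvGamma(a0, b0)
m0, t02, a0, b0 = 0.0, 100.0, 0.01, 0.01

mu, sig2 = 0.0, 1.0
draws = []
for _ in range(5000):
    # full conditional of mu | sigma^2, y is normal (conjugacy)
    prec = n / sig2 + 1.0 / t02
    mean = (n * ybar / sig2 + m0 / t02) / prec
    mu = rng.normal(mean, np.sqrt(1.0 / prec))
    # full conditional of sigma^2 | mu, y is inverse-gamma (conjugacy)
    a = a0 + n / 2.0
    b = b0 + 0.5 * np.sum((y - mu) ** 2)
    sig2 = 1.0 / rng.gamma(a, 1.0 / b)
    draws.append((mu, sig2))

draws = np.array(draws)[1000:]               # discard burn-in
print(draws.mean(axis=0))
```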
Data Augmentation
What we want is $p(\theta \mid y)$; introduce an auxiliary (latent) variable $z$ and alternate between:
- $p(z \mid \theta, y)$
- $p(\theta \mid z, y)$
- well suited when there is a latent-variable representation (probit models, random effects); see the sketch below
- often used for the multivariate t distribution (as a scale mixture of normals)
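A sketch of data augmentation for probit regression in the Albert-Chib style, assuming a flat prior on $\beta$; the latent $z$ has a truncated-normal full conditional and $\beta \mid z, y$ is normal (data and settings are illustrative):

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(1)
n, p = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.0])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(float)

beta = np.zeros(p)
XtX_inv = np.linalg.inv(X.T @ X)
samples = []
for _ in range(2000):
    # p(z | beta, y): normal truncated at 0 according to y
    m = X @ beta
    lo = np.where(y == 1, -m, -np.inf)       # z > 0 when y = 1
    hi = np.where(y == 1, np.inf, -m)        # z < 0 when y = 0
    z = m + truncnorm.rvs(lo, hi, random_state=rng)
    # p(beta | z, y): normal under a flat prior, centered at the OLS fit of z on X
    bhat = XtX_inv @ X.T @ z
    beta = rng.multivariate_normal(bhat, XtX_inv)
    samples.append(beta.copy())

print(np.mean(samples[500:], axis=0))
```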
Inference
After obtaining samples, the question is how to use them for inference:
- discard the first K draws (burn-in)
- run multiple chains
- draws are autocorrelated; check the lag-k autocorrelation (e.g., near zero by lag 10 is fine)
- block Gibbs or non-random-walk proposals speed mixing by reducing autocorrelation
- thinning works but is statistically inefficient
- batching (batch means) is more efficient (see the sketch after this list)
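A sketch of the batch-means Monte Carlo standard error versus the naive i.i.d. standard error, on a toy AR(1) chain standing in for autocorrelated MCMC output (all settings illustrative):

```python
import numpy as np

def batch_means_se(chain, n_batches=50):
    """Monte Carlo SE via batch means: split the chain into batches and
    use the variability of the batch means to account for autocorrelation."""
    chain = np.asarray(chain)
    batch_size = len(chain) // n_batches
    means = chain[: n_batches * batch_size].reshape(n_batches, batch_size).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(n_batches)

# toy AR(1) chain mimicking an autocorrelated sampler
rng = np.random.default_rng(2)
x = np.zeros(10000)
for t in range(1, len(x)):
    x[t] = 0.9 * x[t - 1] + rng.normal()

print("naive SE :", x.std(ddof=1) / np.sqrt(len(x)))   # ignores autocorrelation
print("batch SE :", batch_means_se(x))                 # larger, more honest
```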
Model Selection
DIC
Drawbacks:
- likelihood based
- not invariant to reparameterization
- no closed-form likelihood in many models (see the computation sketch below)
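As a reminder of the standard computation, a sketch of DIC from posterior draws for a toy normal model, using $DIC = \bar{D} + p_{D}$ with $p_{D} = \bar{D} - D(\bar{\theta})$; it presumes a closed-form likelihood, which is exactly the drawback noted above (draws and model are illustrative stand-ins):

```python
import numpy as np
from scipy.stats import norm

def dic(y, mu_draws, sigma_draws):
    # deviance D(theta) = -2 log p(y | theta), evaluated at each posterior draw
    dev = np.array([-2 * norm.logpdf(y, m, s).sum()
                    for m, s in zip(mu_draws, sigma_draws)])
    dbar = dev.mean()                                   # posterior mean deviance
    dhat = -2 * norm.logpdf(y, mu_draws.mean(), sigma_draws.mean()).sum()
    p_d = dbar - dhat                                   # effective number of parameters
    return dbar + p_d                                   # DIC = Dbar + pD

rng = np.random.default_rng(3)
y = rng.normal(1.0, 2.0, size=50)
# stand-in posterior draws (would come from an MCMC sampler)
mu_draws = rng.normal(y.mean(), 2.0 / np.sqrt(len(y)), size=1000)
sigma_draws = np.full(1000, 2.0)
print(dic(y, mu_draws, sigma_draws))
```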
PPL (Posterior Predictive Loss)
Nonparametric
- $p(y \mid \theta)$
- $\theta \sim G$
- $G \sim DP(G_{0}, \alpha)$
$\alpha$ controls how close the nonparametric $G$ is to $G_{0}$: as $\alpha \to \infty$, $G \to G_{0}$ (stick-breaking sketch below).
- can be used to relax or check the random-effects distribution
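A sketch of a truncated stick-breaking draw from $DP(G_{0}, \alpha)$ with $G_{0} = N(0, 1)$, illustrating that draws concentrate around $G_{0}$ as $\alpha$ grows (truncation level and seed are arbitrary):

```python
import numpy as np

def dp_stick_breaking(alpha, n_atoms=500, rng=None):
    """Truncated stick-breaking construction of G ~ DP(G0, alpha), G0 = N(0, 1)."""
    rng = rng or np.random.default_rng()
    v = rng.beta(1.0, alpha, size=n_atoms)
    w = v * np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])   # stick-breaking weights
    atoms = rng.normal(0.0, 1.0, size=n_atoms)                  # atoms drawn from G0
    return atoms, w

rng = np.random.default_rng(4)
for alpha in (1.0, 10.0, 1000.0):
    atoms, w = dp_stick_breaking(alpha, rng=rng)
    mean_G = np.sum(w * atoms)          # mean of the random measure G
    print(alpha, round(mean_G, 3))      # approaches the G0 mean (0) as alpha grows
```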
Missing Data
- full-data model: $p(y, r \mid \theta, \omega)$
- full-data response model: $p(y \mid x, \theta)$
MAR
- for monotone dropout, MAR is equivalent to $P(U = j \mid Y) = P(U = j \mid \bar{Y}_{j})$, i.e., the dropout hazard depends only on the previously observed responses
- MAR together with ignorability leads to simplification
- since $p(y_{2} \mid y_{1}, r = 0, \theta) = p(y_{2} \mid y_{1}, r = 1, \theta)$, it is easy to impute the missing data and then fit the regression on the completed data
- MTM is not suitable here
- Example 5.9: MAR does not ensure ignorability
MNAR (Nonignorable)
- use unverified parametric assumptions
- use informative priors
Selection Model
$p(y, r) = p(y)\,p(r \mid y)$
Pros:
- $p(y \mid x)$
- $p(r \mid y)$
- with monotone dropout, $p(r \mid y)$ is a hazard function
Cons:
- sensitive to model specification
- identification problems can arise
Example:
$g(\pi_{i}) = \phi_{0} + \phi_{1}y_{1} + \phi_{2}y_{2}$
if $\phi_{2} = 0$, the mechanism reduces to MAR (a simulation sketch follows the cons below)
cons:
- assumptions cannot be verified
- potential sensitivity problems
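A simulation sketch of the selection-model example above, assuming a logit link and an illustrative bivariate outcome: when $\phi_{2} = 0$ a MAR-style estimate (regress $y_{2}$ on $y_{1}$ among observed cases, then average over everyone) recovers the mean of $y_{2}$; with $\phi_{2} \ne 0$ it does not.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100000
y1 = rng.normal(0.0, 1.0, size=n)
y2 = 0.5 * y1 + rng.normal(0.0, 1.0, size=n)

def mar_style_mean(phi0, phi1, phi2):
    # selection model: logit P(r = 1 | y) = phi0 + phi1*y1 + phi2*y2
    eta = phi0 + phi1 * y1 + phi2 * y2
    r = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta))
    # regress y2 on y1 among observed cases, then average predictions over all y1
    b1, b0 = np.polyfit(y1[r], y2[r], 1)
    return np.mean(b0 + b1 * y1)

print("full-data mean of y2        :", y2.mean())
print("MAR-style estimate, phi2 = 0:", mar_style_mean(0.5, 1.0, 0.0))  # ~unbiased
print("MAR-style estimate, phi2 = 2:", mar_style_mean(0.5, 1.0, 2.0))  # biased (MNAR)
```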
Mixture Model
$p(y, r) = p(y \mid r, x, \omega)\,p(r \mid x, \omega)$
cons:
- $p(y \mid r)$ is sensible
$p(y_{2}, y_{1} \mid r, \alpha) = p(y_{2} \mid y_{1}, r, \alpha_{E})\,p(y_{1} \mid r, \alpha_{O})$
In general, $\alpha_{E}$ cannot be fully identified from the data alone. $\alpha_{E}$ and $\alpha_{O}$ may overlap, so assume
$\alpha_{E} = (\alpha_{EI}, \alpha_{NI})$,
where $\alpha_{NI}$ (the nonidentified part) can be used for sensitivity analysis or given an informative prior.
Example: bivariate normal
cons:
- as the dimension of the observations grows, the number of sensitivity parameters $\alpha_{NI}$ grows as well
Shared Parameter Model
Ignorability
- draw from $p(y_{mis} \mid y_{obs}, \theta)$ using data augmentation
- draw from $p(\theta \mid y)$ using the Gibbs sampler
This relies strongly on the specification of $p(y \mid \theta)$; misspecification leads to inconsistency.
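A sketch of this two-step scheme for a bivariate normal with $y_{2}$ MAR, assuming a known covariance and a flat prior on the mean so both full conditionals are available in closed form (purely illustrative, not the book's example):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])   # treated as known to keep the sketch short
mu_true = np.array([1.0, 2.0])
y = rng.multivariate_normal(mu_true, Sigma, size=n)

# make y2 MAR: missingness depends only on the always-observed y1
miss = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-(y[:, 0] - 1.0)))
y2 = np.where(miss, np.nan, y[:, 1])

mu = np.zeros(2)
draws = []
for _ in range(3000):
    # step 1: p(y_mis | y_obs, theta) -- conditional normal given y1
    cond_mean = mu[1] + Sigma[0, 1] / Sigma[0, 0] * (y[:, 0] - mu[0])
    cond_sd = np.sqrt(Sigma[1, 1] - Sigma[0, 1] ** 2 / Sigma[0, 0])
    y2_full = np.where(miss, rng.normal(cond_mean, cond_sd), y2)
    # step 2: p(theta | y) -- posterior of mu given completed data (flat prior)
    ybar = np.array([y[:, 0].mean(), y2_full.mean()])
    mu = rng.multivariate_normal(ybar, Sigma / n)
    draws.append(mu.copy())

print(np.mean(draws[500:], axis=0))   # should be close to mu_true
```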
GARP/IV
- for ignorable missingness, only $p(y \mid \omega)$ needs to be specified
- for nonignorable missingness, specifying the joint $p(y, r \mid \omega)$ is mandatory
NONIGNORABLE
- sensitivity parameter and sensitivity analysis
- informative prior
- sensitivity parameters are not identifiable from the observed data, but once they are fixed, the remaining parameters are identified
Selection Model
parametric selection models are not well suited to sensitivity analysis
Semiparametric Selection Model
- parametric form for $p(r \mid y)$
- non- or semiparametric form for $p(y \mid \omega)$ or $p(y, r \mid \omega)$
- informative prior on $\phi_{2}$
pros:
- $p(r \mid y)$, $p(y \mid \omega)$, and $\beta$ are specified explicitly
cons:
- sensitivity analysis is difficult
Mixture Model
$p(y \mid \omega, x) = \sum_{r} p(y, r \mid \omega, x)$
Identification strategy:
Mixture models are usually used to handle MNAR.
For monotone dropout, MAR is equivalent to ACMV constraints.
Table here.
Interior Family Constraints (IFC)
- complete case missing value (CCMV)
- nearest-neighbour (NN)
- available case missing value (ACMV); for monotone dropout, MAR is equivalent to ACMV
- non future dependence missing value restrictions
- identification via extrapolation: e.g., assume $\Sigma$ is common across patterns; then the model is identified
Mixture Model with discrete time dropout
Pattern Mixture Model with Covariates
- linear link: $\beta = \sum_{s} \phi_{s}\beta^{s}$
- nonlinear link: $\beta(x) = \sum_{s} \frac{\partial}{\partial x} g^{-1}(x\beta^{s})\,\phi_{s}(x)$, which simplifies when $\phi_{s}(x)$ does not depend on $x$
So when dealing with mixture models with covariates, be sure to check whether (a numeric sketch follows this list):
- the mean is linear in the covariates (identity link or similar)
- missingness depends on covariates: $p(r \mid x)$
- covariate effects are time varying
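A small numeric sketch of the mixing formulas above with made-up pattern probabilities and pattern-specific coefficients: under an identity link the marginal coefficient is the $\phi_{s}$-weighted average of the $\beta^{s}$, while under a logit link the marginal covariate effect mixes the pattern-specific derivatives instead.

```python
import numpy as np

# pattern-specific (intercept, slope) and pattern probabilities (illustrative numbers)
beta_s = np.array([[0.0, 1.0],    # pattern 1
                   [1.0, 2.0]])   # pattern 2
phi_s = np.array([0.7, 0.3])

# identity link: the marginal coefficient is the phi-weighted average of coefficients
beta_marginal = phi_s @ beta_s
print("marginal (intercept, slope):", beta_marginal)

# logit link: the marginal covariate effect at x averages the pattern-specific
# derivatives d/dx g^{-1}(x beta^s), not the coefficients themselves
def logistic(t):
    return 1.0 / (1.0 + np.exp(-t))

x = 0.5
deriv = sum(p * b[1] * logistic(b[0] + b[1] * x) * (1 - logistic(b[0] + b[1] * x))
            for p, b in zip(phi_s, beta_s))
print("marginal effect at x = 0.5 under logit link:", deriv)
```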
Shared Parameter Model
$y$ and $r$ are conditionally independent given the shared parameter $b$
- $y \mid x, z \sim N(x\beta + z, \sigma^{2})$
- $h_{u}(t \mid z) = h_{0}(t)\exp(z\gamma)$
if $\gamma = 0$, the mechanism is MAR; otherwise it is MNAR
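A simulation sketch of the shared-parameter structure, assuming a discrete-time version of the dropout hazard (rather than the Cox form above) and illustrative parameter values; the shared random intercept $b$ drives both the responses and dropout, so with $\gamma \ne 0$ the observed-case mean at the last visit is biased.

```python
import numpy as np

rng = np.random.default_rng(7)
n, J = 2000, 4
beta, sigma_b, sigma_e, gamma = 1.0, 1.0, 0.5, 1.5

b = rng.normal(0.0, sigma_b, size=n)                      # shared random intercept
t = np.arange(J)
y = beta * t + b[:, None] + rng.normal(0.0, sigma_e, (n, J))

# discrete-time dropout hazard depends on y only through b (shared parameter)
hazard = 1.0 / (1.0 + np.exp(-(-2.0 + gamma * b)))        # per-visit dropout probability
dropout = rng.uniform(size=(n, J)) < hazard[:, None]
U = np.where(dropout.any(axis=1), dropout.argmax(axis=1), J)  # first dropout visit (J = completer)

observed = np.arange(J)[None, :] < U[:, None]
print("full-data mean at last visit    :", y[:, -1].mean())
print("observed-case mean at last visit:", y[observed[:, -1], -1].mean())  # biased when gamma != 0
```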
pros:
- handles hazards (event-time data) well
- handles complex data structures
- accommodates latent variables
- good at jointly analyzing repeated measures and event times
cons:
- the relation between $y$ and $u$ is not explicit (neither $p(u \mid y)$ nor $p(y \mid u)$ is specified directly)
- $h_{u}$ can depend on future observations
- hard to separate $p(y_{mis} \mid y_{obs}, u)$, i.e., hard to embed MAR
- relies on the assumed distribution of $b$
Model Selection in Nonignorable Models
Model selection is based on $p(y, r \mid \omega)$ rather than $p(y \mid \theta)$; for ignorable missingness it is based on $p(y \mid \theta)$ alone.
- DIC: can be obtained in WinBUGS
- PPL: posterior predictive loss / posterior predictive checks
Ch9: Informative Priors and Sensitivity Analysis
- sensitivity analysis for unverifiable missingness assumptions
- incorporate subjective beliefs
Pattern Mixture Model
$p(y, r \mid \alpha) = p(y \mid r)\,p(r) = p(y_{mis} \mid y_{obs}, r)\,p(y_{obs} \mid r)\,p(r)$
Sensitivity Analysis
- fix the sensitivity parameter at a constant value
- examine inference across a range of values
- assign an appropriate prior
- prefer mixture models to selection models
- frequentist: fix $\delta$ and see how inference changes with $\delta$
- Bayesian: specify a prior on $\delta$ based on beliefs about the missingness mechanism (see the sketch after this list)
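A sketch of both styles for a simple mean under a delta-shift assumption, $E(y \mid r = 0) = E(y \mid r = 1) + \delta$ (data, shift parameterization, and prior are all illustrative, and only the uncertainty in $\delta$ is propagated):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 400
y = rng.normal(2.0, 1.0, size=n)
r = rng.uniform(size=n) < 0.7           # ~70% observed
y_obs = y[r]
p_mis = 1.0 - r.mean()

# delta-shift assumption: mean of the missing responses = observed-case mean + delta
def adjusted_mean(delta):
    return (1 - p_mis) * y_obs.mean() + p_mis * (y_obs.mean() + delta)

# frequentist-style: examine the estimate across a grid of fixed deltas
for delta in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"delta = {delta:+.1f} -> mean = {adjusted_mean(delta):.3f}")

# Bayesian-style: put a prior on delta reflecting beliefs about the mechanism
delta_draws = rng.normal(0.0, 0.5, size=5000)            # illustrative prior
post = np.array([adjusted_mean(d) for d in delta_draws])
print("mean and 95% interval under the delta prior:",
      post.mean().round(3), np.quantile(post, [0.025, 0.975]).round(3))
```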
Examples
Specify Priors
- MAR with no uncertainty
- MAR with uncertainty
- MNAR with no uncertainty
- MNAR with uncertainty
Examples