Bayesian multiple imputation

Bayesian multiple imputation (BMI) applies the Bayesian framework to missing data. It requires a parametric model for the complete data and a prior distribution over the unknown model parameters, θ. Then, m independent draws of the missing data are simulated from their conditional distribution given the observed data, which follows from Bayes' theorem. Markov chain Monte Carlo (MCMC) can be used to simulate from the joint posterior distribution of the missing data and θ. The version described here assumes that the data follow a normal distribution when generating imputations for the missing values.
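To make the procedure concrete, here is a minimal sketch in base R for the simplest possible case: a single normally distributed variable with values missing completely at random. The toy data, the noninformative prior p(mu, sigma^2) proportional to 1/sigma^2, and the helper name draw_imputation are all assumptions made for illustration. Because the posterior has a closed form in this case, the sketch samples from it directly rather than using MCMC.

```r
set.seed(1)

# Toy data (illustrative only): 50 values from N(10, 2^2), 10 of them set to NA
y <- rnorm(50, mean = 10, sd = 2)
y[sample(50, 10)] <- NA

y_obs  <- y[!is.na(y)]   # observed part
n_obs  <- length(y_obs)
n_miss <- sum(is.na(y))  # number of values to impute
m      <- 4              # number of imputations

# One imputation = one draw of theta from its posterior,
# followed by one draw of the missing values given that theta.
# Assumed prior: p(mu, sigma^2) proportional to 1 / sigma^2.
draw_imputation <- function() {
  # sigma^2 | y_obs: scaled inverse chi-square with n_obs - 1 degrees of freedom
  sigma2 <- (n_obs - 1) * var(y_obs) / rchisq(1, df = n_obs - 1)
  # mu | sigma^2, y_obs: normal centred on the observed mean
  mu <- rnorm(1, mean = mean(y_obs), sd = sqrt(sigma2 / n_obs))
  # y_miss | mu, sigma^2: normal draws for the missing entries
  rnorm(n_miss, mean = mu, sd = sqrt(sigma2))
}

# Matrix with one row per missing value and one column per imputation
imputations <- replicate(m, draw_imputation())
dim(imputations)
```

Each column is one plausible completion of the missing entries. The columns differ from one another because every imputation uses a fresh draw of the parameters, which is how the uncertainty about θ carries over into the imputed values.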

Let's write the complete data as follows:

Y = (Yobs, Ymiss)

Here, Yobs is the observed part of Y and Ymiss is the missing part.

Let P(Y|θ) be the parametric model for the complete data. Under a normal model, the parameter θ consists of the mean and the covariance matrix that parameterize the distribution. Let P(θ) be the prior on θ.
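With the model and the prior in place, the distributions that the imputations come from can be written out. The following is the standard textbook form, stated here under the assumption that the missingness mechanism is ignorable (the data are missing at random); the subscripted symbols correspond to Yobs and Ymiss above.

```latex
\begin{align*}
P(\theta \mid Y_{obs}) &\propto P(\theta) \int P(Y_{obs}, Y_{miss} \mid \theta)\, dY_{miss} \\
P(Y_{miss} \mid Y_{obs}) &= \int P(Y_{miss} \mid Y_{obs}, \theta)\, P(\theta \mid Y_{obs})\, d\theta
\end{align*}
```

The first line is the posterior of θ given only the observed data. The second is the posterior predictive distribution of the missing data, from which each of the m imputations is drawn.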

Let's use the Amelia package in R to carry this out. First, load the data and drop the columns that will not be imputed:

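library(foreign)  # provides read.spss()
library(Amelia)   # provides amelia()

# Read the SPSS file into a data frame
dataset <- read.spss("World95.sav", to.data.frame = TRUE)

# Exclude the country identifier and the categorical columns
myvars <- names(dataset) %in% c("COUNTRY", "RELIGION", "REGION", "CLIMATE")
newdata <- dataset[!myvars]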

Now, let's run the imputation, asking for m = 4 imputed datasets:

impute.out <- amelia(newdata, m=4)
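To see what amelia() returns without needing World95.sav, here is a small sketch on simulated data. The data frame toy and its columns x1, x2, and x3 are hypothetical stand-ins for newdata, and the values are illustrative only. The p2s = 0 argument simply silences the console output.

```r
library(Amelia)

set.seed(42)

# Hypothetical stand-in for newdata: three correlated numeric columns
n  <- 200
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n, sd = 0.8)
x3 <- -0.4 * x1 + 0.5 * x2 + rnorm(n, sd = 0.7)
toy <- data.frame(x1, x2, x3)

# Set 30 values of x2 and 30 values of x3 to missing at random positions
toy$x2[sample(n, 30)] <- NA
toy$x3[sample(n, 30)] <- NA

# Same call as above, with screen output suppressed
toy.out <- amelia(toy, m = 4, p2s = 0)

# The completed datasets are stored as a list, one data frame per imputation
length(toy.out$imputations)

# Each completed dataset has the same shape as the input, with the NAs filled in
completed <- toy.out$imputations[[1]]
dim(completed)
sum(is.na(completed))
```

The same access pattern should apply to the real run: impute.out$imputations[[1]] through impute.out$imputations[[4]] hold the four completed versions of newdata, and summary(impute.out) reports the missingness in each variable.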