R: Multilevel data: Bootstrapping

From MathWiki

To bootstrap multilevel data, one needs to

  1. sample from clusters with replacement
  2. select all cases from each selected cluster
  3. construct a new cluster id that is distinct for each 'copy' of multiply selected clusters
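The three steps above can be sketched in base R on a small toy data set. This is only an illustration under stated assumptions: the data frame 'toy', its column 'y' and the helper column 'orig.id' are hypothetical names that do not appear elsewhere on this page.

```r
# Toy data: 3 clusters of unequal size (hypothetical values, for illustration only)
toy <- data.frame(id = c(1, 1, 2, 2, 2, 3), y = c(5, 6, 7, 8, 9, 10))

set.seed(1)  # only so that this sketch is reproducible
ids <- unique(toy$id)

# Step 1: sample cluster ids with replacement
sel.ids <- sample(ids, length(ids), replace = TRUE)

# Step 2: collect the row indices of all cases in each selected cluster
rows <- lapply(sel.ids, function(x) which(toy$id == x))

# Step 3: give each draw its own id, even when the same cluster is drawn twice
new.id <- rep(seq_along(rows), times = lengths(rows))

boot.toy <- toy[unlist(rows), ]
boot.toy$orig.id <- boot.toy$id   # keep the original id for reference
boot.toy$id <- new.id
```

Keeping 'orig.id' is optional; it simply makes it easy to see which original cluster each resampled copy came from.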

For example, if 'dd' is a data.frame with a numerical cluster id 'dd$id' and 'compute.est(data)' computes the desired estimate from a data.frame and returns it as a numeric vector, then:

  ddid <- data.frame( id = unique( dd$id ) )

  fit.function <- function( data, ind) {
      # 'data' is ddid (one row per cluster); 'ind' holds the row indices resampled by boot()

      sel.ids <- as.character( data[ind,'id'] )
      all.ids <- as.character( dd[,'id'])   # 'dd' is taken from the enclosing environment

      ind2 <- lapply( sel.ids, function(x) which( all.ids %in% x ) )  # list with indices for each selected id

      boot.id <- unlist( lapply( seq_along( ind2 ), function(i) rep( i, length( ind2[[i]] ) ) ) )  # unique id for each 'selection'

      boot.dframe <- dd[ unlist(ind2),]
      boot.dframe$id <- boot.id

      compute.est( boot.dframe )
  }

The bootstrap is then carried out with:

  library( boot )
  boot.obj <- boot( ddid, fit.function, 3999)
  boot.ci( boot.obj, type = 'bca' )
  plot( boot.obj )
  jack.after.boot( boot.obj )
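
To try the code on this page without real data, one possible setup is sketched below. Everything in it is an assumption made for illustration: the simulated 'dd', the response column 'y', and the choice of the mean of cluster means as the estimate. In practice 'compute.est' would typically refit a multilevel model and return the coefficients of interest.

```r
set.seed(42)  # only so that this sketch is reproducible

# Simulated multilevel data: 10 clusters with 3 to 6 cases each (hypothetical)
n.clusters <- 10
sizes <- sample(3:6, n.clusters, replace = TRUE)
dd <- data.frame(id = rep(seq_len(n.clusters), times = sizes))

# Response = cluster-level random effect + case-level noise
dd$y <- rnorm(n.clusters)[dd$id] + rnorm(nrow(dd))

# Hypothetical estimate: the mean of the cluster means.
# It must return a numeric vector so that boot() can store it.
compute.est <- function(data) {
    mean(tapply(data$y, data$id, mean))
}

compute.est(dd)  # estimate on the original data
```

Because 'compute.est' groups by 'id', it relies on the relabelling done in 'fit.function': without distinct ids, two copies of the same cluster would be merged into one group.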