
Here we describe the variance-covariance matrix adjustments for the coefficient estimates.

Introduction

There are many ways to estimate the covariance matrix of the coefficients. In the mmrm package, we have implemented the asymptotic, empirical, Jackknife, bias-reduced, and Kenward-Roger methods. For simplicity, the following derivations are all for the unweighted mmrm; for the weighted mmrm, we can follow the details of the weighted least squares estimator.
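
In practice, the covariance method is selected when fitting the model. A minimal sketch using the `fev_data` example shipped with the package, assuming the `vcov` argument accepts the method names listed in the package documentation (e.g. `"Empirical"`, `"Empirical-Jackknife"`, `"Empirical-Bias-Reduced"`, `"Kenward-Roger"`):

```r
library(mmrm)

# Fit an MMRM with unstructured covariance and request the empirical
# (sandwich) adjustment for the coefficient covariance matrix.
fit <- mmrm(
  FEV1 ~ RACE + SEX + ARMCD * AVISIT + us(AVISIT | USUBJID),
  data = fev_data,
  vcov = "Empirical"
)

# Extract the adjusted variance-covariance matrix of the coefficients.
vcov(fit)
```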

Asymptotic Covariance

The asymptotic covariance is derived based on the estimate of $\beta$.

Following the definitions in the model fitting details, we have

$$\hat\beta = (X^\top W X)^{-1} X^\top W Y$$

$$cov(\hat\beta) = (X^\top W X)^{-1} X^\top W cov(\epsilon) W X (X^\top W X)^{-1} = (X^\top W X)^{-1}$$

where $W$ is the block-diagonal matrix whose diagonal blocks are the inverses of the subject-level covariance matrices of $\epsilon$.
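
To make the algebra concrete, here is a minimal base R sketch on simulated data (all object names are illustrative and not part of the mmrm API) that accumulates $X^\top W X$ over subject blocks and computes $\hat\beta$ and the asymptotic covariance:

```r
set.seed(42)
n <- 20                                     # number of subjects
m <- 4                                      # visits per subject
Sigma <- 0.5^abs(outer(1:m, 1:m, "-"))      # working covariance per subject
W_i <- solve(Sigma)                         # per-subject block of W
beta <- c(1, 2)                             # true coefficients

X_list <- replicate(n, cbind(1, rnorm(m)), simplify = FALSE)
Y_list <- lapply(X_list, function(Xi)
  drop(Xi %*% beta) + drop(t(chol(Sigma)) %*% rnorm(m)))

# Accumulate X^T W X and X^T W Y over the subject blocks
XtWX <- Reduce(`+`, lapply(X_list, function(Xi) t(Xi) %*% W_i %*% Xi))
XtWY <- Reduce(`+`, Map(function(Xi, Yi) t(Xi) %*% W_i %*% Yi, X_list, Y_list))

beta_hat <- solve(XtWX, XtWY)   # GLS estimate of beta
cov_asym <- solve(XtWX)         # asymptotic covariance of beta_hat
```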

Empirical Covariance

The empirical covariance, also known as the robust sandwich estimator or “CR0”, is derived by replacing the covariance matrix of $\epsilon$ with the observed covariance matrix of the residuals.

$$cov(\hat\beta) = (X^\top W X)^{-1}\left(\sum_{i}{X_i^\top W_i \hat\epsilon_i \hat\epsilon_i^\top W_i X_i}\right)(X^\top W X)^{-1} = (X^\top W X)^{-1}\left(\sum_{i}{X_i^\top L_i L_i^\top \hat\epsilon_i \hat\epsilon_i^\top L_i L_i^\top X_i}\right)(X^\top W X)^{-1}$$

where $W_i$ is the diagonal block of $W$ corresponding to subject $i$, $\hat\epsilon_i$ is the vector of observed residuals for subject $i$, and $L_i$ is the Cholesky factor of $\Sigma_i^{-1}$ ($W_i = L_i L_i^\top$).

See the detailed explanation of these formulas in the Weighted Least Square Empirical Covariance vignette.
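
Reusing the objects from the asymptotic sketch above, CR0 amounts to summing the outer products of the per-subject score contributions $X_i^\top W_i \hat\epsilon_i$ and sandwiching them between $(X^\top W X)^{-1}$:

```r
# Observed residuals per subject at the GLS estimate
resid_list <- Map(function(Xi, Yi) Yi - drop(Xi %*% beta_hat), X_list, Y_list)

# Middle ("meat") term: sum_i (X_i^T W_i e_i)(X_i^T W_i e_i)^T
meat_cr0 <- Reduce(`+`, Map(function(Xi, ei) {
  s <- t(Xi) %*% W_i %*% ei
  tcrossprod(s)
}, X_list, resid_list))

cov_cr0 <- cov_asym %*% meat_cr0 %*% cov_asym   # sandwich estimator
```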

Jackknife Covariance

The Jackknife method in mmrm is the “leave-one-cluster-out” method, also known as “CR3”. Following McCaffrey and Bell (2003), we have

$$cov(\hat\beta) = (X^\top W X)^{-1}\left(\sum_{i}{X_i^\top L_i (I_i - H_{ii})^{-1} L_i^\top \hat\epsilon_i \hat\epsilon_i^\top L_i (I_i - H_{ii})^{-1} L_i^\top X_i}\right)(X^\top W X)^{-1}$$

where

$$H_{ii} = X_i (X^\top X)^{-1} X_i^\top$$

Please note that in the paper there is an additional scale parameter $\frac{n-1}{n}$, where $n$ is the number of subjects; here we do not include this parameter.
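
Continuing the sketch, a literal implementation of the CR3 formula follows. We assume here that the $X_i$ entering $H_{ii}$ is the Cholesky-scaled design $L_i^\top X_i$ (so that the leverage matches McCaffrey and Bell's transformed regression, where $\tilde X^\top \tilde X = X^\top W X$); this is our reading, not something stated explicitly above:

```r
L <- t(chol(W_i))                                   # lower triangular, W_i = L %*% t(L)
Xs_list <- lapply(X_list, function(Xi) t(L) %*% Xi) # scaled design L_i^T X_i
XtWX_inv <- solve(XtWX)                             # equals the inverse crossprod of the scaled design

meat_cr3 <- Reduce(`+`, Map(function(Xs, ei) {
  H_ii <- Xs %*% XtWX_inv %*% t(Xs)      # leverage block for subject i
  A <- solve(diag(m) - H_ii)             # (I - H_ii)^{-1}
  s <- t(Xs) %*% A %*% (t(L) %*% ei)     # X_i^T L_i (I - H_ii)^{-1} L_i^T e_i
  tcrossprod(s)
}, Xs_list, resid_list))

cov_cr3 <- cov_asym %*% meat_cr3 %*% cov_asym
```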

Bias-Reduced Covariance

The bias-reduced method, also known as “CR2”, provides an unbiased covariance estimate under the correct working model. Following McCaffrey and Bell (2003), we have

$$cov(\hat\beta) = (X^\top W X)^{-1}\left(\sum_{i}{X_i^\top L_i (I_i - H_{ii})^{-1/2} L_i^\top \hat\epsilon_i \hat\epsilon_i^\top L_i (I_i - H_{ii})^{-1/2} L_i^\top X_i}\right)(X^\top W X)^{-1}$$

where

$$H_{ii} = X_i (X^\top X)^{-1} X_i^\top$$
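
The CR2 sketch only swaps the adjustment for a symmetric inverse square root, computed here via an eigendecomposition (`inv_sqrtm` is our own illustrative helper, not an mmrm function):

```r
# Inverse matrix square root of a symmetric positive definite matrix
inv_sqrtm <- function(M) {
  e <- eigen(M, symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
}

meat_cr2 <- Reduce(`+`, Map(function(Xs, ei) {
  H_ii <- Xs %*% XtWX_inv %*% t(Xs)      # leverage block for subject i
  A <- inv_sqrtm(diag(m) - H_ii)         # (I - H_ii)^{-1/2}
  s <- t(Xs) %*% A %*% (t(L) %*% ei)     # X_i^T L_i (I - H_ii)^{-1/2} L_i^T e_i
  tcrossprod(s)
}, Xs_list, resid_list))

cov_cr2 <- cov_asym %*% meat_cr2 %*% cov_asym
```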

Kenward-Roger Covariance

The Kenward-Roger covariance is an adjusted covariance matrix for small sample sizes. Details can be found in the Kenward-Roger vignette.

McCaffrey, Daniel F, and Robert M Bell. 2003. “Bias Reduction in Standard Errors for Linear Regression with Multi-Stage Samples.” Quality Control and Applied Statistics 48 (6): 677–82.