Here we describe the variance-covariance matrix adjustment of coefficients.
Introduction
To estimate the covariance matrix of coefficients, there are many
ways. In mmrm
package, we implemented asymptotic,
empirical, Jackknife and Kenward-Roger methods. For simplicity, the
following derivation are all for unweighted mmrm. For weighted mmrm, we
can follow the details
of weighted least square estimator.
Asymptotic Covariance
Asymptotic covariance are derived based on the estimate of \(\beta\).
Following the definition in details in model fitting, we have
\[ \hat\beta = (X^\top W X)^{-1} X^\top W Y \]
\[ cov(\hat\beta) = (X^\top W X)^{-1} X^\top W cov(\epsilon) W X (X^\top W X)^{-1} = (X^\top W X)^{-1} \]
Where \(W\) is the block diagonal matrix of inverse of covariance matrix of \(\epsilon\).
Empirical Covariance
Empirical covariance, also known as the robust sandwich estimator, or “CR0”, is derived by replacing the covariance matrix of \(\epsilon\) by observed covariance matrix.
\[ cov(\hat\beta) = (X^\top W X)^{-1}(\sum_{i}{X_i^\top W_i \hat\epsilon_i\hat\epsilon_i^\top W_i X_i})(X^\top W X)^{-1} = (X^\top W X)^{-1}(\sum_{i}{X_i^\top L_{i} L_{i}^\top \hat\epsilon_i\hat\epsilon_i^\top L_{i} L_{i}^\top X_i})(X^\top W X)^{-1} \]
Where \(W_i\) is the block diagonal part for subject \(i\) of \(W\) matrix, \(\hat\epsilon_i\) is the observed residuals for subject i, \(L_i\) is the Cholesky factor of \(\Sigma_i^{-1}\) (\(W_i = L_i L_i^\top\)).
See the detailed explanation of these formulas in the Weighted Least Square Empirical Covariance vignette.
Jackknife Covariance
Jackknife method in mmrm
is the “leave-one-cluster-out”
method. It is also known as “CR3”. Following McCaffrey and Bell (2003), we have
\[ cov(\hat\beta) = (X^\top W X)^{-1}(\sum_{i}{X_i^\top L_{i} (I_{i} - H_{ii})^{-1} L_{i}^\top \hat\epsilon_i\hat\epsilon_i^\top L_{i} (I_{i} - H_{ii})^{-1} L_{i}^\top X_i})(X^\top W X)^{-1} \]
where
\[H_{ii} = X_i(X^\top X)^{-1}X_i^\top\]
Please note that in the paper there is an additional scale parameter \(\frac{n-1}{n}\) where \(n\) is the number of subjects, here we do not include this parameter.
Bias-Reduced Covariance
Bias-reduced method, also known as “CR2”, provides unbiased under correct working model. Following McCaffrey and Bell (2003), we have \[ cov(\hat\beta) = (X^\top W X)^{-1}(\sum_{i}{X_i^\top L_{i} (I_{i} - H_{ii})^{-1/2} L_{i}^\top \hat\epsilon_i\hat\epsilon_i^\top L_{i} (I_{i} - H_{ii})^{-1} L_{i}^\top X_i})(X^\top W X)^{-1} \]
where
\[H_{ii} = X_i(X^\top X)^{-1}X_i^\top\]
Kenward-Roger Covariance
Kenward-Roger covariance is an adjusted covariance matrix for small sample size. Details can be found in Kenward-Roger