Clean and annotate the data to prepare it for modeling.
Arguments
- data
A
tibbleor data frame with one row per patient visit. See the arguments below for specific requirements for the data.- outcome
Character string, name of the column in the data with the numeric outcome variable on the continuous scale. Could be a clinical measure of healthy or of disease severity. Baseline is part of the model, so the
outcomeshould not already be a change from baseline. The vector of outcomes may have missing values, either with explicitNAs or with rows in the data missing for one or more visits.- time
Character string, name of the column in the data with the numeric time variable on the continuous scale. This time is the time since enrollment/randomization of each patient. A time value of 0 should indicate baseline.
- patient
Character string, name of the column in the data with the patient ID. This vector could be a numeric, integer, factor, or character vector.
pmrmautomatically converts it into an unordered factor.- visit
Character string, name of the column in the data which indicates the study visit of each row. This column could be a numeric, integer, factor, or character vector. An ordered factor is highly recommended because
pmrmwith levels assumed to be in chronological order. The minimum visit must be baseline.- arm
Character string, name of the column in the data which indicates the study arm of each row. This column could be a numeric, integer, factor, or character vector. An ordered factor is highly recommended because
pmrmautomatically convertsdata[[arm]]into an ordered factor anyway. The minimum level is assumed to be the control arm.- covariates
Partial right-sided formula of concomitant terms in the model for covariate adjustment (e.g. by age, gender, biomarker status, etc.). Usually does not include main variables such as the values of
outcome,time,patient,visit, orarm. (If you do include any of these variables, be sure to check the fitted model for identifiability problems.)The columns in the data referenced in the formula must not have any missing values.
Set
covariatesto~ 0(default) to opt out of covariate adjustment. The intercept term is removed from the model matrixWwhether or not the formula begins with `~ 0.- subset
TRUEifdatais a subset of data without all levels ofvisitorarm,FALSEto expect a full dataset when validating.
Value
A tibble of class "pmrm_data"
with one row per patient visit.
Rows with missing outcomes are removed,
variables arm and visit are converted into ordered factors
(with minimum values at the control arm and baseline, respectively),
and then the rows are sorted by patient and visit.
The "labels" attribute is a named list with the values of arguments
outcome, time, visit, arm, and covariates.
