Evenly distributes a given number of patients across a given number of sites, then simulates the AE development of each patient, reducing the number of reported AEs for patients assigned to AE-under-reporting sites.
Usage
sim_test_data_study(
n_pat = 1000,
n_sites = 20,
frac_site_with_ur = 0,
ur_rate = 0,
max_visit_mean = 20,
max_visit_sd = 4,
ae_per_visit_mean = 0.5,
ae_rates = NULL
)
Arguments
- n_pat
integer, number of patients, Default: 1000
- n_sites
integer, number of sites, Default: 20
- frac_site_with_ur
fraction of AE under-reporting sites, Default: 0
- ur_rate
AE under-reporting rate, lowers the mean AE per visit used to simulate patients at sites flagged as AE-under-reporting; negative values simulate over-reporting, Default: 0
- max_visit_mean
mean of the maximum number of visits of each patient, Default: 20
- max_visit_sd
standard deviation of maximum number of visits of each patient, Default: 4
- ae_per_visit_mean
mean AE per visit per patient, Default: 0.5
- ae_rates
vector with visit-specific AE rates, Default: NULL
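To illustrate how n_pat, n_sites and frac_site_with_ur might interact, here is a minimal base R sketch. It is not the package implementation: assign_sites_sketch() is a hypothetical helper, and both the contiguous block assignment and the choice to flag the lowest-numbered sites are assumptions (the latter is at least consistent with the example below, where S0001 is the flagged site).

```r
# Hypothetical helper: a sketch under stated assumptions, not simaerep code.
assign_sites_sketch <- function(n_pat = 1000,
                                n_sites = 20,
                                frac_site_with_ur = 0) {
  pat_idx <- seq_len(n_pat)

  # Assumption: patients fill sites in contiguous blocks of near-equal size.
  site_idx <- as.integer(ceiling(pat_idx * n_sites / n_pat))

  data.frame(
    patnum = sprintf("P%06d", pat_idx),
    site_number = sprintf("S%04d", site_idx),
    # Assumption: the lowest-numbered sites are the under-reporting ones.
    is_ur = site_idx <= n_sites * frac_site_with_ur,
    stringsAsFactors = FALSE
  )
}

df_pat <- assign_sites_sketch(n_pat = 100, n_sites = 5, frac_site_with_ur = 0.2)

# Patients per site, split by under-reporting flag
table(df_pat$site_number, df_pat$is_ur)
```

With 100 patients and 5 sites, each site in this sketch receives 20 patients, and a fraction of 0.2 flags exactly one site.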
Value
tibble with columns patnum, site_number, is_ur, max_visit_mean, max_visit_sd, ae_per_visit_mean, visit, n_ae
Details
The maximum visit number of each patient is sampled from a normal distribution parameterized by max_visit_mean and max_visit_sd, while the number of AEs per visit is sampled from a Poisson distribution with mean ae_per_visit_mean.
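The sampling described above can be sketched for a single patient in base R. This is an illustration only, not the package source: sim_patient_sketch() is a hypothetical helper, and three details are assumptions: the normal draw is rounded and floored at one visit, under-reporting scales the Poisson mean by (1 - ur_rate), and n_ae accumulates over visits. The scaling assumption agrees with the second example below, where ur_rate = 0.5 turns a mean of 0.5 into 0.25.

```r
# Hypothetical helper: a sketch under stated assumptions, not simaerep code.
sim_patient_sketch <- function(max_visit_mean = 20,
                               max_visit_sd = 4,
                               ae_per_visit_mean = 0.5,
                               ur_rate = 0) {
  # Assumption: under-reporting scales the Poisson mean by (1 - ur_rate).
  ae_mean <- ae_per_visit_mean * (1 - ur_rate)

  # Assumption: the normal draw is rounded and never drops below one visit.
  max_visit <- max(1L, as.integer(round(rnorm(1, max_visit_mean, max_visit_sd))))

  # Newly reported AEs at each visit
  new_ae <- rpois(max_visit, ae_mean)

  data.frame(
    visit = seq_len(max_visit),
    ae_per_visit_mean = ae_mean,
    # Assumption: n_ae is the running total of AEs up to each visit.
    n_ae = cumsum(new_ae)
  )
}

set.seed(1)
df_pat_visits <- sim_patient_sketch(ur_rate = 0.5)
head(df_pat_visits)
```

Because the visit count and the AE counts are random draws, the result depends on the seed; only the structure (one row per visit, non-decreasing n_ae) is fixed.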
Examples
set.seed(1)
df_visit <- sim_test_data_study(n_pat = 100, n_sites = 5)
df_visit[which(df_visit$patnum == "P000001"),]
#> # A tibble: 17 × 8
#> patnum site_number is_ur max_visit_mean max_visit_sd ae_per_visit_mean visit
#> <chr> <chr> <lgl> <dbl> <dbl> <dbl> <int>
#> 1 P000001 S0001 FALSE 20 4 0.5 1
#> 2 P000001 S0001 FALSE 20 4 0.5 2
#> 3 P000001 S0001 FALSE 20 4 0.5 3
#> 4 P000001 S0001 FALSE 20 4 0.5 4
#> 5 P000001 S0001 FALSE 20 4 0.5 5
#> 6 P000001 S0001 FALSE 20 4 0.5 6
#> 7 P000001 S0001 FALSE 20 4 0.5 7
#> 8 P000001 S0001 FALSE 20 4 0.5 8
#> 9 P000001 S0001 FALSE 20 4 0.5 9
#> 10 P000001 S0001 FALSE 20 4 0.5 10
#> 11 P000001 S0001 FALSE 20 4 0.5 11
#> 12 P000001 S0001 FALSE 20 4 0.5 12
#> 13 P000001 S0001 FALSE 20 4 0.5 13
#> 14 P000001 S0001 FALSE 20 4 0.5 14
#> 15 P000001 S0001 FALSE 20 4 0.5 15
#> 16 P000001 S0001 FALSE 20 4 0.5 16
#> 17 P000001 S0001 FALSE 20 4 0.5 17
#> # ℹ 1 more variable: n_ae <int>
df_visit <- sim_test_data_study(n_pat = 100, n_sites = 5,
frac_site_with_ur = 0.2, ur_rate = 0.5)
df_visit[which(df_visit$patnum == "P000001"),]
#> # A tibble: 23 × 8
#> patnum site_number is_ur max_visit_mean max_visit_sd ae_per_visit_mean visit
#> <chr> <chr> <lgl> <dbl> <dbl> <dbl> <int>
#> 1 P000001 S0001 TRUE 20 4 0.25 1
#> 2 P000001 S0001 TRUE 20 4 0.25 2
#> 3 P000001 S0001 TRUE 20 4 0.25 3
#> 4 P000001 S0001 TRUE 20 4 0.25 4
#> 5 P000001 S0001 TRUE 20 4 0.25 5
#> 6 P000001 S0001 TRUE 20 4 0.25 6
#> 7 P000001 S0001 TRUE 20 4 0.25 7
#> 8 P000001 S0001 TRUE 20 4 0.25 8
#> 9 P000001 S0001 TRUE 20 4 0.25 9
#> 10 P000001 S0001 TRUE 20 4 0.25 10
#> # ℹ 13 more rows
#> # ℹ 1 more variable: n_ae <int>
ae_rates <- c(0.7, rep(0.5, 8), rep(0.3, 5))
sim_test_data_study(n_pat = 100, n_sites = 5, ae_rates = ae_rates)
#> # A tibble: 1,968 × 8
#> patnum site_number is_ur max_visit_mean max_visit_sd ae_per_visit_mean visit
#> <chr> <chr> <lgl> <dbl> <dbl> <dbl> <int>
#> 1 P000001 S0001 FALSE 20 4 0.443 1
#> 2 P000001 S0001 FALSE 20 4 0.443 2
#> 3 P000001 S0001 FALSE 20 4 0.443 3
#> 4 P000001 S0001 FALSE 20 4 0.443 4
#> 5 P000001 S0001 FALSE 20 4 0.443 5
#> 6 P000001 S0001 FALSE 20 4 0.443 6
#> 7 P000001 S0001 FALSE 20 4 0.443 7
#> 8 P000001 S0001 FALSE 20 4 0.443 8
#> 9 P000001 S0001 FALSE 20 4 0.443 9
#> 10 P000001 S0001 FALSE 20 4 0.443 10
#> # ℹ 1,958 more rows
#> # ℹ 1 more variable: n_ae <int>
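In the last example the ae_per_visit_mean column shows 0.443 rather than the default 0.5. That value coincides with the arithmetic mean of the supplied ae_rates vector, which can be checked in base R. Whether the function derives the column this way internally is an assumption based on this match.

```r
# Same vector as in the example above: 14 visit-specific rates
ae_rates <- c(0.7, rep(0.5, 8), rep(0.3, 5))

length(ae_rates)
#> [1] 14

# Arithmetic mean of the rates, rounded as in the tibble printout
round(mean(ae_rates), 3)
#> [1] 0.443
```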