S3 method for creating a table of summary statistics. The resulting statistics can be presented in tables such as a "table one" or a baseline and demography table.
The statistics computed depend on the variable type: continuous, binary, categorical, etc.
By default, the following summary statistics are calculated:
Numeric variables: mean, minimum, 25th percentile, median, 75th percentile, maximum, standard deviation
Factor variables: proportion of each factor level in the overall dataset
Default (any other variable type): number of unique values and number of missing values
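To make the per-type statistics above concrete, here is a minimal base R sketch that computes the same quantities on toy vectors. It is not visR's own implementation: the vectors are made up for illustration, and the handling of missing values (dropped before computing, and not counted as a unique value) is an assumption.

```r
# Toy numeric vector with one missing value (illustrative data only).
x_num <- c(38.9, 50.2, 56.8, 62.4, 74.5, NA)

# Numeric variables: mean, min, quartiles, median, max, standard deviation.
# Assumption: missing values are dropped before computing each statistic.
num_stats <- c(
  mean   = mean(x_num, na.rm = TRUE),
  min    = min(x_num, na.rm = TRUE),
  q25    = unname(stats::quantile(x_num, 0.25, na.rm = TRUE)),
  median = stats::median(x_num, na.rm = TRUE),
  q75    = unname(stats::quantile(x_num, 0.75, na.rm = TRUE)),
  max    = max(x_num, na.rm = TRUE),
  sd     = stats::sd(x_num, na.rm = TRUE)
)

# Factor variables: proportion of each level.
x_fct <- factor(c("A", "B", "A", "A"))
fct_props <- prop.table(table(x_fct))

# Any other type: number of unique values and number of missing values.
# Assumption: NA is not counted as a unique value.
x_other <- c("u", "v", "u", NA)
other_stats <- c(
  unique_values  = length(unique(x_other[!is.na(x_other)])),
  missing_values = sum(is.na(x_other))
)
```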
Usage
get_tableone(
  data,
  strata = NULL,
  overall = TRUE,
  summary_function = summarize_short
)
# S3 method for default
get_tableone(
  data,
  strata = NULL,
  overall = TRUE,
  summary_function = summarize_short
)
Arguments
- data
The dataset to summarize, as a data frame or tibble
- strata
Name(s) of the stratifying/grouping variable(s), as a character vector. If NULL, only overall results are returned
- overall
If TRUE, the summary statistics for the overall dataset are also calculated
- summary_function
A function defining summary statistics for numeric and categorical values
Details
It is possible to provide your own summary function. Please have a look at summary for inspiration.
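As a rough sketch of what a custom summary function could look like: the name summarize_minimal is hypothetical, and the return shape (a named list of formatted strings, wrapped in an outer list so that it fits in a single summary cell) is an assumption modeled on the statistic labels shown in the Examples output, not a documented contract. Check the source of summarize_short before relying on it.

```r
# Hypothetical custom summary function; name and return shape are assumptions.
summarize_minimal <- function(x) {
  if (is.numeric(x)) {
    # Numeric columns: mean (SD) and the count/percentage of missing values.
    out <- list(
      "Mean (SD)" = sprintf(
        "%.1f (%.1f)",
        mean(x, na.rm = TRUE),
        stats::sd(x, na.rm = TRUE)
      ),
      "Missing" = sprintf("%d (%.1f%%)", sum(is.na(x)), 100 * mean(is.na(x)))
    )
  } else {
    # Everything else is treated as categorical: count and percentage per level.
    x <- as.factor(x)
    counts <- table(x, useNA = "no")
    out <- as.list(
      sprintf(
        "%d (%.1f%%)",
        as.integer(counts),
        100 * as.integer(counts) / length(x)
      )
    )
    names(out) <- names(counts)
  }
  # Assumed convention: wrap in an outer list so the result is one list cell.
  list(out)
}
```

If the assumed shape matches what get_tableone() expects, the function would be supplied through the summary_function argument, for example get_tableone(data, summary_function = summarize_minimal).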
Note
All columns in the dataset are summarized. To summarize only some columns, select them with dplyr::select() before creating the summary table.
Examples
# Example using the ovarian data set
survival::ovarian %>%
  dplyr::select(-fustat) %>%
  dplyr::mutate(
    age_group = factor(
      dplyr::case_when(
        age <= 50 ~ "<= 50 years",
        age <= 60 ~ "<= 60 years",
        age <= 70 ~ "<= 70 years",
        TRUE ~ "> 70 years"
      )
    ),
    rx = factor(rx),
    ecog.ps = factor(ecog.ps)
  ) %>%
  dplyr::select(age, age_group, everything()) %>%
  visR::get_tableone()
#> Warning: There was 1 warning in `summarise()`.
#> ℹ In argument: `age_group = (function (x) ...`.
#> ℹ In group 1: `all = "Total"`.
#> Caused by warning:
#> ! `fct_explicit_na()` was deprecated in forcats 1.0.0.
#> ℹ Please use `fct_na_value_to_level()` instead.
#> ℹ The deprecated feature was likely used in the visR package.
#> Please report the issue at <https://github.com/openpharma/visR/issues>.
#> # A tibble: 21 × 3
#>    variable  statistic    Total
#>    <chr>     <chr>        <chr>
#>  1 Sample    N            26
#>  2 age       Mean (SD)    56.2 (10.1)
#>  3 age       Median (IQR) 56.8 (50.2-62.4)
#>  4 age       Min-max      38.9-74.5
#>  5 age       Missing      0 (0%)
#>  6 age_group <= 50 years  6 (23.1%)
#>  7 age_group <= 60 years  13 (50.0%)
#>  8 age_group <= 70 years  4 (15.4%)
#>  9 age_group > 70 years   3 (11.5%)
#> 10 futime    Mean (SD)    600 (340)
#> # ℹ 11 more rows
# Examples using ADaM data
# display patients in an analysis set
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(TRTA) %>%
  visR::get_tableone()
#> # A tibble: 4 × 3
#>   variable statistic            Total
#>   <chr>    <chr>                <chr>
#> 1 Sample   N                    254
#> 2 TRTA     Placebo              86 (33.9%)
#> 3 TRTA     Xanomeline High Dose 84 (33.1%)
#> 4 TRTA     Xanomeline Low Dose  84 (33.1%)
## display overall summaries for demog
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(AGE, AGEGR1, SEX, RACE) %>%
  visR::get_tableone()
#> # A tibble: 13 × 3
#>    variable statistic                        Total
#>    <chr>    <chr>                            <chr>
#>  1 Sample   N                                254
#>  2 AGE      Mean (SD)                        75.1 (8.25)
#>  3 AGE      Median (IQR)                     77 (70-81)
#>  4 AGE      Min-max                          51-89
#>  5 AGE      Missing                          0 (0%)
#>  6 AGEGR1   <65                              33 (13.0%)
#>  7 AGEGR1   >80                              77 (30.3%)
#>  8 AGEGR1   65-80                            144 (56.7%)
#>  9 SEX      F                                143 (56.3%)
#> 10 SEX      M                                111 (43.7%)
#> 11 RACE     AMERICAN INDIAN OR ALASKA NATIVE 1 (0.394%)
#> 12 RACE     BLACK OR AFRICAN AMERICAN        23 (9.055%)
#> 13 RACE     WHITE                            230 (90.551%)
## By actual treatment
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(AGE, AGEGR1, SEX, RACE, TRTA) %>%
  visR::get_tableone(strata = "TRTA")
#> # A tibble: 13 × 6
#>    variable statistic Total Placebo `Xanomeline High Dose` `Xanomeline Low Dose`
#>    <chr>    <chr>     <chr> <chr>   <chr>                  <chr>
#>  1 Sample   N         254   86      84                     84
#>  2 AGE      Mean (SD) 75.1… 75.2 (… 74.4 (7.89)            75.7 (8.29)
#>  3 AGE      Median (… 77 (… 76 (69… 76 (70.8-80)           77.5 (71-82)
#>  4 AGE      Min-max   51-89 52-89   56-88                  51-88
#>  5 AGE      Missing   0 (0… 0 (0%)  0 (0%)                 0 (0%)
#>  6 AGEGR1   <65       33 (… 14 (16… 11 (13.1%)             8 (9.52%)
#>  7 AGEGR1   >80       77 (… 30 (34… 18 (21.4%)             29 (34.52%)
#>  8 AGEGR1   65-80     144 … 42 (48… 55 (65.5%)             47 (55.95%)
#>  9 SEX      F         143 … 53 (61… 40 (47.6%)             50 (59.5%)
#> 10 SEX      M         111 … 33 (38… 44 (52.4%)             34 (40.5%)
#> 11 RACE     AMERICAN… 1 (0… NA      1 (1.19%)              NA
#> 12 RACE     BLACK OR… 23 (… 8 (9.3… 9 (10.71%)             6 (7.14%)
#> 13 RACE     WHITE     230 … 78 (90… 74 (88.10%)            78 (92.86%)
## By actual treatment, without overall
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(AGE, AGEGR1, SEX, EVNTDESC, TRTA) %>%
  visR::get_tableone(strata = "TRTA", overall = FALSE)
#> # A tibble: 12 × 5
#>    variable statistic       Placebo `Xanomeline High Dose` `Xanomeline Low Dose`
#>    <chr>    <chr>           <chr>   <chr>                  <chr>
#>  1 Sample   N               86      84                     84
#>  2 AGE      Mean (SD)       75.2 (… 74.4 (7.89)            75.7 (8.29)
#>  3 AGE      Median (IQR)    76 (69… 76 (70.8-80)           77.5 (71-82)
#>  4 AGE      Min-max         52-89   56-88                  51-88
#>  5 AGE      Missing         0 (0%)  0 (0%)                 0 (0%)
#>  6 AGEGR1   <65             14 (16… 11 (13.1%)             8 (9.52%)
#>  7 AGEGR1   >80             30 (34… 18 (21.4%)             29 (34.52%)
#>  8 AGEGR1   65-80           42 (48… 55 (65.5%)             47 (55.95%)
#>  9 SEX      F               53 (61… 40 (47.6%)             50 (59.5%)
#> 10 SEX      M               33 (38… 44 (52.4%)             34 (40.5%)
#> 11 EVNTDESC Dematologic Ev… 29 (33… 61 (72.6%)             62 (73.8%)
#> 12 EVNTDESC Study Completi… 57 (66… 23 (27.4%)             22 (26.2%)