
[Questioning] S3 method for creating a table of summary statistics. The summary statistics are suitable for presentation in tables such as a "table one" or baseline and demographic tables.

The summary statistics computed depend on the variable type: continuous, binary, categorical, etc.

By default, the following summary statistics are calculated:

  • Numeric variables: mean, minimum, 25th percentile, median, 75th percentile, maximum, standard deviation

  • Factor variables: proportion of each factor level in the overall dataset

  • Default: number of unique values and number of missing values
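
As a rough illustration of the per-type statistics listed above, the following base R sketch computes them for a single column. The helper `describe_column` is a hypothetical name introduced here and is not part of visR; the package's own output formatting differs, and excluding missing values from the factor proportions is an assumption of this sketch.

```r
# Base R sketch of the per-type statistics described above.
# `describe_column` is a hypothetical helper, NOT a visR function.
describe_column <- function(x) {
  if (is.numeric(x)) {
    # Quartiles in one call; names dropped so elements can be picked by position
    q <- stats::quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
    list(
      mean = mean(x, na.rm = TRUE),
      min = min(x, na.rm = TRUE),
      q25 = q[1],
      median = q[2],
      q75 = q[3],
      max = max(x, na.rm = TRUE),
      sd = stats::sd(x, na.rm = TRUE)
    )
  } else if (is.factor(x)) {
    # Proportion of each level (assumption: missing values are excluded)
    vapply(levels(x), function(l) mean(x == l, na.rm = TRUE), numeric(1))
  } else {
    # Fallback for other types; note that unique() counts NA as a value
    list(
      unique_values = length(unique(x)),
      missing_values = sum(is.na(x))
    )
  }
}

describe_column(c(1, 2, 3, 4, 100))
describe_column(factor(c("a", "b", "b", "b")))
describe_column(c("x", "y", NA))
```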

Usage

get_tableone(
  data,
  strata = NULL,
  overall = TRUE,
  summary_function = summarize_short
)

# S3 method for default
get_tableone(
  data,
  strata = NULL,
  overall = TRUE,
  summary_function = summarize_short
)

Arguments

data

The dataset to summarize, as a data frame or tibble.

strata

Name(s) of the stratifying/grouping variable(s), as a character vector. If NULL, only overall results are returned.

overall

If TRUE, the summary statistics for the overall dataset are also calculated.

summary_function

A function defining the summary statistics for numeric and categorical values.

Value

An object of class tableone, that is, a list of data-specific summaries for all input variables.

Details

It is possible to provide your own summary function. Please have a look at summary for inspiration.
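
A minimal sketch of what such a function might look like is given below. The name `summarize_range` is hypothetical, and the return shape (a named list of formatted statistics wrapped in an outer list) is an assumption about what visR's built-in summary functions return; please check their source before relying on it.

```r
# Hypothetical custom summary function; `summarize_range` is an illustrative
# name and NOT part of visR. It is written as an S3 generic so that the
# statistics can differ by column type.
summarize_range <- function(x) UseMethod("summarize_range")

# Numeric columns: range and number of missing values, formatted as text
summarize_range.numeric <- function(x) {
  stats <- list(
    `Min-max` = paste0(min(x, na.rm = TRUE), "-", max(x, na.rm = TRUE)),
    Missing = as.character(sum(is.na(x)))
  )
  # ASSUMPTION: the outer list() mirrors the return shape of visR's built-ins
  list(stats)
}

# All other column types: number of unique and missing values
summarize_range.default <- function(x) {
  stats <- list(
    `Unique values` = as.character(length(unique(x))),
    Missing = as.character(sum(is.na(x)))
  )
  list(stats)
}

# Standalone call on a numeric vector, no visR required
summarize_range(c(3.5, 1, NA, 8))
```

If the assumed return shape holds, the function could then be supplied as `summary_function = summarize_range` in the call to `get_tableone()`.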

Note

All columns in the table will be summarized. To summarize only some columns, select those variables with dplyr::select() before creating the summary table.

Examples


# Example using the ovarian data set

survival::ovarian %>%
  dplyr::select(-fustat) %>%
  dplyr::mutate(
    age_group = factor(
      dplyr::case_when(
        age <= 50 ~ "<= 50 years",
        age <= 60 ~ "<= 60 years",
        age <= 70 ~ "<= 70 years",
        TRUE ~ "> 70 years"
      )
    ),
    rx = factor(rx),
    ecog.ps = factor(ecog.ps)
  ) %>%
  dplyr::select(age, age_group, everything()) %>%
  visR::get_tableone()
#> Warning: There was 1 warning in `summarise()`.
#>  In argument: `age_group = (function (x) ...`.
#>  In group 1: `all = "Total"`.
#> Caused by warning:
#> ! `fct_explicit_na()` was deprecated in forcats 1.0.0.
#>  Please use `fct_na_value_to_level()` instead.
#>  The deprecated feature was likely used in the visR package.
#>   Please report the issue at <https://github.com/openpharma/visR/issues>.
#> # A tibble: 21 × 3
#>    variable  statistic    Total           
#>    <chr>     <chr>        <chr>           
#>  1 Sample    N            26              
#>  2 age       Mean (SD)    56.2 (10.1)     
#>  3 age       Median (IQR) 56.8 (50.2-62.4)
#>  4 age       Min-max      38.9-74.5       
#>  5 age       Missing      0 (0%)          
#>  6 age_group <= 50 years  6 (23.1%)       
#>  7 age_group <= 60 years  13 (50.0%)      
#>  8 age_group <= 70 years  4 (15.4%)       
#>  9 age_group > 70 years   3 (11.5%)       
#> 10 futime    Mean (SD)    600 (340)       
#> # ℹ 11 more rows

# Examples using ADaM data

# display patients in an analysis set
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(TRTA) %>%
  visR::get_tableone()
#> # A tibble: 4 × 3
#>   variable statistic            Total     
#>   <chr>    <chr>                <chr>     
#> 1 Sample   N                    254       
#> 2 TRTA     Placebo              86 (33.9%)
#> 3 TRTA     Xanomeline High Dose 84 (33.1%)
#> 4 TRTA     Xanomeline Low Dose  84 (33.1%)

## display overall summaries for demog
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(AGE, AGEGR1, SEX, RACE) %>%
  visR::get_tableone()
#> # A tibble: 13 × 3
#>    variable statistic                        Total        
#>    <chr>    <chr>                            <chr>        
#>  1 Sample   N                                254          
#>  2 AGE      Mean (SD)                        75.1 (8.25)  
#>  3 AGE      Median (IQR)                     77 (70-81)   
#>  4 AGE      Min-max                          51-89        
#>  5 AGE      Missing                          0 (0%)       
#>  6 AGEGR1   <65                              33 (13.0%)   
#>  7 AGEGR1   >80                              77 (30.3%)   
#>  8 AGEGR1   65-80                            144 (56.7%)  
#>  9 SEX      F                                143 (56.3%)  
#> 10 SEX      M                                111 (43.7%)  
#> 11 RACE     AMERICAN INDIAN OR ALASKA NATIVE 1 (0.394%)   
#> 12 RACE     BLACK OR AFRICAN AMERICAN        23 (9.055%)  
#> 13 RACE     WHITE                            230 (90.551%)

## By actual treatment
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(AGE, AGEGR1, SEX, RACE, TRTA) %>%
  visR::get_tableone(strata = "TRTA")
#> # A tibble: 13 × 6
#>    variable statistic Total Placebo `Xanomeline High Dose` `Xanomeline Low Dose`
#>    <chr>    <chr>     <chr> <chr>   <chr>                  <chr>                
#>  1 Sample   N         254   86      84                     84                   
#>  2 AGE      Mean (SD) 75.1… 75.2 (… 74.4 (7.89)            75.7 (8.29)          
#>  3 AGE      Median (… 77 (… 76 (69… 76 (70.8-80)           77.5 (71-82)         
#>  4 AGE      Min-max   51-89 52-89   56-88                  51-88                
#>  5 AGE      Missing   0 (0… 0 (0%)  0 (0%)                 0 (0%)               
#>  6 AGEGR1   <65       33 (… 14 (16… 11 (13.1%)             8 (9.52%)            
#>  7 AGEGR1   >80       77 (… 30 (34… 18 (21.4%)             29 (34.52%)          
#>  8 AGEGR1   65-80     144 … 42 (48… 55 (65.5%)             47 (55.95%)          
#>  9 SEX      F         143 … 53 (61… 40 (47.6%)             50 (59.5%)           
#> 10 SEX      M         111 … 33 (38… 44 (52.4%)             34 (40.5%)           
#> 11 RACE     AMERICAN… 1 (0… NA      1 (1.19%)              NA                   
#> 12 RACE     BLACK OR… 23 (… 8 (9.3… 9 (10.71%)             6 (7.14%)            
#> 13 RACE     WHITE     230 … 78 (90… 74 (88.10%)            78 (92.86%)          

## By actual treatment, without overall
adtte %>%
  dplyr::filter(SAFFL == "Y") %>%
  dplyr::select(AGE, AGEGR1, SEX, EVNTDESC, TRTA) %>%
  visR::get_tableone(strata = "TRTA", overall = FALSE)
#> # A tibble: 12 × 5
#>    variable statistic       Placebo `Xanomeline High Dose` `Xanomeline Low Dose`
#>    <chr>    <chr>           <chr>   <chr>                  <chr>                
#>  1 Sample   N               86      84                     84                   
#>  2 AGE      Mean (SD)       75.2 (… 74.4 (7.89)            75.7 (8.29)          
#>  3 AGE      Median (IQR)    76 (69… 76 (70.8-80)           77.5 (71-82)         
#>  4 AGE      Min-max         52-89   56-88                  51-88                
#>  5 AGE      Missing         0 (0%)  0 (0%)                 0 (0%)               
#>  6 AGEGR1   <65             14 (16… 11 (13.1%)             8 (9.52%)            
#>  7 AGEGR1   >80             30 (34… 18 (21.4%)             29 (34.52%)          
#>  8 AGEGR1   65-80           42 (48… 55 (65.5%)             47 (55.95%)          
#>  9 SEX      F               53 (61… 40 (47.6%)             50 (59.5%)           
#> 10 SEX      M               33 (38… 44 (52.4%)             34 (40.5%)           
#> 11 EVNTDESC Dematologic Ev… 29 (33… 61 (72.6%)             62 (73.8%)           
#> 12 EVNTDESC Study Completi… 57 (66… 23 (27.4%)             22 (26.2%)