R/schema_conf.R
restricted_simulation.Rd
These functions let you define a set of methods for simulating data that rely on additional column-level parameters such as range or values.
opt_simul_restricted_character(
  f_key = simul_restricted_character_fkey,
  ...,
  in_set = simul_restricted_character_in_set
)
opt_simul_restricted_numeric(
  f_key = simul_restricted_numeric_fkey,
  ...,
  in_set = simul_restricted_numeric_in_set,
  range = simul_restricted_numeric_range
)
opt_simul_restricted_integer(
  f_key = simul_restricted_integer_fkey,
  ...,
  in_set = simul_restricted_integer_in_set,
  range = simul_restricted_integer_range
)
opt_simul_restricted_logical(f_key = simul_restricted_integer_fkey, ...)
opt_simul_restricted_date(
  f_key = simul_restricted_integer_fkey,
  ...,
  range = simul_restricted_date_range
)
f_key
Method for simulating foreign key columns. The values parameter of the function receives all the unique values from the parent primary key column.
...
Other methods that can be defined to handle extra parameters.
in_set
Method for simulating columns from a defined set of values. The values parameter of the function receives all the values listed under values in the YAML column definition.
range
Method for simulating columns whose values fit inside a defined range. It takes the special parameter range, a 2-length vector holding the minimum and maximum value for the simulated data.
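A range-style method could be sketched as follows. This is a minimal base R illustration, not the package's implementation: the function name my_numeric_range and its argument list (n, range, ...) are assumptions made for the example.

```r
# Hypothetical range-based method; the name and signature are assumptions.
my_numeric_range <- function(n, range, ...) {
  # No range defined for the column: return NULL so the next method is tried.
  if (missing(range) || is.null(range)) {
    return(NULL)
  }
  # The range must be a 2-length vector: c(minimum, maximum).
  stopifnot(length(range) == 2, range[1] <= range[2])
  # Draw n values uniformly between the minimum and the maximum.
  stats::runif(n, min = range[1], max = range[2])
}

set.seed(1)
in_range <- my_numeric_range(5, range = c(10, 20))  # five values within [10, 20]
no_range <- my_numeric_range(5)                     # NULL, range was not supplied
```

Uniform sampling is only one possible choice here; any distribution truncated to the given bounds would satisfy the same contract.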
In addition to the standard column parameters, which currently are:
type
unique
not_null
default
nchar
min_date
max_date
precision
scale
you can also add custom ones (either directly in the YAML configuration file or in the opt_default_<column_type> functions).
To make the simulation respect such parameters, you can define custom simulation functions.
These functions should be passed as arguments to the opt_simul_restricted_<column_type> functions,
and each of them should accept its special parameter as one of its own arguments.
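As an illustration, a custom method for a made-up column parameter might look like the sketch below. The parameter name prefix, the function name simul_with_prefix, and the argument list (n, prefix, ...) are all assumptions for the example; the method returns NULL when its parameter is absent, following the convention this page describes for restricted methods.

```r
# Hypothetical method for a custom column parameter named `prefix`.
simul_with_prefix <- function(n, prefix, ...) {
  # Parameter condition not met: return NULL so the workflow can move on.
  if (missing(prefix) || is.null(prefix)) {
    return(NULL)
  }
  # Build n values by appending a running index to the prefix.
  paste0(prefix, seq_len(n))
}

with_prefix <- simul_with_prefix(3, prefix = "ID_")
without_prefix <- simul_with_prefix(3)

# Assumed usage, not run here: pass the method through `...` under the
# name of the parameter it handles.
# opt_simul_restricted_character(prefix = simul_with_prefix)
```

Naming the argument after the column parameter keeps the link between the configuration and the method that handles it explicit.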
When the parameter condition is not met (for example, the parameter is missing), the function should return NULL. This allows the simulation workflow to move on to the next defined method. Methods are executed in the order in which the corresponding parameters are defined in the opt_simul_restricted_<column_type> functions.
This means that f_key always has the highest priority. It is a special method used for foreign key
columns, and it simulates only from values received from the parent primary key.
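The ordered fallback can be pictured with the sketch below. It is not the package's internal workflow: simulate_with_fallback, f_key_method, in_set_method, and the parent_keys argument are hypothetical names chosen so the two value sources are easy to tell apart.

```r
# Hypothetical methods: each handles one special parameter and returns NULL
# when that parameter is absent. Names and signatures are assumptions.
f_key_method <- function(n, parent_keys = NULL, ...) {
  if (is.null(parent_keys)) return(NULL)
  sample(parent_keys, n, replace = TRUE)
}
in_set_method <- function(n, values = NULL, ...) {
  if (is.null(values)) return(NULL)
  sample(values, n, replace = TRUE)
}

# Try the methods in their defined order and keep the first non-NULL result.
simulate_with_fallback <- function(methods, n, params) {
  for (method in methods) {
    result <- do.call(method, c(list(n = n), params))
    if (!is.null(result)) return(result)
  }
  NULL  # no restricted method applied
}

# The list order encodes the priority: f_key first, then in_set.
methods <- list(f_key = f_key_method, in_set = in_set_method)

# Both parameters present: f_key comes first, so its values are used.
both <- simulate_with_fallback(
  methods, 4, list(parent_keys = c(1L, 2L), values = c(8L, 9L))
)
# Only `values` present: f_key returns NULL and in_set takes over.
only_set <- simulate_with_fallback(methods, 4, list(values = c(8L, 9L)))
# Neither present: every method returns NULL.
none <- simulate_with_fallback(methods, 4, list())
```

Returning NULL rather than raising an error is what makes this chain work: a method that does not apply simply hands control to the next one.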
The second-priority method for character columns is in_set, which looks for the values column parameter and, when it exists, simulates the data from the defined set of values.
See the simul_restricted_character_in_set definition for details.
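For orientation, an in_set-style method for character columns could look roughly like this. The sketch is written independently of simul_restricted_character_in_set and may differ from it: the name my_character_in_set, the argument list (n, values, unique, ...), and the handling of unique are assumptions.

```r
# Hypothetical in_set-style method; the name and signature are assumptions.
my_character_in_set <- function(n, values, unique = FALSE, ...) {
  # No set of values defined for the column: let the next method handle it.
  if (missing(values) || is.null(values)) {
    return(NULL)
  }
  values <- as.character(values)
  if (unique && n > length(values)) {
    stop("Cannot draw more unique values than the set contains.")
  }
  # Sample positions rather than values, so a one-element set works as expected.
  values[sample.int(length(values), n, replace = !unique)]
}

set.seed(1)
colours <- my_character_in_set(6, values = c("red", "green", "blue"))
distinct <- my_character_in_set(3, values = c("red", "green", "blue"), unique = TRUE)
skipped <- my_character_in_set(6)  # NULL, no values were supplied
```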