regional_benchmark.Rd
regional_benchmark() is a wrapper function calling a number of functions. See details.
regional_benchmark( regions = c("ALLSAC", "SFE", "K", "NC", "NCC", "SCC", "SC", "SJT"), LRN_IDS, TUNELENGTH, INNER, ITERS, PROB, NU, REPS, PREPROC, FINAL, PATH, REDUCED, MES, INFO, FS, FS_NUM )
regions | |
---|---|
LRN_IDS | |
TUNELENGTH | |
INNER | |
ITERS | |
PROB | |
NU | |
REPS | |
PREPROC | |
FINAL | |
PATH | |
REDUCED | |
MES | |
INFO | |
FS | |
FS_NUM | |
A list of mlr benchmark results.
Here is some pseudo-code that explains what happens behind the scenes.
Skip. Because regional_benchmark() is called inside the for-loop for (FS_NUM in FS_NUM_LIST) (see above), if FINAL is TRUE, regional_benchmark() skips any region for which it does not need to calculate the final models.
Data loading. This is handled by get_training_data().
Data formatting. This is handled by fmt_labels(), sanitize_data() and get_coords().
Feature selection. If FINAL is TRUE, the selected features are retrieved from get_bestBMR_tuning_results(). If FINAL is FALSE, the selected features are derived from the transformed training data using get_ppc() and preproc_data(). Features with a pairwise correlation higher than 0.95 are then removed from the transformed data using caret::findCorrelation(). Then, 500 subsampled mlr Tasks are created with mlr::makeResampleDesc(), mlr::makeClassifTask(), mlr::makeResampleInstance() and mlr::filterFeatures(). The FS_NUM most commonly selected features across the 500 realizations are retained.
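The selection logic described in this step can be illustrated with a minimal base R sketch. It is not the package's implementation: the data, the `drop_correlated()` helper (a simplified greedy stand-in for caret::findCorrelation()), the filter score (absolute difference of class means, standing in for whatever method mlr::filterFeatures() is configured with) and the 0.8 subsampling ratio are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical training data: x2 is a near-duplicate of x1.
n <- 200
x1 <- rnorm(n)
feats <- data.frame(
  x1 = x1,
  x2 = x1 + rnorm(n, sd = 0.01),
  x3 = rnorm(n),
  x4 = rnorm(n),
  x5 = rnorm(n),
  x6 = rnorm(n)
)
# Hypothetical binary target driven by x1 and x3.
y <- factor(ifelse(feats$x1 + feats$x3 + rnorm(n, sd = 0.5) > 0, "a", "b"))

# Step 1: drop one feature from each pair with |correlation| > cutoff.
# Simplified greedy stand-in for caret::findCorrelation().
drop_correlated <- function(df, cutoff = 0.95) {
  cm <- abs(cor(df))
  diag(cm) <- 0
  keep <- colnames(df)
  for (nm in colnames(df)) {
    if (nm %in% keep && any(cm[nm, setdiff(keep, nm)] > cutoff)) {
      keep <- setdiff(keep, nm)
    }
  }
  df[, keep, drop = FALSE]
}
filtered <- drop_correlated(feats)

# Step 2: on each subsample, rank features by a filter score and keep the
# top FS_NUM. The score used here is a hypothetical placeholder.
FS_NUM <- 2
n_iter <- 500
split_ratio <- 0.8  # assumed subsampling ratio
selected <- replicate(n_iter, {
  idx <- sample(n, size = floor(split_ratio * n))
  sub <- filtered[idx, , drop = FALSE]
  score <- vapply(sub, function(col) {
    abs(mean(col[y[idx] == "a"]) - mean(col[y[idx] == "b"]))
  }, numeric(1))
  names(sort(score, decreasing = TRUE))[seq_len(FS_NUM)]
})

# Step 3: keep the FS_NUM features selected most often across realizations.
counts <- sort(table(selected), decreasing = TRUE)
top_features <- names(counts)[seq_len(FS_NUM)]
print(top_features)
```

Counting selections across many subsamples favors features that are picked consistently, rather than features that score well on a single split.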
Pre-processing. The target and training data are transformed using get_ppc() on the target data and preproc_data() on the training data. SMOTE is applied using get_smote_data() and get_smote_coords(), which both call resolve_class_imbalance().
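For readers unfamiliar with SMOTE, the core idea can be sketched in base R as follows. This is only an illustration of the general technique, not the behaviour of resolve_class_imbalance(): the data, the `smote_sketch()` helper, the choice of k = 3 neighbours and the decision to balance the classes exactly are all assumptions.

```r
set.seed(42)

# Hypothetical imbalanced data: 40 majority rows, 8 minority rows.
X <- rbind(
  matrix(rnorm(80, mean = 0), ncol = 2),
  matrix(rnorm(16, mean = 3), ncol = 2)
)
y <- factor(c(rep("major", 40), rep("minor", 8)))

# Create n_new synthetic minority rows by interpolating between a randomly
# chosen minority point and one of its k nearest minority neighbours.
smote_sketch <- function(X_min, n_new, k = 3) {
  d <- as.matrix(dist(X_min))
  diag(d) <- Inf  # a point is not its own neighbour
  t(vapply(seq_len(n_new), function(i) {
    base <- sample(nrow(X_min), 1)
    nn <- order(d[base, ])[seq_len(k)]
    mate <- nn[sample(k, 1)]
    gap <- runif(1)  # position along the segment between the two points
    X_min[base, ] + gap * (X_min[mate, ] - X_min[base, ])
  }, numeric(ncol(X_min))))
}

X_min <- X[y == "minor", , drop = FALSE]
n_new <- sum(y == "major") - nrow(X_min)  # assumed: balance classes exactly
X_syn <- smote_sketch(X_min, n_new)

X_bal <- rbind(X, X_syn)
y_bal <- factor(c(as.character(y), rep("minor", n_new)))
print(table(y_bal))
```

Because each synthetic row is a convex combination of two existing minority rows, it always lies on the segment between them and never outside the range of the observed minority data.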
Tasks. Tasks are obtained using mlr::makeClassifTask().
Learners. Learners are constructed using get_learners() or get_final_learners().
Compute benchmark. The benchmark is run with compute_final_model() or compute_benchmark() (which needs to retrieve the outer folds of the nested resampling with get_outers()).
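The notion of outer folds in a nested resampling can be illustrated with a small base R sketch. It does not reproduce get_outers() or the mlr resampling objects; the sample size and the fold counts `outer_k` and `inner_k` are hypothetical values chosen for illustration.

```r
set.seed(7)

n <- 30        # hypothetical number of observations
outer_k <- 5   # assumed number of outer folds
inner_k <- 3   # assumed number of inner folds

# Assign every observation to exactly one outer fold.
outer_id <- sample(rep(seq_len(outer_k), length.out = n))

nested <- lapply(seq_len(outer_k), function(k) {
  test_idx <- which(outer_id == k)
  train_idx <- which(outer_id != k)
  # Inner folds are drawn from the outer training set only, so tuning
  # never sees the observations held out for the outer test.
  inner_id <- sample(rep(seq_len(inner_k), length.out = length(train_idx)))
  list(
    outer_train = train_idx,
    outer_test = test_idx,
    inner_folds = split(train_idx, inner_id)
  )
})

# Number of held-out observations in each outer fold.
print(vapply(nested, function(f) length(f$outer_test), integer(1)))
```

In a nested scheme, the inner folds are used for tuning and the outer folds for estimating performance, which is why the outer folds have to be available when the benchmark is computed.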