regional_benchmark() is a wrapper function that calls a number of other functions. See Details.
regional_benchmark(
  regions = c("ALLSAC", "SFE", "K", "NC", "NCC", "SCC", "SC", "SJT"),
  LRN_IDS,
  TUNELENGTH,
  INNER,
  ITERS,
  PROB,
  NU,
  REPS,
  PREPROC,
  FINAL,
  PATH,
  REDUCED,
  MES,
  INFO,
  FS,
  FS_NUM
)
| Argument | Description |
|---|---|
| regions | |
| LRN_IDS | |
| TUNELENGTH | |
| INNER | |
| ITERS | |
| PROB | |
| NU | |
| REPS | |
| PREPROC | |
| FINAL | |
| PATH | |
| REDUCED | |
| MES | |
| INFO | |
| FS | |
| FS_NUM | |
a list of mlr benchmark results
Here is some pseudo-code that explains what happens behind the scenes.
Skip. Because regional_benchmark() is called inside the for-loop for (FS_NUM in FS_NUM_LIST) (see above), if FINAL is TRUE, regional_benchmark() skips any region it does not need for calculating the final models.
Data loading. This is handled by get_training_data().
Data formatting. This is handled by fmt_labels(), sanitize_data() and get_coords().
Feature selection. If FINAL is TRUE, the selected features are retrieved from get_bestBMR_tuning_results(). If FINAL is FALSE, the selected features are derived from the transformed training data using get_ppc() and preproc_data(). The transformed data are filtered to remove features with a correlation higher than 0.95 using caret::findCorrelation(). Then, 500 subsampled mlr Tasks are created with mlr::makeResampleDesc(), mlr::makeClassifTask(), mlr::makeResampleInstance() and mlr::filterFeatures(). The FS_NUM features most commonly selected across the 500 realizations are retained.
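The FINAL = FALSE branch of the feature selection step can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the toy data (iris plus an artificial near-duplicate column), the one-way ANOVA F statistic used as the filter score, and the names top_features, n_realizations and picks are all assumptions. The score stands in for whatever filter mlr::filterFeatures() is configured with, and the subsampling is done in base R rather than with mlr resample instances. Only caret::findCorrelation() and the 0.95 cutoff come from the description above.

```r
library(caret)  # provides findCorrelation()

set.seed(1)
# Toy stand-in for the transformed training data (assumption).
X <- iris[, 1:4]
y <- iris$Species
X$Petal.Copy <- X$Petal.Length + rnorm(nrow(X), sd = 0.01)  # near-duplicate

# 1. Drop features whose pairwise correlation exceeds 0.95.
too_correlated <- findCorrelation(cor(X), cutoff = 0.95, names = TRUE)
X <- X[, setdiff(names(X), too_correlated), drop = FALSE]

# 2. Rank features on many subsamples and record the top FS_NUM each time.
FS_NUM <- 2
n_realizations <- 50  # the description above uses 500; reduced to run quickly
top_features <- function(rows) {
  # Hypothetical filter score: one-way ANOVA F statistic per feature.
  score <- vapply(
    X[rows, , drop = FALSE],
    function(col) unname(oneway.test(col ~ y[rows])$statistic),
    numeric(1)
  )
  names(sort(score, decreasing = TRUE))[seq_len(FS_NUM)]
}
picks <- replicate(
  n_realizations,
  top_features(sample(nrow(X), size = 0.8 * nrow(X))),
  simplify = FALSE
)

# 3. Keep the FS_NUM features selected most often across realizations.
votes <- sort(table(unlist(picks)), decreasing = TRUE)
selected <- names(votes)[seq_len(FS_NUM)]
```

Voting across many subsamples makes the selection less sensitive to any single split of the data, which is presumably why the realizations are aggregated rather than filtering once on the full training set.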
Pre-processing. The target and training data are transformed using get_ppc() on the target data and preproc_data() on the training data. SMOTE is applied using get_smote_data() and get_smote_coords(), which both call resolve_class_imbalance().
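As an illustration of the pre-processing step, the sketch below estimates a transformation on the target data, applies it to the training data, and then oversamples the minority class. Several things here are assumptions: that the transformation is a centering and scaling done with caret::preProcess(), the toy data, and the hand-written smote_sketch() helper, which is a simplified SMOTE-style interpolation and not the package's resolve_class_imbalance().

```r
library(caret)  # provides preProcess()

set.seed(42)
# Toy stand-ins (assumption): "target" is the data to predict on,
# "training" is the labelled data.
target   <- data.frame(a = rnorm(200, 10, 2), b = rnorm(200, -3, 5))
training <- data.frame(a = rnorm(60, 10, 2),  b = rnorm(60, -3, 5))
labels   <- factor(c(rep("common", 50), rep("rare", 10)))

# Estimate the transformation on the target data, apply it to the training data.
ppc <- preProcess(target, method = c("center", "scale"))
training_t <- predict(ppc, training)

# Simplified SMOTE-style oversampling (hypothetical helper): each synthetic
# point lies on the segment between a minority sample and one of its
# k nearest minority neighbours.
smote_sketch <- function(X, n_new, k = 5) {
  X <- as.matrix(X)
  d <- as.matrix(dist(X))
  diag(d) <- Inf  # a point is not its own neighbour
  t(replicate(n_new, {
    i  <- sample(nrow(X), 1)
    nn <- order(d[i, ])[seq_len(min(k, nrow(X) - 1))]
    j  <- nn[sample(length(nn), 1)]
    X[i, ] + runif(1) * (X[j, ] - X[i, ])
  }))
}

minority  <- training_t[labels == "rare", ]
synthetic <- smote_sketch(minority, n_new = 40)
balanced  <- rbind(training_t, as.data.frame(synthetic))
balanced_labels <- factor(c(as.character(labels), rep("rare", 40)))
```

Estimating the transformation on one dataset and applying it to the other keeps both on the same scale, so a model fitted on the training data sees comparably scaled inputs at prediction time.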
Tasks. Tasks are obtained using mlr::makeClassifTask().
Learners. Learners are constructed using get_learners() or get_final_learners().
Compute benchmark. The benchmark is run with compute_final_model() or compute_benchmark() (which needs to retrieve the outer folds of the nested resampling with get_outers()).
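The task, learner and benchmark steps above correspond to a nested resampling set-up in mlr. The sketch below shows the general shape of such a benchmark using plain mlr calls. It is an assumption-laden illustration: the iris task, the classif.rpart learner, the tuned minsplit range, the fold and repetition counts, and the random-search budget are all chosen for the example, and it does not show how get_learners(), compute_benchmark() or get_outers() are actually implemented.

```r
library(mlr)

set.seed(123)
# Task: a classification task built from a data frame and a target column.
task <- makeClassifTask(id = "toy_region", data = iris, target = "Species")

# Inner loop tunes hyperparameters; outer loop estimates performance.
inner <- makeResampleDesc("CV", iters = 3)
outer <- makeResampleDesc("RepCV", folds = 5, reps = 2)

# Learner: a base learner wrapped so that tuning happens inside each outer fold.
par_set <- makeParamSet(makeIntegerParam("minsplit", lower = 5, upper = 50))
ctrl    <- makeTuneControlRandom(maxit = 5)
learner <- makeTuneWrapper(
  makeLearner("classif.rpart"),
  resampling = inner,
  measures   = list(acc),
  par.set    = par_set,
  control    = ctrl,
  show.info  = FALSE
)

# Benchmark: evaluate the tuned learner on the outer folds.
bmr <- benchmark(
  learners    = list(learner),
  tasks       = list(task),
  resamplings = outer,
  measures    = list(acc),
  show.info   = FALSE
)
perf <- getBMRAggrPerformances(bmr, as.df = TRUE)
```

Because tuning only ever sees the inner folds, the outer-fold accuracy in perf is not biased by the hyperparameter search, which is the usual motivation for nested resampling.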