bootstrapping.rmd
`SSCT_data` is loaded with the package. Here is what 10 random rows of it look like:
SiteID | POINT_Y | POINT_X | ward |
---|---|---|---|
NCC_PH_5_13789 | 38.612 | -122.777 | 3 |
SAC_LSR_42 | 41.061 | -122.361 | 7 |
SCC_CH_2_37655 | 36.073 | -121.600 | 10 |
SAC_RGW_43 | 39.421 | -121.424 | 1 |
SFE_2018_1324 | 39.778 | -123.754 | 1 |
NC_UH_5_2734 | 40.903 | -124.028 | 10 |
SAC_LSR_517WE0515 | 39.414 | -120.980 | 1 |
SAC_LSR_526CE0483 | 41.361 | -120.299 | 3 |
SAC_RGW_9 | 38.938 | -122.341 | 1 |
SFE_2018_277 | 39.731 | -123.648 | 5 |
We extract geology data for each site from `CA_geology`, using the coordinates stored in `SSCT_data`.
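# Build spatial points from the site coordinates (longitude, latitude; WGS84)
SSCT_labelled_points <- sp::SpatialPoints(
  cbind(SSCT_data$POINT_X, SSCT_data$POINT_Y),
  proj4string = sp::CRS('+proj=longlat +datum=WGS84')
)
# Reproject the points to the coordinate reference system of CA_geology
SSCT_labelled_points <- sp::spTransform(SSCT_labelled_points, crs(CA_geology))
# Overlay the points on CA_geology and keep the GENERAL attribute as `geology`
SSCT_data <- SSCT_data %>% dplyr::mutate(
  geology = (SSCT_labelled_points %over% CA_geology) %>% dplyr::pull(GENERAL)
)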
head(SSCT_data)
#> SiteID POINT_Y POINT_X ward geology
#> 1 K_CH_1_00458 41.93455 -123.7270 8 limestone
#> 2 K_CH_1_00493 41.92985 -123.7359 5 limestone
#> 3 K_CH_1_11296 41.71748 -123.8991 8 limestone
#> 4 K_CH_1_1546 41.69550 -124.0925 5 sedimentary
#> 5 K_CH_1_2329 41.84913 -123.0666 9 metasediment
#> 6 K_CH_1_25119 41.20531 -123.2234 10 sandstone
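The overlay step can be illustrated on toy data. The sketch below is only an assumption-laden example: the square polygon, the two points, and the names `toy_geology` and `toy_points` are hypothetical and are not part of the package. It shows that `sp::over()` returns the polygon attributes for a point inside a polygon and `NA` for a point that falls outside every polygon, which is presumably why `na.omit()` is applied before bootstrapping.

```r
library(sp)

# Hypothetical unit-square polygon carrying a GENERAL attribute.
# No CRS is set here for simplicity; with real data both layers must
# share the same CRS, which is what spTransform() ensures.
square <- Polygon(cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)))
toy_polys <- SpatialPolygons(list(Polygons(list(square), ID = "a")))
toy_geology <- SpatialPolygonsDataFrame(
  toy_polys,
  data.frame(GENERAL = "sandstone", row.names = "a")
)

# Two hypothetical points: one inside the square, one outside it
toy_points <- SpatialPoints(cbind(c(0.5, 2), c(0.5, 2)))

# One row of polygon attributes per point; NA where no polygon is hit
toy_overlay <- over(toy_points, toy_geology)
toy_overlay$GENERAL
```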
We perform the bootstrapping using `bootstrap_freq()`. We use `set.seed()` to ensure reproducibility of the random process.
set.seed(1789)
l_boot <- bootstrap_freq(SSCT_data %>% dplyr::select(c("ward", "geology")) %>% na.omit())
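As a small base R illustration of what the seed does (independent of the package), resetting the seed before a random draw reproduces that draw exactly. The variable names here are illustrative only.

```r
# Same seed, same random draw
set.seed(1789)
draw_a <- sample(100, 5)

set.seed(1789)
draw_b <- sample(100, 5)

# TRUE: both draws are the same because the seed was reset in between
same_draw <- identical(draw_a, draw_b)
same_draw
```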
plotly::ggplotly(
  plot_bootstrap_pvalue(l_boot),
  tooltip = c("channel_type", "geology", "p.value")
)
plotly::ggplotly(
  plot_bootstrap_probability(l_boot),
  tooltip = c("channel_type", "number_groupings", "probability")
)
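For readers unfamiliar with bootstrapping frequencies, the following is a generic sketch of the idea in base R: rows are resampled with replacement and the two-way frequency table is recomputed for each resample. It is not necessarily what `bootstrap_freq()` does internally; the toy data, `n_boot`, and all variable names are assumptions made for illustration.

```r
set.seed(1789)

# Made-up data standing in for the `ward` and `geology` columns
n_obs <- 60
toy <- data.frame(
  ward = factor(sample(1:3, n_obs, replace = TRUE), levels = 1:3),
  geology = factor(
    sample(c("limestone", "sandstone"), n_obs, replace = TRUE),
    levels = c("limestone", "sandstone")
  )
)

n_boot <- 200  # number of bootstrap resamples (arbitrary choice)

# Resample rows with replacement and tabulate ward by geology each time.
# Using factors with fixed levels keeps every table at 3 x 2.
boot_tables <- lapply(seq_len(n_boot), function(i) {
  idx <- sample(n_obs, replace = TRUE)
  table(toy$ward[idx], toy$geology[idx])
})

# Stack the tables into a 3 x 2 x n_boot array
boot_array <- array(unlist(boot_tables), dim = c(3, 2, n_boot))

# Bootstrap mean and standard deviation of each cell count
mean_counts <- apply(boot_array, c(1, 2), mean)
sd_counts <- apply(boot_array, c(1, 2), sd)
mean_counts
```

The spread of each cell count across resamples gives a sense of how stable the observed ward and geology frequencies are under resampling.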