Create training data for multilabel classification

make_trainingData(
  validationHumanReadingDTM,
  humanReadingTrainingLabels,
  webscrapped_validationDTM,
  webscrapped_trainingLabels,
  scale_type = "location",
  aggregate_labels = FALSE
)

Arguments

validationHumanReadingDTM

document-term matrix from human reading

humanReadingTrainingLabels

labels from human reading

webscrapped_validationDTM

document-term matrix from web scraping

webscrapped_trainingLabels

labels from web scraping

scale_type

One of "location", "spatial", or "temporal"; defaults to "location"

aggregate_labels

logical; when scale_type is "temporal", whether to aggregate the labels into three larger classes. Defaults to FALSE

Value

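a data.frame with nrow == nrow(validationHumanReadingDTM) + nrow(webscrapped_validationDTM) and ncol == ncol(validationHumanReadingDTM) + ncol(humanReadingTrainingLabels), that is, one row per document from both sources, and one column per term plus one column per label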
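
Examples

The following is a minimal base R sketch of the shape described under Value. It does not call make_trainingData() itself. It only assumes that labels are bound to each document-term matrix column-wise and that the two sources are then stacked row-wise, which may differ from what the function does internally. All object names, terms, and values (human_dtm, web_dtm, label_a, label_b) are hypothetical toy data.

```r
# Toy document-term matrices: rows are documents, columns are terms.
# All values are made up for illustration.
terms <- c("river", "basin", "flood")
human_dtm <- matrix(c(1, 0, 2,
                      0, 3, 1),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("h1", "h2"), terms))
web_dtm <- matrix(c(2, 1, 0,
                    0, 0, 4,
                    1, 1, 1),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("w1", "w2", "w3"), terms))

# Toy multilabel indicators: one logical column per (hypothetical) label.
human_labels <- data.frame(label_a = c(TRUE, FALSE),
                           label_b = c(TRUE, TRUE))
web_labels <- data.frame(label_a = c(FALSE, TRUE, TRUE),
                         label_b = c(FALSE, FALSE, TRUE))

# Assumption: bind labels to each DTM column-wise, then stack both sources.
human_part <- cbind(as.data.frame(human_dtm), human_labels)
web_part <- cbind(as.data.frame(web_dtm), web_labels)
training <- rbind(human_part, web_part)

# Check the dimensions against the formula given under Value.
stopifnot(
  nrow(training) == nrow(human_dtm) + nrow(web_dtm),
  ncol(training) == ncol(human_dtm) + ncol(human_labels)
)
dim(training)
```

With these toy inputs, both sources share the same terms and label columns, which is what allows the row-wise stacking to succeed. Real inputs would need matching column sets as well.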