Create training data for multiclass classification

make_trainingDataMulticlass(
  trainingData,
  validationHumanReadingDTM,
  humanReadingTrainingLabels,
  webscrapped_validationDTM,
  webscrapped_trainingLabels,
  filter = FALSE,
  addWebscrapped = FALSE,
  obs_threshold = 10,
  filterIrrelevant = TRUE,
  addTopicDocs = FALSE,
  validationTopicDocs = NULL
)

Arguments

trainingData

the already aggregated validation data

validationHumanReadingDTM

document-term matrix from human reading

humanReadingTrainingLabels

labels from human reading

webscrapped_validationDTM

document-term matrix from web scraping

webscrapped_trainingLabels

labels from web scraping

filter

logical; if TRUE, create training data for a binary classification (Irrelevant/Relevant)

Value

a data.frame containing the training data with a target column "countryLabelFilter" or "countryLabel"
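
Examples

The following base R sketch only illustrates the shape of the returned value: one row per document, one column per term, and a target column. It does not call make_trainingDataMulticlass() and is not the package's implementation. The toy matrix, the term names, and the country codes are hypothetical, and treating every label other than "Irrelevant" as "Relevant" is an assumption about how the binary target is formed.

```r
# Hypothetical toy document-term matrix: 3 documents, 3 terms.
dtm <- matrix(
  c(2, 0, 1,
    0, 3, 0,
    1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("border", "trade", "visa"))
)

# Hypothetical labels, one per document.
labels <- c("FRA", "DEU", "Irrelevant")

# Multiclass shape: term counts plus a "countryLabel" target column.
multiclass <- data.frame(dtm, countryLabel = factor(labels))

# Binary shape (assumed): collapse every label other than "Irrelevant"
# into "Relevant" and store it in a "countryLabelFilter" target column.
binary <- data.frame(
  dtm,
  countryLabelFilter = factor(
    ifelse(labels == "Irrelevant", "Irrelevant", "Relevant")
  )
)

str(multiclass)
str(binary)
```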