Purpose

This vignette presents the code used to extract the vocabulary from the corpus.

Functions

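# Count the space-separated tokens in each element of x.
# lengths() is needed: length() on strsplit()'s list counts strings, not tokens.
token_counts = function(x) {
    return (lengths(strsplit(x, " ")))
}
# Split each string on single spaces; returns a list of token vectors.
# Note: this definition masks base::split() once it is loaded.
split = function(x) {
    return (strsplit(x, " "))
}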
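
To illustrate how space-based splitting can lead to a vocabulary, here is a minimal sketch using base R only. The toy `corpus` vector and the variable names (`tokens`, `n_tokens`, `vocabulary`, `term_freq`) are assumptions made for illustration; the vignette does not show the actual corpus or the downstream steps, so this should be read as one possible approach rather than the author's pipeline.

```r
# Hypothetical toy corpus; the real corpus is not shown in this vignette.
corpus <- c("the cat sat", "the dog sat down")

# Split each document on single spaces. strsplit() returns a list with
# one character vector of tokens per document.
tokens <- strsplit(corpus, " ", fixed = TRUE)

# Tokens per document: lengths() counts the elements of each list entry,
# whereas length(tokens) would only give the number of documents.
n_tokens <- lengths(tokens)

# Vocabulary: the distinct tokens across all documents, sorted.
vocabulary <- sort(unique(unlist(tokens)))

# Frequency of each vocabulary term across the whole corpus.
term_freq <- table(unlist(tokens))
```

With this toy input, `n_tokens` holds one count per document and `vocabulary` holds the five distinct words. Splitting on a single literal space is a simplification: repeated spaces produce empty tokens, and punctuation stays attached to words, so a real corpus would likely need extra normalization first.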