Corpus preparation

We used the Hunspell American English dictionary (en_US). The number of topics was fixed to \(k = 105\) for the English corpus. We ran 1000 iterations with hyperparameters \(\alpha = 0.5\) and \(\beta = 0.1\).
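
To illustrate how these settings might enter a topic model, the following is a minimal sketch that assumes they describe latent Dirichlet allocation fitted by collapsed Gibbs sampling; the text above does not name the model or the sampler, so this is an assumption. Under it, \(\alpha\) would be the symmetric document-topic prior and \(\beta\) the symmetric topic-word prior. The function name `gibbs_lda` and its arguments are hypothetical, and the toy corpus is for illustration only.

```python
import random


def gibbs_lda(docs, vocab_size, k=105, iterations=1000,
              alpha=0.5, beta=0.1, seed=0):
    """Collapsed Gibbs sampler for LDA (assumed model, see lead-in).

    docs is a list of documents, each a list of integer word ids
    in range(vocab_size). Returns the document-topic and topic-word
    count matrices after the final sweep.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * k for _ in docs]                 # tokens of doc d in topic t
    topic_word = [[0] * vocab_size for _ in range(k)]   # tokens of word w in topic t
    topic_total = [0] * k                               # tokens in topic t
    assignments = []

    # Start from a uniformly random topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(k)
            z_d.append(t)
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                t = assignments[d][i]
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1

                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(k)
                ]

                # Draw a new topic and restore the counts.
                t = rng.choices(range(k), weights=weights)[0]
                assignments[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1

    return doc_topic, topic_word


if __name__ == "__main__":
    # Toy corpus of word ids; small k and few sweeps keep the demo fast.
    toy_docs = [[0, 1, 2, 0], [3, 4, 3], [0, 4, 1]]
    doc_topic, topic_word = gibbs_lda(toy_docs, vocab_size=5, k=3, iterations=20)
    print(doc_topic)
```

The defaults mirror the values reported above (\(k = 105\), 1000 iterations, \(\alpha = 0.5\), \(\beta = 0.1\)). In pure Python each sweep costs time proportional to the number of tokens times \(k\), so a corpus of realistic size would call for an optimised implementation.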