Matching Results of Latent Dirichlet Allocation for Text

Abstract

Many approaches have been introduced to enable Latent Dirichlet Allocation (LDA) models to be updated in an online manner. These include incorporating new documents into the model through inference, passing parameter priors to the inference algorithm, or a mixture of both, which leads to more complicated and computationally expensive models. We present a method to match and compare the resulting topics of different LDA models using lightweight, easy-to-use similarity measures. We address the online problem by keeping the model inference simple and matching topics solely by their lists of high-probability words.
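
The abstract describes matching topics across models by comparing their high-probability word lists with a lightweight similarity measure. The paper's exact measure is not specified here, so the following is only a minimal sketch under assumed inputs: each topic is represented as a plain list of its top-N words, and Jaccard similarity over those lists is used as one possible lightweight choice. The function names (`jaccard_similarity`, `match_topics`) and the greedy matching strategy are illustrative, not the authors' method.

```python
# Illustrative sketch (not the paper's implementation): match topics of two
# LDA models by the overlap of their top-N word lists.

def jaccard_similarity(words_a, words_b):
    """Jaccard similarity between two top-word lists."""
    set_a, set_b = set(words_a), set(words_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def match_topics(topics_old, topics_new, top_n=20):
    """Greedily match each new topic to its most similar old topic.

    topics_old / topics_new: lists of top-word lists, one list per topic.
    Returns (new_index, old_index, similarity) tuples.
    """
    matches = []
    for i, new_words in enumerate(topics_new):
        scores = [
            (j, jaccard_similarity(new_words[:top_n], old_words[:top_n]))
            for j, old_words in enumerate(topics_old)
        ]
        best_j, best_score = max(scores, key=lambda s: s[1])
        matches.append((i, best_j, best_score))
    return matches
```

Because only word lists are compared, no access to the models' internal parameters is needed, which is what keeps the comparison cheap relative to re-estimating or merging the models themselves.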
