Computational Models of Human Document Keyword Selection

Michael Stipicevic, Rensselaer Polytechnic Institute
Vladislav Veksler, Rensselaer Polytechnic Institute
Wayne Gray, Rensselaer Polytechnic Institute

Abstract

Computational models are presented that attempt to mimic how humans select keywords to describe documents. These semantic models are based on data mining techniques applied to large corpora of human writing. A methodology for testing the merit of these models is developed, based on how well each model matches author-chosen keywords. Results indicate that topic models and their derivatives outperform traditional semantic models. Finally, it is shown how these models might be incorporated into a system that automatically selects keywords for an academic publication.
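
The evaluation idea described in the abstract, scoring a model by how well it matches author-chosen keywords, can be illustrated with a minimal sketch. The abstract does not state which metric the authors used, so precision and recall over a model's top-k candidates are assumed here purely for illustration. The function name `keyword_match_scores`, the cutoff `k`, and the sample keyword lists are all hypothetical and are not taken from the paper.

```python
def keyword_match_scores(ranked_candidates, author_keywords, k=5):
    """Return (precision, recall) of a model's top-k keywords
    against the keywords the author chose.

    Assumption: exact match after lowercasing and whitespace
    normalization. The paper's actual matching rule is not given
    in the abstract.
    """
    def normalize(s):
        # Lowercase and collapse runs of whitespace.
        return " ".join(s.lower().split())

    # Drop duplicate candidates while preserving rank order, then cut at k.
    unique_ranked = list(dict.fromkeys(normalize(c) for c in ranked_candidates))
    top_k = unique_ranked[:k]
    gold = {normalize(a) for a in author_keywords}

    hits = sum(1 for c in top_k if c in gold)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical model output (best candidate first) and author keywords.
    ranked = ["topic models", "semantics", "keywords", "corpora", "cognition"]
    author = ["Topic Models", "keyword selection", "corpora"]
    precision, recall = keyword_match_scores(ranked, author, k=5)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Averaging these scores over many documents would give one number per model, which is one plausible way to compare topic models against other semantic models under this assumed metric.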

