
Global Entity Ranking Across Multiple Languages

Abstract · Mar 17, 2017 17:16

cs-ir cs-cl cs-si

Arxiv Abstract

  • Prantik Bhattacharyya
  • Nemanja Spasojevic

We present work on building a global, long-tailed ranking of entities across multiple languages using the Wikipedia and Freebase knowledge bases. We identify multiple features and build a model to rank entities using a ground-truth dataset of more than 10,000 labels. The final system ranks 27 million entities with 75% precision and a 48% F1 score. We evaluate its performance, provide empirical evidence of ranking quality across languages, and release the final ranked lists for future research.
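The abstract reports precision and F1 but not recall. As a rough reading aid, the sketch below back-solves recall from those two figures. It assumes the standard harmonic-mean definition F1 = 2PR / (P + R) and that both numbers come from the same evaluation, neither of which the abstract confirms. The function name is hypothetical, and because the inputs are rounded percentages the result is only approximate.

```python
def recall_from_precision_and_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given P and F1 (assumed standard F1)."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall: F1 must be below 2 * precision")
    return f1 * precision / denominator


# Figures quoted in the abstract: 75% precision, 48% F1.
implied_recall = recall_from_precision_and_f1(0.75, 0.48)
print(f"implied recall: {implied_recall:.3f}")  # prints "implied recall: 0.353"
```

Under these assumptions the implied recall is roughly 35%, which would suggest a system tuned toward precision over coverage.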
