Neural Paraphrase Identification of Questions with Noisy Pretraining

Abstract · Apr 15, 2017 02:09

embeddings character quora question dataset identification questions paraphrase forums cs-cl

arXiv Abstract

  • Gaurav Singh Tomar
  • Thyago Duque
  • Oscar Täckström
  • Jakob Uszkoreit
  • Dipanjan Das

We present a solution to the problem of paraphrase identification of questions. We focus on a recent dataset of question pairs annotated with binary paraphrase labels and show that a variant of the decomposable attention model (Parikh et al., 2016) results in accurate performance on this task, while being far simpler than many competing neural architectures. Furthermore, when the model is pretrained on a noisy dataset of automatically collected question paraphrases, it obtains the best reported performance on the dataset.
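For readers unfamiliar with the decomposable attention model the abstract refers to, the following is a minimal sketch of its attend, compare, and aggregate steps as described by Parikh et al. (2016). It is not the variant used in this paper, whose details the abstract does not give. The function names, the single ReLU layers standing in for the feed-forward networks, the random untrained weights, and the dimensions are all assumptions made for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(d, h, rng):
    # Hypothetical parameter shapes; one linear map per step (an assumption).
    W_f = rng.normal(scale=0.1, size=(d, h))        # attend
    W_g = rng.normal(scale=0.1, size=(2 * d, h))    # compare
    W_h = rng.normal(scale=0.1, size=(2 * h, h))    # aggregate
    W_out = rng.normal(scale=0.1, size=(h, 2))      # binary paraphrase label
    return W_f, W_g, W_h, W_out

def decomposable_attention(a, b, params):
    """a: (len_a, d) and b: (len_b, d) token embeddings of two questions.
    Returns a length-2 probability vector (not paraphrase, paraphrase)."""
    W_f, W_g, W_h, W_out = params

    # Attend: unnormalized alignment scores between every token pair.
    e = relu(a @ W_f) @ relu(b @ W_f).T            # (len_a, len_b)
    beta = softmax(e, axis=1) @ b                  # part of b aligned to each a token
    alpha = softmax(e, axis=0).T @ a               # part of a aligned to each b token

    # Compare: each token together with its soft-aligned counterpart.
    v_a = relu(np.concatenate([a, beta], axis=1) @ W_g)   # (len_a, h)
    v_b = relu(np.concatenate([b, alpha], axis=1) @ W_g)  # (len_b, h)

    # Aggregate: sum over tokens, then classify.
    v = np.concatenate([v_a.sum(axis=0), v_b.sum(axis=0)])  # (2h,)
    logits = relu(v @ W_h) @ W_out
    return softmax(logits, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(d=8, h=16, rng=rng)
    q1 = rng.normal(size=(5, 8))   # stand-in embeddings for a 5-token question
    q2 = rng.normal(size=(7, 8))   # stand-in embeddings for a 7-token question
    print(decomposable_attention(q1, q2, params))
```

Because the weights here are random, the output probabilities are meaningless; the sketch only shows how the three steps fit together without any recurrent or convolutional layers, which is the sense in which the model is simpler than many competing architectures.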
