S³ seminar: FastText: A library for efficient learning of word representations and sentence classification

Seminar on 24 February 2017, 10:30 am, at CentraleSupelec (Gif-sur-Yvette), in the L2S council room (Salle du conseil du L2S), B4.40
Piotr Bojanowski (Facebook AI Research)

In this talk, I will describe FastText, an open-source library that can be used to train word representations or text classifiers. The library is based on our generalization of the well-known word2vec model, which makes it easy to adapt to various applications. I will go over the formulation of the skipgram and cbow models of word2vec and how they were extended to meet the needs of our model. I will then describe in detail the two applications of our model, namely document classification and building morphologically rich word representations. In both applications, our model achieves very competitive performance while remaining simple and fast.