Paper title
Unsupervised Alignment of Distributional Word Embeddings
Paper authors
Paper abstract
Cross-domain alignment plays a key role in tasks ranging from machine translation to transfer learning. Recently, purely unsupervised methods operating on monolingual embeddings have been used successfully to infer a bilingual lexicon without relying on supervision. However, current state-of-the-art methods focus only on point vectors, although distributional embeddings have been shown to capture richer semantic information when representing words. In this paper, we propose a stochastic optimization approach for aligning probabilistic embeddings. We evaluate our method on the problem of unsupervised word translation by aligning word embeddings trained on monolingual data. We show that the proposed approach achieves good performance on the bilingual lexicon induction task across several language pairs and performs better than the point-vector-based approach.