Paper title
NorDiaChange: Diachronic Semantic Change Dataset for Norwegian
Paper authors
Paper abstract
We describe NorDiaChange: the first diachronic semantic change dataset for Norwegian. NorDiaChange comprises two novel subsets, covering about 80 Norwegian nouns manually annotated with graded semantic change over time. Both datasets follow the same annotation procedure and can be used interchangeably as train and test splits for each other. NorDiaChange covers the time periods related to pre- and post-war events, oil and gas discovery in Norway, and technological developments. The annotation was done using the DURel framework and two large historical Norwegian corpora. NorDiaChange is published in full under a permissive licence, complete with raw annotation data and inferred diachronic word usage graphs (DWUGs).