Paper Title

MEG-MASC: a high-quality magneto-encephalography dataset for evaluating natural speech processing

Authors

Laura Gwilliams, Graham Flick, Alec Marantz, Liina Pylkkanen, David Poeppel, Jean-Remi King

Abstract

The "MEG-MASC" dataset provides a curated set of raw magnetoencephalography (MEG) recordings of 27 English speakers who listened to two hours of naturalistic stories. Each participant completed two identical sessions, in which they listened to four fictional stories from the Manually Annotated Sub-Corpus (MASC), intermixed with random word lists and comprehension questions. We time-stamp the onset and offset of each word and phoneme in the metadata of the recording, and organize the dataset according to the 'Brain Imaging Data Structure' (BIDS). This data collection provides a suitable benchmark for large-scale encoding and decoding analyses of temporally resolved brain responses to speech. We provide the Python code to replicate several validation analyses of the MEG event-related fields, such as the temporal decoding of phonetic features and word frequency. All code and MEG, audio and text data are publicly available, in keeping with best practices in transparent and reproducible research.
