Paper Title

Low-Level Physiological Implications of End-to-End Learning of Speech Recognition

Authors

Louise Coppieters de Gibson, Philip N. Garner

Abstract

Current speech recognition architectures perform very well from the point of view of machine learning, and hence of user interaction. This suggests that they are emulating the human biological system well. We investigate whether the inference can be inverted to provide insights into that biological system, in particular the hearing mechanism. Using SincNet, we confirm that end-to-end systems do learn well-known filterbank structures. However, we also show that wider-bandwidth filters are important in the learned structure. Whilst some benefits can be gained by initialising both narrow and wide-band filters, physiological constraints suggest that such filters arise in the mid-brain rather than the cochlea. We show that standard machine learning architectures must be modified to allow this process to be emulated neurally.
