Title
Sophisticated deep learning with on-chip optical diffractive tensor processing
Authors
Abstract
The ever-growing deep learning technologies are making revolutionary changes to modern life. However, conventional computing architectures are designed to process sequential and digital programs and are heavily burdened when executing massively parallel and adaptive deep learning applications. Photonic integrated circuits provide an efficient approach to mitigating the bandwidth limitations and the power wall of their electronic counterparts, showing great potential for ultrafast and energy-free high-performance computing. Here, we propose an optical computing architecture enabled by on-chip diffraction to implement convolutional acceleration, termed the optical convolution unit (OCU). We demonstrate that any real-valued convolution kernel can be realized by the OCU, with a prominent boost in computational throughput, via the concept of structural re-parameterization. With the OCU as the fundamental unit, we build an optical convolutional neural network (oCNN) to implement two popular deep learning tasks: classification and regression. For classification, the Fashion-MNIST and CIFAR-4 datasets are tested with accuracies of 91.63% and 86.25%, respectively. For regression, we build an optical denoising convolutional neural network (oDnCNN) to handle Gaussian noise in grayscale images with noise levels σ = 10, 15, and 20, yielding clean images with average PSNRs of 31.70 dB, 29.39 dB, and 27.72 dB, respectively. The proposed OCU presents remarkable performance in terms of low energy consumption and high information density owing to its fully passive nature and compact footprint, providing a highly parallel yet lightweight solution for future computing architectures to handle high-dimensional tensors in deep learning.
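The role of structural re-parameterization can be illustrated with a short, self-contained sketch. The following NumPy example is not the paper's implementation; it assumes (as an illustration only) that a set of fixed, passive base kernels is combined through trainable scalar weights, and shows that, by the linearity of convolution, such a multi-branch form folds into a single equivalent real-valued kernel that one convolution pass can realize. All names (conv2d, base_kernels, weights) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(img, k):
    """Plain 'valid' 2-D sliding-window correlation, used here as the convolution primitive."""
    H, W = img.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

# Hypothetical fixed (passive) base kernels and trainable scalar weights.
base_kernels = [rng.standard_normal((3, 3)) for _ in range(4)]
weights = rng.standard_normal(4)

img = rng.standard_normal((8, 8))

# Multi-branch form: weighted sum of the per-kernel outputs.
branch_sum = sum(w * conv2d(img, k) for w, k in zip(weights, base_kernels))

# Re-parameterized form: fold the weights into one equivalent kernel first,
# then perform a single convolution pass.
merged_kernel = sum(w * k for w, k in zip(weights, base_kernels))
single_pass = conv2d(img, merged_kernel)

print(np.allclose(branch_sum, single_pass))  # True: the two forms are numerically equivalent
```

The equivalence holds for any real-valued weights, which is the sense in which a fixed set of base kernels plus re-parameterization can reproduce an arbitrary real-valued kernel in this simplified setting.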