Paper Title

Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks

Authors

Rui-Yang Ju, Yu-Shian Lin, Yanlin Jin, Chih-Chia Chen, Chun-Tse Chien, Jen-Shiun Chiang

Abstract

Efficiently extracting text from the background of degraded color document images is an important challenge in the preservation of ancient manuscripts. Imperfect preservation has led to different types of degradation over time, such as page yellowing, staining, and ink bleeding, which seriously affect the results of document image binarization. This work proposes an effective three-stage network method for image enhancement and binarization of degraded documents using generative adversarial networks (GANs). Specifically, in Stage-1, we first split the input images into multiple patches, and then split each patch into four single-channel patch images (gray, red, green, and blue). The three color-channel patch images (red, green, and blue) are then processed by the discrete wavelet transform (DWT) with normalization. In Stage-2, we use four independent generators to separately train GAN models on the four channels of the processed patch images to extract color foreground information. Finally, in Stage-3, we train two independent GAN models on the outputs of Stage-2 and on the resized original input images (512×512), as the local and global predictions, to obtain the final outputs. Experimental results show that the proposed method achieves Avg-Score values of 77.64, 77.95, 79.05, 76.38, 75.34, and 77.00 on the (H)-DIBCO 2011, 2013, 2014, 2016, 2017, and 2018 datasets, respectively, which are at the state-of-the-art level. The implementation code for this work is available at https://github.com/abcpp12383/ThreeStageBinarization.
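
The Stage-1 preprocessing described in the abstract (patch splitting, channel splitting, then DWT with normalization on the color channels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state the wavelet family, which subband is kept, the patch size, or the normalization, so the sketch assumes a single-level Haar transform that keeps only the approximation (LL) subband, a hypothetical patch size of 224, min-max scaling to [0, 255], and ITU-R BT.601 weights for the gray channel. All function names are hypothetical.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Cut an H x W x 3 image into non-overlapping square patches.

    Border regions that do not fill a whole patch are dropped
    (an assumption; padding would be another option).
    """
    h, w = image.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches


def haar_ll(channel):
    """Single-level 2D orthonormal Haar DWT, approximation (LL) subband only.

    Each output value is the sum of a 2x2 block divided by 2.
    Height and width must be even.
    """
    c = channel.astype(np.float64)
    return (c[0::2, 0::2] + c[0::2, 1::2] + c[1::2, 0::2] + c[1::2, 1::2]) / 2.0


def normalize(band):
    """Min-max scale a subband to the range [0, 255] (assumed normalization)."""
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.zeros_like(band)
    return (band - lo) / (hi - lo) * 255.0


def stage1(image, patch_size=224):
    """Return, per patch, four single-channel images: gray, red, green, blue.

    Only the three color channels go through DWT + normalization,
    as the abstract describes; the gray channel is left untouched.
    """
    results = []
    for patch in split_into_patches(image, patch_size):
        r = patch[..., 0].astype(np.float64)
        g = patch[..., 1].astype(np.float64)
        b = patch[..., 2].astype(np.float64)
        gray = 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luma weights (assumed)
        results.append({
            "gray": gray,
            "red": normalize(haar_ll(r)),
            "green": normalize(haar_ll(g)),
            "blue": normalize(haar_ll(b)),
        })
    return results
```

Note that under these assumptions the LL subband is half the patch size in each dimension, so a real pipeline would need to resize it or pick the patch size accordingly before feeding the Stage-2 generators; the sketch leaves that step out.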
