Lexicon and Attention based Handwritten Text Recognition System

Paper title

Lexicon and Attention based Handwritten Text Recognition System

Authors

Kumari, Lalita; Singh, Sukhdeep; Rathore, VVS; Sharma, Anuj

Abstract

Handwritten text recognition, a sub-domain of pattern recognition, is widely studied by the computer vision community because of its room for improvement and its applicability to daily life. With the growth in computing power over the last few decades, neural-network-based systems have contributed heavily to state-of-the-art handwritten text recognizers. In the same direction, we take two state-of-the-art neural network systems and merge the attention mechanism with them. The attention technique has been widely used in neural machine translation and automatic speech recognition and is now being applied to text recognition. In this study, we achieve a 4.15% character error rate and a 9.72% word error rate on the IAM dataset, and a 7.07% character error rate and a 16.14% word error rate on the GW dataset, after merging attention and a word beam search decoder with the existing Flor et al. architecture. For further analysis, we also use a system similar to the Shi et al. neural network system with a greedy decoder and observe a 23.27% improvement in character error rate over the base model.
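The character error rate (CER) and word error rate (WER) quoted in the abstract are commonly defined as the edit distance between the recognized text and the ground truth, divided by the length of the ground truth in characters or words. The following is a minimal sketch under that assumption; the abstract does not state the exact evaluation protocol (for example normalization, or whether the rates are averaged per line or over the whole set), and the function names are illustrative rather than taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    truth = "hello world"
    predicted = "hallo world"
    print(f"CER = {cer(truth, predicted):.4f}")
    print(f"WER = {wer(truth, predicted):.4f}")
```

In this toy pair one of eleven characters and one of two words is wrong, which shows why WER is typically much higher than CER for the same output, as in the figures reported above.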
