Paper title
Task Grouping for Multilingual Text Recognition
Paper authors
Paper abstract
Most existing OCR methods focus on alphanumeric characters due to the popularity of English and numbers, as well as their corresponding datasets. When extending the character set to more languages, recent methods have shown that training different scripts with different recognition heads can greatly improve end-to-end recognition accuracy compared to combining the characters of all languages in a single recognition head. However, we postulate that similarities between some languages could allow model parameters to be shared and benefit from joint training. Determining the right language groupings, though, is not immediately obvious. To this end, we propose an automatic method for multilingual text recognition with a task grouping and assignment module based on Gumbel-Softmax, introducing a task grouping loss and a weighted recognition loss so that the recognition models and the grouping module can be trained simultaneously. Experiments on MLT19 lend evidence to our hypothesis that there is a middle ground between combining every task and separating every task, one that achieves a better configuration of task grouping/separation.
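The core mechanism the abstract refers to, assigning each script (task) to a recognition head through a Gumbel-Softmax sample, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the head count, script count, and `assign_logits` variable are hypothetical, and in practice the logits would be learnable parameters updated jointly with the recognition heads.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Sample a relaxed (soft) one-hot vector per row via the
    Gumbel-Softmax trick: add Gumbel noise, then apply a
    temperature-scaled softmax."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise: -log(-log(U)), with epsilons for stability.
    u = rng.uniform(size=logits.shape)
    g = -np.log(-np.log(u + 1e-20) + 1e-20)
    y = (logits + g) / tau
    # Numerically stable softmax over the last axis.
    y = np.exp(y - y.max(axis=-1, keepdims=True))
    return y / y.sum(axis=-1, keepdims=True)

# Hypothetical setup: 7 scripts (tasks) to be grouped into 3 heads.
n_tasks, n_heads = 7, 3
assign_logits = np.zeros((n_tasks, n_heads))  # learnable in practice

# Soft assignment used during training (differentiable w.r.t. logits);
# argmax gives the discrete grouping used at inference time.
soft_assign = gumbel_softmax(assign_logits, tau=0.5)
hard_groups = soft_assign.argmax(axis=-1)
```

During training, the soft assignment lets gradients from a weighted recognition loss flow back into the grouping logits; lowering `tau` pushes the samples toward discrete one-hot assignments.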