In recent years, recognition of regular scene text has made great progress, but recognizing irregular text remains difficult. Most current methods treat text detection and text recognition as two separate tasks. To better recognize irregular text, this paper proposes an end-to-end scene text recognition method based on a Transformer model. The method not only uses the attention mechanism for decoding, but also introduces a network for rectifying the input image and extends the model with a bidirectional decoder. To evaluate the model, experiments are carried out on datasets including SVT and ICDAR 2013. The results indicate that the method strikes a reasonable balance between complexity and accuracy and offers clear performance advantages.
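The abstract does not state how the outputs of the two decoding directions are combined. As a minimal sketch of one common choice, assuming each direction yields a string plus per-character log-probabilities, the hypothesis with the higher length-normalised score can be kept. The names `Hypothesis`, `sequence_score`, and `merge_bidirectional` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed format: (decoded text, per-character log-probabilities).
Hypothesis = Tuple[str, List[float]]


def sequence_score(log_probs: List[float]) -> float:
    """Length-normalised log-probability; an empty hypothesis scores -inf."""
    if not log_probs:
        return float("-inf")
    return sum(log_probs) / len(log_probs)


def merge_bidirectional(l2r: Hypothesis, r2l: Hypothesis) -> str:
    """Keep the more confident of the two decoding directions.

    Assumption: the right-to-left decoder emits characters in reverse
    reading order, so its text is flipped before being returned.
    """
    l2r_text, l2r_log_probs = l2r
    r2l_text, r2l_log_probs = r2l
    r2l_text = r2l_text[::-1]  # restore normal reading order

    if sequence_score(l2r_log_probs) >= sequence_score(r2l_log_probs):
        return l2r_text
    return r2l_text


if __name__ == "__main__":
    # The left-to-right pass misreads "O" as "0" with low confidence,
    # so the right-to-left hypothesis is kept.
    left = ("H0TEL", [-0.9, -1.2, -0.8, -0.9, -0.7])
    right = ("LETOH", [-0.2, -0.1, -0.2, -0.3, -0.2])
    print(merge_bidirectional(left, right))
```

Selecting a whole sequence rather than mixing characters from both directions keeps the sketch simple; the model described here may fuse the two decoders differently.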
Leena Mary Francis, K. C. Visalatchi, N. Sreenath
Bayan M. Albalawi, Amani Jamal, Lama Al Khuzayem, Olaa A. Alsaedi
Yi Zhang, Zhiwen Li, Lei Guo, Wenbi Rao
Die Luo, YU Qiuhui, Chengwan He, Wei Chen