Named Entity Recognition (NER) is an important task in Natural Language Processing with applications in many domains. While sequence labelling remains the dominant paradigm for NER, span-based approaches have recently become popular but are less well understood. In this work, we study several aspects of span-based NER: the span representation, the learning strategy, and the decoding algorithms used to avoid span overlap. We also propose an exact algorithm that efficiently finds the set of non-overlapping spans that maximizes a global score, given a list of candidate spans. We conduct our study on three benchmark NER datasets from different domains. We make our code publicly available at https://github.com/urchade/span-structured-prediction.
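To make the decoding problem concrete, the sketch below selects a maximum-score set of non-overlapping spans from a candidate list. It uses the textbook weighted interval scheduling dynamic program, which is one standard exact method for this kind of problem and is not necessarily the algorithm proposed in the paper. The function name `best_non_overlapping`, the `(start, end, score)` tuple layout with inclusive token indices, and the use of a sum of span scores as the global score are all assumptions made for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical span layout: (start, end, score), token indices inclusive.
Span = Tuple[int, int, float]


def best_non_overlapping(spans: List[Span]) -> Tuple[float, List[Span]]:
    """Return the best total score and one optimal set of non-overlapping spans.

    Assumes the global score is the sum of the selected span scores.
    """
    # Sort by end index so that all spans compatible with span i form a prefix.
    spans = sorted(spans, key=lambda s: s[1])
    ends = [s[1] for s in spans]
    n = len(spans)

    best = [0.0] * (n + 1)   # best[i]: optimum using only the first i spans
    take = [False] * (n + 1)  # take[i]: whether span i is used in best[i]
    prev = [0] * (n + 1)      # prev[i]: number of earlier spans ending before span i starts

    for i, (start, _end, score) in enumerate(spans, start=1):
        # Count earlier spans whose end index is strictly less than this start.
        p = bisect_right(ends, start - 1, 0, i - 1)
        prev[i] = p
        with_span = best[p] + score
        if with_span > best[i - 1]:
            best[i] = with_span
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Backtrack to recover one optimal selection.
    chosen: List[Span] = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(spans[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


candidates = [(0, 1, 1.0), (1, 2, 0.75), (2, 3, 0.5), (0, 3, 1.25)]
total, selected = best_non_overlapping(candidates)
print(total, selected)
```

In this toy input, the two short spans (0, 1) and (2, 3) together score 1.5, which beats the single long span (0, 3) at 1.25. Sorting dominates the cost, so the sketch runs in O(n log n) time for n candidate spans.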
Jinlan Fu, Xuanjing Huang, Pengfei Liu
Urchade Zaratiana, Niama Elkhbir, Pierre Holat, Nadi Tomeh, Thierry Charnois
Jianfeng Deng, R. P. Zhao, Wei Ye, Shuai Zheng
Yoann Dupont, Marco Dinarelli, Isabelle Tellier, Christian Lautier