Jiyoung Yoon, Muhammad Junaid, Sajid Ali, Jongwuk Lee
AI technology in the legal domain has developed at a rapid pace around the world, but little research has been conducted in the Korean legal field due to language barriers and the high level of expertise required. We make the first attempt at abstractive summarization of Korean legal decision texts and publicly release our collected dataset. We utilize two pretrained language models, BERT2BERT and BART, for our task. Both follow the encoder-decoder Transformer architecture. While BERT2BERT applies BERT pre-training to both the encoder and the decoder, BART combines a BERT-style encoder with a GPT-style decoder. We then evaluate the baseline models and show that, despite the difference in language style, the applied models generate high-quality summaries. We also show that pre-training with both autoencoding and autoregressive objectives yields better performance than using a denoising autoencoder alone.
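The BERT2BERT setup described above (a BERT encoder paired with a BERT-initialized decoder carrying cross-attention) can be sketched with the Hugging Face `EncoderDecoderModel` API. This is a minimal, hedged illustration: the model below is built config-only with randomly initialized weights and deliberately tiny dimensions, which are not the paper's actual settings; in practice one would load pretrained checkpoints, e.g. via `EncoderDecoderModel.from_encoder_decoder_pretrained`.

```python
# Minimal sketch of a BERT2BERT-style encoder-decoder (illustrative sizes,
# random initialization; not the authors' actual configuration).
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

enc_cfg = BertConfig(
    vocab_size=30522, hidden_size=128, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=256,
)
# The decoder reuses the BERT architecture but runs causally and
# cross-attends to the encoder outputs.
dec_cfg = BertConfig(
    vocab_size=30522, hidden_size=128, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=256,
    is_decoder=True, add_cross_attention=True,
)
config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=config)

print(model.config.decoder.is_decoder)            # True
print(model.config.decoder.add_cross_attention)   # True
```

With pretrained weights loaded instead, the same object can be fine-tuned on (decision text, summary) pairs as a standard sequence-to-sequence summarization model.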