JOURNAL ARTICLE

MFinBERT: Multilingual Pretrained Language Model For Financial Domain

Abstract

There has been increasing demand for good semantic representations of text in the financial sector when solving natural language processing tasks in Fintech. Previous work has shown that widely used modern language models trained on general-domain text often perform poorly in this particular domain. There have been attempts to overcome this limitation by introducing domain-specific language models learned from financial text. However, these approaches suffer from a lack of in-domain data, a problem that is further exacerbated for languages other than English. These problems motivate us to develop a simple and efficient pipeline to extract large amounts of financial text from large-scale multilingual corpora such as OSCAR and C4. We conduct extensive experiments on various downstream tasks in three different languages to demonstrate the effectiveness of our approach across a wide range of standard benchmarks.
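The abstract describes a pipeline for extracting financial text from general web corpora such as OSCAR and C4 but does not spell out the mechanism. In its simplest form, such a pipeline could be a keyword-density filter over corpus documents. The sketch below is a minimal illustration under that assumption; the seed lexicon, threshold, and function names (`finance_score`, `filter_financial`) are hypothetical and do not reflect the authors' actual implementation.

```python
import re

# Small illustrative seed lexicon of financial terms (assumption, not the
# paper's actual keyword list).
FINANCE_TERMS = {
    "stock", "bond", "equity", "dividend", "portfolio", "interest",
    "inflation", "revenue", "earnings", "ipo", "hedge", "liquidity",
}

def finance_score(text: str) -> float:
    """Return the fraction of tokens that are financial keywords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in FINANCE_TERMS)
    return hits / len(tokens)

def filter_financial(docs, threshold=0.05):
    """Keep only documents whose keyword density meets the threshold."""
    return [d for d in docs if finance_score(d) >= threshold]

docs = [
    "The central bank raised interest rates, and bond yields and "
    "equity markets reacted.",
    "The recipe calls for two cups of flour and a pinch of salt.",
]
kept = filter_financial(docs)  # only the first document survives
```

A real pipeline over terabyte-scale corpora would stream documents rather than hold them in memory and would likely combine such a lexical filter with a trained domain classifier, but the thresholded keyword-density idea captures the basic shape.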

Keywords:
Computer science, Natural language processing, Artificial intelligence, Language model, Financial domain, Multilingual corpora, Pipeline (software)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.39
References: 31
Citation Normalized Percentile: 0.62

Topics

Topic Modeling
Physical Sciences → Computer Science → Artificial Intelligence
Stock Market Forecasting Methods
Social Sciences → Decision Sciences → Management Science and Operations Research
Natural Language Processing Techniques
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Pretrained multilingual Party model

Benjamin Kiessling

Journal: Zenodo (CERN European Organization for Nuclear Research) Year: 2025
JOURNAL ARTICLE

Distilling a Pretrained Language Model to a Multilingual ASR Model

Kwanghee Choi, Hyung‐Min Park

Journal: Interspeech 2022 Year: 2022 Pages: 2203-2207