Halil İbrahim Okur, Ahmet Sertbaş
In text classification, a sub-task of NLP, the preprocessing and indexing of text have a direct, determining effect on model performance. A review of work on pre-trained models shows that existing studies either adapt models developed for other world languages or retrain the same architectures on Turkish text datasets. Word embedding is considered the most critical step in the text processing problem. The two most popular word embedding methods today are Word2Vec and GloVe, which map the words of a corpus to multidimensional vectors. BERT and ELECTRA, which learn contextual word representations with deep neural network architectures, and FastText, which builds on subword-based embeddings, have recently been used frequently to create pre-trained models. This study presents the use and performance results of pre-trained models on the TTC-3600 and TRT-Haber text sets prepared for the Turkish text classification task. By reusing pre-trained models obtained from large corpora at a one-time training and hardware cost, text classification can be performed with less effort and high performance.
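As a minimal sketch of the approach the abstract describes (not the authors' exact pipeline), the following shows how a pre-trained Turkish BERT can be loaded with a classification head via the Hugging Face transformers API. The checkpoint name "dbmdz/bert-base-turkish-cased", the six-label setup (TTC-3600 uses six news categories), and the sample sentence are illustrative assumptions; the head would still need fine-tuning on the target dataset before its predictions are meaningful.

```python
# Sketch: loading a pre-trained Turkish BERT for sequence classification.
# Checkpoint and label count are assumptions for illustration, not the
# paper's confirmed configuration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "dbmdz/bert-base-turkish-cased"  # assumed checkpoint (BERTurk)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=6,  # TTC-3600 has six news categories
)

# Tokenize a sample Turkish news sentence and run a forward pass.
inputs = tokenizer(
    "Galatasaray deplasmanda 2-0 kazandı.",  # hypothetical sports headline
    return_tensors="pt",
    truncation=True,
    max_length=128,
)
with torch.no_grad():
    logits = model(**inputs).logits

# Index of the highest-scoring class (untrained head: random until fine-tuned).
predicted_class = logits.argmax(dim=-1).item()
print(predicted_class)
```

The classification head added on top of the pre-trained encoder is initialized randomly, which is exactly why the abstract's point holds: the expensive part (the language-model weights) is reused, and only the lightweight head plus a short fine-tuning run on TTC-3600 or TRT-Haber is needed.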