JOURNAL ARTICLE

Named-Entity Recognition on Indonesian Tweets using Bidirectional LSTM-CRF

Deni Cahya Wintaka, Moch Arif Bijaksana, Ibnu Asror

Year: 2019   Journal: Procedia Computer Science   Vol: 157   Pages: 221-228   Publisher: Elsevier BV

Abstract

The massive amount of Twitter data makes it a natural target for analysis with Named-Entity Recognition (NER), a sub-task of Information Extraction that recognizes entities in text. Most NER systems are trained on formal text such as news articles and perform poorly when applied to informal text such as tweets. The limited number of words and the informal, messy grammar of tweets make it difficult to classify the required entities. In this study, a model was built that combines deep learning and machine learning approaches, namely Bidirectional Long Short-Term Memory (BLSTM) and Conditional Random Field (CRF). The entities identified are Person, Location, and Organization. The test corpus comprised 600 Indonesian tweets: 250 formal and 350 informal. The model achieved its best F1 scores when FastText word embeddings were added: 86.13% for formal tweets, 81.17% for informal tweets, and 84.11% for combined tweets.
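
The abstract reports F1 scores but does not say how they were computed. As an illustration only, the sketch below shows one common convention for NER, exact-match entity-level F1 over BIO-tagged sequences. The BIO scheme, the tag names (B-PER, I-PER, B-LOC), and the helper names extract_spans and entity_f1 are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); hypothetical representation
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))  # the open span ends before position i
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]  # a B- tag always opens a new span
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match entity-level F1 over a corpus of tag sequences."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)  # same boundaries and same type
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction finds the Person entity but misses the Location.
gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]
print(round(entity_f1(gold, pred), 4))
```

In this toy case precision is 1 of 1 and recall is 1 of 2, so F1 is 2/3. Token-level scoring or partial-match credit would give different numbers, which is why the scoring convention matters when comparing reported results.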

Keywords:
Computer science; Conditional random field; Named-entity recognition; Natural language processing; Artificial intelligence; Task (project management); Word (group theory); Field (mathematics); F1 score; Information retrieval; Linguistics

Metrics

Cited By: 41
FWCI (Field Weighted Citation Impact): 2.77
Refs: 13
Citation Normalized Percentile: 0.92

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Advanced Text Analysis Techniques (Physical Sciences → Computer Science → Artificial Intelligence)