Named-Entity Recognition on Indonesian Tweets using Bidirectional LSTM-CRF

Deni Cahya Wintaka; Moch Arif Bijaksana; Ibnu Asror

doi:10.1016/j.procs.2019.08.161

JOURNAL ARTICLE

Year: 2019 Journal: Procedia Computer Science Vol: 157 Pages: 221-228 Publisher: Elsevier BV

Abstract

The massive volume of Twitter data makes it a natural target for analysis with Named-Entity Recognition (NER), a sub-task of Information Extraction that recognizes entities in text. Most NER systems are trained on formal text such as news articles and perform poorly when applied to informal text such as tweets. The short length of tweets, together with their informal and messy grammar, makes it difficult to classify the entities of interest. This study builds a model that combines a deep learning approach, Bidirectional Long Short-Term Memory (BLSTM), with a machine learning approach, Conditional Random Field (CRF). The entities identified are Person, Location and Organization. The test corpus comprised 600 Indonesian tweets: 250 formal and 350 informal. The model achieved its best F1 scores when FastText word embeddings were added: 86.13% for formal tweets, 81.17% for informal tweets, and 84.11% for combined tweets.
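As a rough illustration of how an F1 score over Person, Location and Organization entities can be computed, the sketch below scores predicted tag sequences against gold ones at the entity level. It assumes a BIO tagging scheme and exact-match scoring, neither of which is stated in the abstract; the tag names (`PER`, `LOC`, `ORG`), the function names, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); a hypothetical representation
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence (BIO is an assumption)."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match, entity-level F1 over a list of tagged sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = extract_spans(g_tags), extract_spans(p_tags)
        tp += len(g & p)  # a prediction counts only if type and boundaries match
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = [["B-PER", "I-PER", "O", "B-LOC"], ["B-ORG", "O"]]
pred = [["B-PER", "I-PER", "O", "O"], ["B-ORG", "O"]]
score = entity_f1(gold, pred)
print(f"{score:.2%}")
```

In this toy case the prediction finds two of the three gold entities and makes no spurious ones, so precision is 1, recall is 2/3, and F1 is 0.8.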

Keywords:

Computer science; Conditional random field; Named-entity recognition; Natural language processing; Artificial intelligence; Task (project management); Word (group theory); Field (mathematics); F1 score; Information retrieval; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 2.77

Citation Normalized Percentile: 0.92

Topics

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Advanced Text Analysis Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Bidirectional LSTM-CRF for biomedical named entity recognition

Named Entity Recognition on Indonesian Online News Based on Bidirectional LSTM-CRF

Portuguese Named Entity Recognition Using LSTM-CRF

Named-Entity Recognition for Indonesian Language using Bidirectional LSTM-CNNs

LSTM-CRF Models for Named Entity Recognition