Owing to the large amounts of digital data now available, many information-extraction tools have been developed to provide meaningful information and knowledge for text analysis and interpretation. Machine learning, artificial intelligence, and data mining contribute substantially to this task. In this paper, a program for extracting address entities is presented as a named entity recognition task. The dataset consists of USA addresses whose entities are each assigned one of 8 labels. The model is trained in Python with TensorFlow using pretrained word vectors taken from GloVe (Global Vectors for Word Representation). The algorithm used is long short-term memory (LSTM), a special type of recurrent neural network. It is well suited to this application because it takes the context of the input data into account. With this algorithm, the model was able to learn how later entities relate to earlier ones and thus resolve some complex cases, such as differentiating between a city and a state with the same name.
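To illustrate how an LSTM carries context from earlier tokens to later ones, the following is a minimal sketch of a single LSTM cell written in plain Python. It is not the paper's TensorFlow model: the weights are random rather than trained, and the three-dimensional vectors in `EMBED` are made-up stand-ins for GloVe embeddings. The helper names (`make_lstm_params`, `lstm_step`, `run_lstm`) are hypothetical and chosen only for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(input_dim, hidden_dim, seed=0):
    """Random (untrained) weights for the four LSTM gates: i, f, o, g."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: {"W": matrix(hidden_dim, input_dim + hidden_dim), "b": [0.0] * hidden_dim}
        for gate in "ifog"
    }


def lstm_step(params, x, h_prev, c_prev):
    """One LSTM time step: returns the new hidden state h and cell state c."""
    z = x + h_prev  # concatenate the input vector with the previous hidden state

    def affine(gate):
        W, b = params[gate]["W"], params[gate]["b"]
        return [sum(w * v for w, v in zip(row, z)) + bias for row, bias in zip(W, b)]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    o = [sigmoid(a) for a in affine("o")]    # output gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell values

    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def run_lstm(params, vectors, hidden_dim):
    """Feed a sequence of input vectors through the cell, collecting hidden states."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    states = []
    for x in vectors:
        h, c = lstm_step(params, x, h, c)
        states.append(h)
    return states


# Made-up 3-dimensional vectors standing in for pretrained GloVe embeddings.
EMBED = {
    "seattle": [0.9, 0.1, 0.0],
    "washington": [0.5, 0.5, 0.1],
}
HIDDEN = 4
params = make_lstm_params(input_dim=3, hidden_dim=HIDDEN)

# The same token "washington" seen with and without a preceding city token.
first_token = run_lstm(params, [EMBED["washington"]], HIDDEN)[-1]
after_city = run_lstm(params, [EMBED["seattle"], EMBED["washington"]], HIDDEN)[-1]

print("washington as first token:", first_token)
print("washington after seattle: ", after_city)
```

The two printed hidden states differ even though the input vector for "washington" is identical, because the second run starts from the state left behind by "seattle". In a trained tagger, a classification layer over these hidden states could use that difference to choose between a city label and a state label.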