Stanković, Ranka; Rađenović, Jovana; Škorić, Mihailo; Putnikovic, Marko
Learning word embeddings on large unlabeled corpora has proven effective for many natural language processing tasks. However, these representations can be further improved by incorporating external lexical resources. Previous research has demonstrated that lexical vector representations (embeddings, e.g. dict2vec) trained on both text and lexical data (e.g., WordNet and/or monolingual dictionaries) yield improved results for English. The Serbian WordNet and the Serbian electronic dictionaries available on the Web make it possible to test this approach for Serbian within this project. In this paper, we adapt the original dict2vec project to Serbian language resources. We present the textual, lexical, and vector resources prepared and used for training and evaluation, describe the training pipeline, and discuss preliminary evaluation results. We conclude by outlining ongoing work and future steps.
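As a rough illustration of the kind of dictionary-based supervision dict2vec adds on top of plain corpus training, the sketch below (not the authors' code; the miniature Serbian-style entries are invented placeholders) mines "strong" pairs (two words that appear in each other's definitions) and "weak" pairs (the relation holds in one direction only) from a word-to-definition mapping.

```python
def build_pairs(definitions):
    """Return (strong_pairs, weak_pairs) from a {word: definition} mapping.

    A pair (a, b) is *strong* when a occurs in b's definition and b occurs
    in a's definition; it is *weak* when only one direction holds.
    """
    tokenized = {w: set(d.lower().split()) for w, d in definitions.items()}
    words = list(tokenized)
    strong, weak = set(), set()
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            a_in_b = a in tokenized[b]
            b_in_a = b in tokenized[a]
            if a_in_b and b_in_a:
                strong.add((a, b))
            elif a_in_b or b_in_a:
                weak.add((a, b))
    return strong, weak


if __name__ == "__main__":
    # Hypothetical toy dictionary, for illustration only.
    defs = {
        "reka": "veliki prirodni vodotok koji se uliva u more ili jezero",
        "jezero": "veca stajaca voda okruzena kopnom u koju se cesto uliva reka",
        "more": "velika povrsina slane vode",
    }
    strong, weak = build_pairs(defs)
    print("strong:", strong)   # e.g. {('reka', 'jezero')}
    print("weak:", weak)       # e.g. {('reka', 'more')}
```

In dict2vec-style training, such pairs would then be used as additional positive examples alongside the usual corpus-based context windows.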