JOURNAL ARTICLE

Memory Tokens: Large Language Models Can Generate Reversible Sentence Embeddings

Abstract

In this work, we observe an interesting phenomenon: it is possible to generate reversible sentence embeddings that allow an LLM to reconstruct the original text exactly, without modifying the model's weights. This is achieved by introducing a special memory token, whose embedding is optimized through training on a fixed sequence. When prompted with this embedding, the model reconstructs the fixed sequence exactly. We evaluate this phenomenon across English and Spanish datasets, sequences of up to approximately 240 tokens, and model scales ranging from 100M to 8B parameters. Notably, Llama 3.1 8B successfully reconstructs all tested sequences. Our findings highlight an interesting capability of LLMs and suggest potential applications in memory-based retrieval, compression, and controlled text generation.
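The mechanism described in the abstract can be sketched with a toy stand-in. In place of a frozen LLM, the sketch below uses a frozen random linear readout per position, and optimizes a single "memory" embedding by gradient descent on cross-entropy until greedy decoding reproduces a fixed token sequence exactly. The decoder, dimensions, and sequence here are purely illustrative assumptions, not the paper's actual setup (which optimizes a special token's input embedding against a real LLM with frozen weights).

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab = 48, 6                 # embedding size and toy vocabulary (illustrative)
target = [3, 1, 4, 1, 5, 2]      # the fixed sequence to memorize

# Frozen "decoder": one random readout matrix per output position.
# Only the memory embedding e is trained; W stays fixed throughout.
W = [rng.normal(scale=0.3, size=(vocab, d)) for _ in target]

e = np.zeros(d)                  # trainable memory-token embedding

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

lr = 0.05
for step in range(3000):
    grad = np.zeros(d)
    for Wt, yt in zip(W, target):
        p = softmax(Wt @ e)
        p[yt] -= 1.0             # d(cross-entropy)/d(logits) = softmax - onehot
        grad += Wt.T @ p
    e -= lr * grad               # gradient step on the embedding only

# Greedy decoding from the optimized embedding recovers the sequence.
decoded = [int(np.argmax(Wt @ e)) for Wt in W]
print(decoded)
```

Because the loss is convex in `e` and the toy problem is overparameterized (48 embedding dimensions against 30 ranking constraints), gradient descent reliably drives the decoded sequence to match the target, mirroring the paper's observation that a single optimized embedding can store a sequence exactly.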

Keywords:
Computer science · Natural language processing · Sentence · Language model · Artificial intelligence

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
References: 0
Citation Normalized Percentile: 0.12

Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)
Text Readability and Simplification (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

Tokens, embeddings and prompts: welcome to the world of large language models

Poulain, Pierre

Journal: Zenodo (CERN European Organization for Nuclear Research), Year: 2024
BOOK-CHAPTER

From Sentence Embeddings to Large Language Models to Detect and Understand Wordplay

Ryan Rony Dsilva

Journal: Lecture Notes in Computer Science, Year: 2024, Pages: 205-214