JOURNAL ARTICLE

Measuring Text-to-SQL Semantic Parsing Models on Question Generalizability

Abstract

One of the central challenges in NLP tasks such as text-to-SQL semantic parsing is generalization. In the text-to-SQL task, separating training and testing data measures one aspect of generalization: how well the model generalizes to unseen databases. Other aspects, however, remain unaccounted for. We propose a new dataset and a more challenging, thorough evaluation process that targets two generalization challenges for text-to-SQL models: database content references and question patterns. We create SPIDER-QG, an augmented dataset built with three techniques, to assess generalizability. First, we replace the values in the existing test set with other values from the same column of the same database. Second, we instead replace each value with one of its synonyms. Third, we generate new questions for the existing SQL queries by back-translating the original questions. Our evaluation setup demonstrates the generalization challenges that current models struggle with.
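The three augmentation strategies above can be sketched in a few lines of Python. This is a hypothetical illustration only: the function names, the synonym lookup, and the example question are assumptions for demonstration, not the paper's actual code or data.

```python
# Illustrative sketch of the three SPIDER-QG-style augmentations.
# All names and data here are hypothetical, not from the paper's implementation.

SYNONYMS = {"USA": "United States"}  # assumed value-to-synonym lookup table


def replace_value(question, sql, old, new):
    """Strategy 1: swap a database value for another value from the same
    column, updating both the question and the SQL query consistently."""
    return question.replace(old, new), sql.replace(old, new)


def synonym_value(question, old):
    """Strategy 2: replace a value mention with a synonym in the question
    only; the gold SQL query is left unchanged."""
    return question.replace(old, SYNONYMS.get(old, old))


def back_translate(question, translate):
    """Strategy 3: paraphrase the question via round-trip translation.
    `translate(text, target_lang)` is a user-supplied translation function."""
    return translate(translate(question, "de"), "en")


q = "How many singers are from USA?"
sql = "SELECT count(*) FROM singer WHERE country = 'USA'"
print(replace_value(q, sql, "USA", "France"))
print(synonym_value(q, "USA"))
```

Strategies 1 and 2 differ in whether the SQL changes: a new value alters the query's literal, while a synonym tests whether the model can still ground the mention to the original database value.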

Keywords:
Computer science, Generalizability theory, SQL, Parsing, Artificial intelligence, Natural language processing, Generalization, Set (abstract data type), Information retrieval, Programming language


Topics

Natural Language Processing Techniques
Topic Modeling
Software Engineering Research