BOOK-CHAPTER

Assessing Natural Language Processing

Yvette Graham

Year: 2021
Series: Educational Research and Innovation
Publisher: Organisation for Economic Co-operation and Development (OECD)

Abstract

This chapter details evaluation techniques in natural language processing, a challenging sub-discipline of artificial intelligence (AI). It highlights proven methods for producing fair and replicable evaluations of system performance, as well as methods for longitudinal evaluation and for comparison with human performance. It recaps pitfalls to avoid when applying these techniques to new areas. In addition to direct measurement and comparison of system and human performance on individual tasks, the chapter reflects on the degree to which tasks are shared between humans and machines, on scalability, and on the potential for malicious application. Finally, it discusses the applicability of human intelligence tests to AI systems and summarises considerations for devising a general framework for assessing AI and robotics.

Keywords:
Computer science; Artificial intelligence; Task; Scalability; Robotics; Natural language; Human–computer interaction; Machine learning; Robot; Engineering; Systems engineering

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
References: 492
Citation Normalized Percentile: 0.25

Topics

Explainable Artificial Intelligence (XAI)
Physical Sciences → Computer Science → Artificial Intelligence
Adversarial Robustness in Machine Learning
Physical Sciences → Computer Science → Artificial Intelligence
Ethics and Social Impacts of AI
Social Sciences → Social Sciences → Safety Research