BOOK-CHAPTER

PYTHON-BASED IMAGE CAPTIONING USING CNN AND LSTM

Abstract

Vision is our most important sense, and software engineers have long relied on visuals to build more dynamic, intelligent, and accessible software. There are circumstances, however, in which an image alone is not enough. When further context is required, alternative text can work around bandwidth limits and offer a more accessible experience. Manual description falls short in an age when there are simply too many photographs to describe. Deep learning can combine image processing and natural language processing, allowing computers to generate descriptions of images on their own. This service can be offered through a user-friendly web interface where users simply upload the photos they want described. Anyone can then use the deep learning model and its adaptive image description capability through a simple API, with the computationally demanding work abstracted away.

Keywords:
Closed captioning; Computer science; Upload; Python (programming language); Software; Artificial intelligence; Multimedia; Human–computer interaction; Image (mathematics); Computer vision; World Wide Web; Programming language

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.28

Topics

COVID-19 diagnosis using AI
Health Sciences →  Medicine →  Radiology, Nuclear Medicine and Imaging
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.