Abstract
Automatic image captioning is a challenging deep learning task that combines computer vision, to understand the contents of an image, with natural language generation, to compose a coherent description of that image. Image captioning for English is well developed and highly accurate, with some recent work surpassing human-level performance. Arabic image captioning, however, has received little attention, and the few published papers report relatively low performance. Researchers attribute this to the morphological complexity of the Arabic language and to the lack of large, robust benchmark datasets comparable to those available for English. Our proposed framework uses an improved text preprocessing pipeline that incorporates a word segmenter to alleviate some of the morphological complexity of Arabic. We also build neural network architectures that include techniques not previously explored in the Arabic image captioning literature, such as attention mechanisms and transformers. Our approach outperforms the most recent published work on Arabic image captioning, improving the BLEU-1 score from 33 to 44.3 and the BLEU-4 score from 6 to 15.6.