2025 : 4 : 21
Hassan Khotanlou

Academic rank: Professor
Education: PhD.
ScopusId: 14015911600
Faculty: Faculty of Engineering

Research

Title
Improving Image Captioning with Local Attention Mechanism
Type
Presentation
Keywords
component, deep learning, image captioning, attention mechanism, encoder-decoder
Year
2022
Researchers: Hassan Khotanlou

Abstract

Image caption generation is a field of research at the intersection of machine vision and natural language processing. Existing results indicate that it is difficult for a machine to understand an image the way a human does. Most methods proposed for automatic image description generation follow the encoder-decoder framework. In these methods, the word at step n is generated from the image features and the previously generated words. Recently, the attention mechanism, which usually creates a spatial map highlighting the image regions associated with each word, has been widely used in research. This paper also uses the encoder-decoder framework. The encoder of our model uses ResNet101 to extract image features, and the decoder consists of three parts: Attention-LSTM, Language-LSTM, and Attention-Layer. The paper uses an attention mechanism that draws on local evidence to better represent image features. Our method was able to generate good captions and also improved the METEOR and ROUGE evaluation metrics.
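The attention step described in the abstract can be illustrated with a minimal sketch of generic soft attention. It is not the paper's local attention mechanism, whose details the abstract does not give. The dot-product scoring, the function names `softmax` and `attend`, and the toy two-dimensional features are all assumptions made for illustration. In a real model the region features would come from an encoder such as ResNet101 and the hidden state from the decoder LSTM.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """Generic soft attention (assumed dot-product scoring, hypothetical helper).

    region_features: list of feature vectors, one per image region.
    hidden_state: decoder state vector with the same dimension.
    Returns the attention weights and the weighted-sum context vector.
    """
    # Score each region by its similarity to the decoder state.
    scores = [
        sum(f * h for f, h in zip(region, hidden_state))
        for region in region_features
    ]
    weights = softmax(scores)
    dim = len(region_features[0])
    # Context vector: weighted average of the region features.
    context = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
print(weights)
print(context)
```

In this toy input the first and third regions align with the decoder state, so they receive equal weight and the second region receives less. The context vector is what a decoder would combine with the previously generated words to predict the next word.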