Memes can sway people's opinions over social media as they combine visual and
textual information in an easy-to-consume manner. Since memes can go viral
almost instantly, it is crucial to infer their intent and potential
harmfulness so that timely measures can be taken. A common problem associated with
meme comprehension lies in detecting the entities referenced and characterizing
the role of each of these entities. Here, we aim to understand whether the meme
glorifies, vilifies, or victimizes each entity it refers to. To this end, we
address the task of role identification of entities in harmful memes, i.e.,
detecting who is the 'hero', the 'villain', and the 'victim' in the meme, if
any. We utilize HVVMemes, a dataset of memes on US politics and Covid-19
released recently as part of the CONSTRAINT@ACL-2022 shared task. It contains
memes, entities referenced, and their associated roles: hero, villain, victim,
and other. We further design VECTOR (Visual-semantic role dEteCToR), a robust
multi-modal framework for the task, which integrates entity-based contextual
information into the multi-modal representation, and we compare it to several
standard unimodal (text-only or image-only) and multi-modal (image+text) models.
Our experimental results show that our proposed model achieves an improvement
of 4% over the best baseline and 1% over the best competing stand-alone
submission from the shared task. Besides presenting an extensive experimental
setup with comparative analyses, we highlight the challenges
encountered in addressing the complex task of semantic role labeling within
memes.