Abstract
Scene graph generation is the task of producing a structured description of an image: a graph of the
objects it contains and the relationships between them. Major industry applications include image
captioning, visual question answering, and graphics detection in the gaming industry. Extracting this
graph representation, essentially a set of triplets, is a challenging task in computer vision. The most
prevalent problem in scene graph generation at present is severe training bias: models favor the more
frequently occurring entities, and context interferes with the actual image content. In this work, we
combine two existing state-of-the-art methods. We first bridge knowledge graphs with scene graphs to
counter the long-tail distribution and add context to scene graph generation. We then use counterfactual
analysis to remove the contextual bias introduced by the addition of external knowledge.
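
To make the two ideas concrete, the sketch below shows a scene graph stored as triplets and one simple form of counterfactual adjustment: subtracting the scores a model produces from context alone from the scores it produces with the full input. This is an illustrative assumption, not the method used in this work; the function name `debias`, the triplet layout, and all scores are hypothetical toy values.

```python
# Toy scene graph: each edge is a (subject, predicate, object) triplet.
# The entries are made up for illustration and do not come from the paper.
scene_graph = [
    ("man", "riding", "horse"),
    ("horse", "on", "grass"),
]

def debias(factual, counterfactual):
    """Subtract context-only scores from full-input scores, per predicate.

    `factual` holds scores computed from the real image plus context;
    `counterfactual` holds scores computed with the visual evidence removed,
    so it captures only what the context (the bias) would predict.
    """
    return {p: factual[p] - counterfactual.get(p, 0.0) for p in factual}

# Hypothetical predicate scores for the pair (man, horse).
factual = {"on": 2.0, "riding": 1.8, "near": 0.5}
counterfactual = {"on": 1.5, "riding": 0.4, "near": 0.3}

adjusted = debias(factual, counterfactual)
biased_choice = max(factual, key=factual.get)      # frequent predicate wins
debiased_choice = max(adjusted, key=adjusted.get)  # content-driven predicate wins
```

In this toy case the frequent predicate "on" has the highest raw score, while the more informative "riding" has the highest score once the context-only contribution is removed.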