Abstract: As a fundamental and challenging task in bridging language and vision domains, Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the ...
Abstract: Text-to-image person re-identification (TIReID) is a cross-modal retrieval task that aims to retrieve target person images based on a given text description. Existing methods primarily focus ...