
MCG-NJU/SORCE

SORCE: Small Object Retrieval in Complex Environments

Chunxu Liu*, Chi Xie*, Xiaxu Chen, Feng Zhu, Rui Zhao, Limin Wang
Nanjing University,   SenseTime Research

Overview

TL;DR. We introduce the Small Object Retrieval in Complex Environments (SORCE) task, a new subfield of text-to-image retrieval (T2IR) that focuses on retrieving small objects in complex images.

We introduce a new dataset, SORCE-1K, comprising 1,023 image-text pairs in which each caption describes only a localized object region. This design explicitly avoids providing contextual clues from the broader scene, thereby preventing models from exploiting shortcut cues.

Additionally, we demonstrate that with the use of simple yet effective Regional Prompts (ReP), multimodal large language models (MLLMs) can accurately attend to and embed the corresponding image regions. Our fine-tuned models are available for evaluation here.

Dataset Preparation

Please download the SORCE-1K dataset from Hugging Face and place it in the datasets folder.

mkdir datasets
huggingface-cli download --repo-type dataset --resume-download lcxrocks/sorce-1k --local-dir ./datasets/sorce-1k

Environment Setup

Please make sure the transformers version is compatible.

conda create -n sorce python=3.11
conda activate sorce
pip install -r requirements.txt
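To check the transformers version after installation, a small script along the following lines may help. It is a minimal sketch, not part of the repository: the helper names `installed_version` and `pinned_requirement` are hypothetical, and it assumes that requirements.txt sits in the current directory and lists transformers with a version specifier.

```python
import re
from importlib import metadata
from pathlib import Path
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pinned_requirement(requirements_text: str, package: str) -> Optional[str]:
    """Return the version specifier given for `package` in a requirements file, if any."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^([A-Za-z0-9_.\-]+)\s*(.*)$", line)
        if match and match.group(1).lower() == package.lower():
            return match.group(2).strip() or None
    return None


if __name__ == "__main__":
    # Assumption: requirements.txt is in the current working directory.
    req = Path("requirements.txt")
    spec = pinned_requirement(req.read_text(), "transformers") if req.exists() else None
    print("requirements.txt specifies: transformers", spec or "(no specifier found)")
    print("installed:", installed_version("transformers") or "not installed")
```

If the two lines disagree, re-running the pip install command inside the activated sorce environment is the first thing to try.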

Evaluation

To evaluate the model, please run the following command, which downloads the pretrained model from 🤗 Hugging Face.

bash dist_eval.sh
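This README does not state which metrics dist_eval.sh reports. For orientation only, the sketch below computes Recall@K, a metric commonly used for text-to-image retrieval, from a text-by-image similarity matrix. The function name `recall_at_k` and the toy scores are illustrative assumptions and are not taken from the repository.

```python
from typing import Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    targets: Sequence[int],
    k: int,
) -> float:
    """Fraction of text queries whose ground-truth image ranks in the top k.

    similarity[i][j] is the score between text query i and image j;
    targets[i] is the index of the ground-truth image for query i.
    """
    hits = 0
    for scores, target in zip(similarity, targets):
        # Rank image indices by descending similarity to this query.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Toy example: 3 text queries scored against 3 images (made-up values).
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.7, 0.5],
]
targets = [0, 1, 2]  # query i should retrieve image i
print(recall_at_k(sim, targets, k=1))
print(recall_at_k(sim, targets, k=2))
```

In the toy example only the first query ranks its target image first, while all three targets fall within the top two.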

Citation

If you find this project helpful in your research or applications, please feel free to leave a star⭐️ and cite our paper:


@misc{liu2025sorcesmallobjectretrieval,
      title={SORCE: Small Object Retrieval in Complex Environments}, 
      author={Chunxu Liu and Chi Xie and Xiaxu Chen and Wei Li and Feng Zhu and Rui Zhao and Limin Wang},
      year={2025},
      eprint={2505.24441},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.24441}, 
}

License and Acknowledgement

This project is released under the Apache 2.0 license. The code is based on E5-V; please also follow its license. Thanks to the authors for their awesome work!
