TimeZero: Temporal Video Grounding with Reasoning-Guided LVLM

Created by MG96

External Public cs.CV cs.AI cs.CL

Authors

Ye Wang, Boshen Xu, Zihao Yue, Zihan Xiao, Ziheng Wang, Liang Zhang, Dingyi Yang, Wenxuan Wang, Qin Jin
Project Resources

ArXiv Paper (type: Paper; source: arXiv)
Abstract

We introduce TimeZero, a reasoning-guided LVLM designed for the temporal video grounding (TVG) task. This task requires precisely localizing relevant video segments within long videos based on a given language query. TimeZero tackles this challenge by extending the inference process, enabling the model to reason about video-language relationships solely through reinforcement learning. To evaluate the effectiveness of TimeZero, we conduct experiments on two benchmarks, where TimeZero achieves state-of-the-art performance on Charades-STA. Code is available at https://github.com/www-Ye/TimeZero.
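To make the task concrete: a grounding prediction is a (start, end) segment in seconds, and it is commonly scored against the annotated segment by temporal intersection-over-union (IoU). The abstract does not say which metric or thresholds TimeZero uses, so the sketch below is an illustrative assumption, not the authors' evaluation code; the function names `temporal_iou` and `recall_at_iou` and the 0.5 default threshold are hypothetical.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec); hypothetical type alias


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection-over-union of two time segments, in [0, 1]."""
    p_start, p_end = pred
    g_start, g_end = gt
    # Overlap length is clamped at zero for disjoint segments.
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(preds: Sequence[Segment],
                  gts: Sequence[Segment],
                  threshold: float = 0.5) -> float:
    """Fraction of queries whose single prediction reaches the IoU threshold.

    Assumes one prediction per query, paired by position with its ground truth.
    """
    if not gts:
        return 0.0
    hits = sum(temporal_iou(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(gts)


if __name__ == "__main__":
    # Made-up segments for illustration only; not results from the paper.
    predictions = [(2.0, 8.0), (0.0, 1.0)]
    ground_truth = [(4.0, 10.0), (5.0, 6.0)]
    print(temporal_iou(predictions[0], ground_truth[0]))
    print(recall_at_iou(predictions, ground_truth, threshold=0.5))
```

A segment-overlap score of this kind could also serve as a reward signal in reinforcement learning, but whether TimeZero does so is not stated in the abstract.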

Note:

No note available for this project.

Contact:

No contact available for this project.
