    • Deep Reinforcement Learning for Query-Conditioned Video Summarization

      Zhang, Yujia; Kampffmeyer, Michael C.; Zhao, Xiaoguang; Tan, Min (Journal article; Peer reviewed, 2019-02-21)
      Query-conditioned video summarization requires (1) finding a diverse set of video shots/frames that are representative of the whole video and (2) ensuring that the selected shots/frames are related to a given query. It can thus be tailored to different user interests, leading to a better personalized summary, and it differs from generic video summarization, which focuses only on video content. Our work targets ...
    • PSAIR: A Neuro-Symbolic Approach to Zero-Shot Visual Grounding 

      Pan, Yi; Zhang, Yujia; Kampffmeyer, Michael Christian; Zhao, Xiaoguang (Book chapter, 2024-09-09)
      Supervised methods for visual grounding often require costly annotations of paired sentences and images with ground-truth boxes. Recent zero-shot approaches to visual grounding, such as ReCLIP and ChatRef, aim to avoid this annotation cost. However, these approaches rely on an inflexible detect-then-reasoning paradigm, which leads to a ...