STVG (VidSTG, CVPR 2020)

2025. 1. 21. 15:41·DL·ML/Paper

https://arxiv.org/abs/2001.06891

 

 

Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences

In this paper, we consider a novel task, Spatio-Temporal Video Grounding for Multi-Form Sentences (STVG). Given an untrimmed video and a declarative/interrogative sentence depicting an object, STVG aims to localize the spatio-temporal tube of the queried o


 

Abstract

  • Proposes the STVG task
  • Introduces the VidSTG dataset
  • Proposes the STGRN architecture

 

Motivation

(1) Localize a spatio-temporal object tube in an untrimmed video. The queried object is described in terms of a specific action it is performing.

(2) The object is referred to not only with declarative sentences but also with interrogative ones.
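
To make the task's input and output concrete, here is a minimal sketch of the data involved. The class names `Tube` and `Query`, the `(x1, y1, x2, y2)` box layout, and the example sentences are all my own assumptions for illustration; they are not taken from the paper or from the VidSTG annotation format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Tube:
    """Hypothetical container for a spatio-temporal tube: one box per frame."""
    boxes: Dict[int, Box]  # frame index -> box

    @property
    def start(self) -> int:
        # First frame in which the queried object is localized.
        return min(self.boxes)

    @property
    def end(self) -> int:
        # Last frame in which the queried object is localized (inclusive).
        return max(self.boxes)


@dataclass
class Query:
    """Hypothetical query record; `form` distinguishes the two sentence types."""
    text: str
    form: str  # "declarative" or "interrogative"


# Made-up examples, not from the dataset.
declarative = Query("A child is kicking a ball.", "declarative")
interrogative = Query("What is kicked by the child?", "interrogative")

# Made-up tube covering frames 10..12 of an untrimmed video.
tube = Tube(boxes={
    10: (5.0, 5.0, 50.0, 80.0),
    11: (6.0, 5.0, 51.0, 80.0),
    12: (7.0, 6.0, 52.0, 81.0),
})
```

The point of the sketch is that the answer is both a temporal segment (`tube.start` to `tube.end`) and a box in every frame of that segment, whichever sentence form is used.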

 

 

 

Methods

Figure 2: The overall architecture of STGRN.

 

In my opinion the method is outdated, so it is not important to study it in detail.

 

The vision branch uses R-CNN to obtain object regions and then builds a spatio-temporal graph over them. This graph is combined with the query embedding through cross-modal fusion, and the resulting graph is used to localize in temporal → spatial order.
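
The flow above can be sketched as a toy NumPy pipeline. This is not STGRN itself, only an illustration of the order of operations under my own assumptions: fusion is an element-wise product with the query, the graph step is a single averaging pass over same-frame regions and adjacent-frame regions in the same slot, the temporal step is a maximum-sum segment search, and all function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def fuse(feats, q):
    # feats: (T, K, D) region features, q: (D,) query embedding.
    # Element-wise product as a stand-in for cross-modal fusion (assumption).
    return feats * q


def message_pass(x):
    # One toy smoothing round over an assumed spatio-temporal graph:
    # spatial edges = all regions in the same frame,
    # temporal edges = same region slot in the previous/next frame.
    spatial = x.mean(axis=1, keepdims=True)
    prev = np.concatenate([x[:1], x[:-1]], axis=0)  # first frame repeats itself
    nxt = np.concatenate([x[1:], x[-1:]], axis=0)   # last frame repeats itself
    return (x + spatial + prev + nxt) / 4.0


def temporal_segment(frame_scores, threshold):
    # Contiguous segment maximizing sum(score - threshold) (Kadane's algorithm).
    # Returns inclusive (start, end) frame indices.
    best_sum, best = -np.inf, (0, 0)
    cur_sum, cur_start = 0.0, 0
    for t, s in enumerate(frame_scores - threshold):
        if cur_sum <= 0:
            cur_sum, cur_start = s, t
        else:
            cur_sum += s
        if cur_sum > best_sum:
            best_sum, best = cur_sum, (cur_start, t)
    return best


def ground(feats, q, threshold=0.5):
    x = message_pass(fuse(feats, q))
    region_scores = x.sum(axis=-1)            # (T, K) relevance per region
    frame_scores = region_scores.max(axis=1)  # (T,) relevance per frame
    # Temporal localization first, then pick one region per frame inside it.
    start, end = temporal_segment(frame_scores, threshold)
    regions = region_scores[start:end + 1].argmax(axis=1)
    return start, end, regions.tolist()


# Made-up input: 6 frames, 2 regions per frame, 2-dim features.
feats = np.zeros((6, 2, 2))
feats[2:5, 1] = 1.0  # region 1 matches the query in frames 2..4
q = np.ones(2)
start, end, regions = ground(feats, q, threshold=0.75)
```

On this made-up input the sketch selects frames 2 to 4 and region 1 in each of them. Chaining the selected region's box across those frames would give the output tube.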

 

Dataset

VidSTG is built on top of the VidOR dataset.

Table 1: Dataset Statistics about the Number of Declarative and Interrogative Sentences.

 

 

Results

Figure 3: An example of the spatiotemporal grounding results.

 

 

 

Discussion

 

 


Jordano