object detection

    Grounded SAM

    Abstract: The paper proposes Grounded SAM, an open-vocabulary segmentation model built from Grounding DINO and SAM. Motivation: Three approaches have been proposed for visual understanding in open-world scenarios. Unified Model approach: UNINEXT, OFA, and similar models, which pretrain on a variety of vision tasks. However, scalability to complex tasks is poor. LLM as Controller method: HuggingGPT, Visual ChatGPT, and LLaVA-Plus, which use an LLM to connect vision concepts. ..

    [Object Detection] DETR

    End-to-End Object Detection with Transformers (arxiv.org): We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor gene.. Abstract: object detection as a direct set p..

    [ZSD] GLIP

    https://arxiv.org/abs/2112.03857 Grounded Language-Image Pre-training (arxiv.org): This paper presents a grounded language-image pre-training (GLIP) model for learning object-level, language-aware, and semantic-rich visual representations. GLIP unifies object detection and phrase grounding for pre-training. The unification brings two ben.. Abstract: Proposes the GLIP model. The object detection task and phrase groundin..