Posts in the 'All Categories' category (Page 10)

End-to-End Object Detection with Transformers: We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor gene (arxiv.org) End-to-End Object Detection with Transformers. Abstract: object detection as a direct set p..

https://arxiv.org/abs/2112.03857 Grounded Language-Image Pre-training: This paper presents a grounded language-image pre-training (GLIP) model for learning object-level, language-aware, and semantic-rich visual representations. GLIP unifies object detection and phrase grounding for pre-training. The unification brings two ben (arxiv.org) Abstract: Proposes the GLIP model. The object detection task and phrase groundin..

https://arxiv.org/abs/2104.14294 Emerging Properties in Self-Supervised Vision Transformers: In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works partic (arxiv.org) Abstract: self-supervised learning gives ViT new..

This post is protected.

Distributed package doesn't have NCCL built in
Traceback (most recent call last):
  File "example_chat_completion.py", line 104, in <module>
    fire.Fire(main)
  File "/home/csjihwanh/Desktop/projects/sggVQA/llama/env/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/csjihwanh/Desktop/projects/sggVQA/llama/env/lib/p..
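This error is commonly raised when torch.distributed is asked for the "nccl" backend on a PyTorch build that was compiled without NCCL (for example a CPU-only build). The excerpt is cut off before any fix is shown, so the following is only a minimal sketch of one possible workaround: pick "nccl" when it is usable and fall back to "gloo" otherwise. The helper names choose_backend and detect_backend are hypothetical and do not come from the post.

```python
def choose_backend(nccl_built: bool, cuda_available: bool) -> str:
    """Return "nccl" only when it can actually be used, otherwise "gloo"."""
    return "nccl" if (nccl_built and cuda_available) else "gloo"


def detect_backend() -> str:
    """Probe the local PyTorch install (hypothetical helper, not from the post)."""
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        # PyTorch is not installed; "gloo" is returned only as a placeholder.
        return "gloo"
    if not dist.is_available():
        # This build has no distributed support at all.
        return "gloo"
    return choose_backend(dist.is_nccl_available(), torch.cuda.is_available())
```

The returned string could then be passed as the backend argument wherever the script calls torch.distributed.init_process_group. Note that "gloo" runs collectives on CPU by default, so the rest of the script may still need adjusting if it assumes CUDA devices.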

* Figure numbers from the original paper are given in parentheses. 1. Motivation: Models such as ChatGPT and GPT4 perform well on complex tasks, yet they often fail on simple ones such as 3-digit multiplication. This paper calls tasks whose answer must be derived through multi-hop reasoning compositional problems, and uses them to examine the structural limitations of the Transformer architecture. To that end, it puts forward two hypotheses. 1. Transformers reduce multi-step compositional reasoning to linearized path matching in order to solve..

Overview: The TPU (Tensor Processing Unit) is an ASIC (Application-Specific Integrated Circuit) built by Google. It is used to accelerate machine learning workloads. Typically, thousands of TPUs are linked together to form a special network called a TPU Pod. For example, a single TPU v4 Pod contains 4096 TPU chips. Because its HBM memory provides high memory bandwidth, it enables efficient training even with large batch sizes and large models. The TPU was first introduced at Google I/O in 2016 and was designed for TensorFlow. Version upgrades have also continued..

This post is protected.

Introduction: Defining a loss for a language model is difficult. A simple token-level difference rarely yields a good loss, and although metrics such as BLEU and ROUGE are used for measurement, they too are computed by simple comparison and are often inaccurate. RLHF (Reinforcement Learning from Human Feedback) addresses this by letting the model learn from human feedback. RLHF is a method used for task alignment, and it is applied to a pretrained LM. The LM must be able to respond to a variety of instructions. Next, a reward model that reflects human preferences is built. Reward Mode..
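The excerpt stops before it says how the reward model is trained. As an illustration only, one widely used formulation is a pairwise (Bradley-Terry style) loss: given the scalar rewards assigned to a human-preferred response and a rejected one, minimize -log(sigmoid(r_chosen - r_rejected)). Whether the post uses this exact loss is an assumption, and the function name pairwise_preference_loss is hypothetical.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Compute -log(sigmoid(reward_chosen - reward_rejected)) in a stable way.

    Assumed formulation (not taken from the post): the loss shrinks as the
    reward model scores the human-preferred response above the rejected one.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)); branch to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))
```

When the two rewards are equal the loss is log 2, it approaches 0 as the preferred response is scored much higher, and it grows roughly linearly when the ranking is reversed.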
