Posts in the 'All Categories' category (Page 11)

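The Visual Genome dataset has many problems. API support was also discontinued in 2023. One of them is that the relationship indices drift. Toward the later images, the relation list obtained from vg_sgg['img_to_first_rel'][img_idx] and vg_sgg['img_to_last_rel'][img_idx] in VG-SGG.h5 stops matching img_idx, and the offset eventually grows to 4. Training did not go well because of this, and I worried about it for a long time. It looks like this: I asked for the two bounding boxes of a single relationship to be drawn, but it is impossible to tell what the boxes were drawn around. The labels are printed as 813158 cup and 813164 plate, respectively...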
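
One way to localize this kind of drift is to check the per-image relation ranges directly. The sketch below assumes the convention used by common scene-graph loaders for VG-SGG.h5: img_to_first_rel and img_to_last_rel are inclusive indices into the relationship arrays, with -1 marking an image that has no relations. The function names and the expected_counts argument (per-image counts taken from an independent source such as the raw annotations) are hypothetical, and the arrays are passed in as plain lists so the check does not depend on how the file was read.

```python
from typing import List, Sequence, Tuple


def rel_range(first: Sequence[int], last: Sequence[int], img_idx: int) -> Tuple[int, int]:
    """Return the half-open [start, stop) relation range for one image.

    Assumes inclusive first/last indices and -1 for "no relations".
    """
    f, l = int(first[img_idx]), int(last[img_idx])
    if f < 0 or l < 0:
        return (0, 0)
    return (f, l + 1)


def find_count_mismatches(
    first: Sequence[int],
    last: Sequence[int],
    expected_counts: Sequence[int],
) -> List[Tuple[int, int, int]]:
    """List (img_idx, count_in_h5, expected_count) wherever the two disagree."""
    mismatches = []
    for img_idx, expected in enumerate(expected_counts):
        start, stop = rel_range(first, last, img_idx)
        if stop - start != expected:
            mismatches.append((img_idx, stop - start, expected))
    return mismatches


# Toy data standing in for the h5 arrays (not real Visual Genome values).
first = [0, 3, -1, 5]
last = [2, 4, -1, 8]
expected_counts = [3, 2, 0, 3]  # hypothetical counts from an independent source
mismatches = find_count_mismatches(first, last, expected_counts)
print(mismatches)
```

The first image index reported is where the offset begins, which narrows down where to compare boxes and labels by hand.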

https://arxiv.org/abs/1809.07041 Exploring Visual Relationship for Image Captioning: It is always well believed that modeling relationships between objects would be helpful for representing and eventually describing an image. Nevertheless, there has not been evidence in support of the idea on image description generation. In this paper, we ... This post reimplements the paper. Semantic Relationship Graph Res4b22 ..

https://arxiv.org/abs/2010.11929 An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to rep... Abstract: In 2021, when this paper came out, ..

https://arxiv.org/abs/1809.07041 Exploring Visual Relationship for Image Captioning: It is always well believed that modeling relationships between objects would be helpful for representing and eventually describing an image. Nevertheless, there has not been evidence in support of the idea on image description generation. In this paper, we ... Abstract: Using an architecture that combines a GCN and an LSTM, sema..

https://arxiv.org/abs/1905.00067 MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing: Existing popular methods for semi-supervised learning with Graph Neural Networks (such as the Graph Convolutional Network) provably cannot learn a general class of neighborhood mixing relationships. To address this weakness, we propose a new model, MixHop, ... Summary ..

Introduction: Minibatch discrimination is one of the methods for addressing mode collapse (the Helvetica scenario). It was proposed by Salimans et al. (2016); see Improved Techniques for Training GANs for details. Unlike other deep learning models, training a GAN is not about finding parameters that lower a cost function. What matters is finding a point where the generator and the discriminator reach a Nash equilibrium. Ordinary gradient descent algorithms therefore often fail to converge. For this reason, the paper above ... various heuristic meth..
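
As a rough illustration of the minibatch discrimination layer described in Salimans et al. (2016): each feature vector f(x_i) of length A is multiplied by a tensor T of shape A x B x C to give a matrix M_i, and the output for row b is o(x_i)_b = sum over j of exp(-||M_i,b - M_j,b||_1), with the sum including j = i. The sketch below is a minimal pure-Python version, not the authors' code; the function name, the nested-list layout, and the toy values of T are assumptions made for illustration. In a real discriminator T would be learned and o(x_i) would be concatenated with f(x_i) before the next layer.

```python
import math
from typing import List

Vector = List[float]


def minibatch_features(feats: List[Vector], T: List[List[List[float]]]) -> List[Vector]:
    """Compute o(x_i) for every sample in the batch.

    feats: n feature vectors f(x_i), each of length A.
    T:     tensor of shape A x B x C, stored as nested lists.
    Returns n vectors of length B.
    """
    A, B, C = len(T), len(T[0]), len(T[0][0])
    # M[i][b][c] = sum_a feats[i][a] * T[a][b][c]
    M = [
        [[sum(f[a] * T[a][b][c] for a in range(A)) for c in range(C)]
         for b in range(B)]
        for f in feats
    ]
    out = []
    for Mi in M:
        o_i = []
        for b in range(B):
            total = 0.0
            for Mj in M:  # includes j == i, which contributes exp(0) = 1
                l1 = sum(abs(Mi[b][c] - Mj[b][c]) for c in range(C))
                total += math.exp(-l1)
            o_i.append(total)
        out.append(o_i)
    return out


# Toy tensor with A=2, B=2, C=3 (illustrative values, not learned weights).
T = [
    [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]],
    [[0.5, 1.0, 0.0], [1.0, 0.5, 0.0]],
]
collapsed_batch = [[0.2, 0.4], [0.2, 0.4], [0.2, 0.4]]
diverse_batch = [[0.0, 0.0], [1.0, 2.0], [-2.0, 1.0]]
collapsed_out = minibatch_features(collapsed_batch, T)
diverse_out = minibatch_features(diverse_batch, T)
```

When every sample in the batch is identical, each output equals the batch size, while a spread-out batch gives smaller values. This is the signal that lets the discriminator notice a collapsed generator.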

Introduction: Mode collapse refers to the problem where the generator maps several different inputs $z$ to the same output point. In my current project, I expected high-variance outputs when feeding the generator $z$ drawn from a Gaussian distribution $N(0,1)$. What I actually got is the following. I could not get images of the quality I was aiming for, and, as Figure 1 shows, the generated images are almost identical. The Gaussian distribution used during training has mean = 0 and std = 0.02, and since this generated image uses $N(0,1)$ ..
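
A simple way to put a number on "almost identical outputs" is to sample several $z$ and measure how far apart the generated outputs are. The sketch below is only an illustration under stated assumptions: the two toy generators are hypothetical stand-ins for a trained model, the outputs are flat lists of floats, and the mean pairwise Euclidean distance is one possible diversity score among many. The std argument makes it possible to compare the two latent scales mentioned above.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def mean_pairwise_distance(samples: List[Vector]) -> float:
    """Average Euclidean distance over all unordered pairs of samples."""
    n = len(samples)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.dist(samples[i], samples[j])
    return total / (n * (n - 1) / 2)


def diversity(generator: Callable[[Vector], Vector], n: int, z_dim: int,
              std: float = 1.0, seed: int = 0) -> float:
    """Sample n latent vectors z ~ N(0, std^2) and score the output spread."""
    rng = random.Random(seed)
    zs = [[rng.gauss(0.0, std) for _ in range(z_dim)] for _ in range(n)]
    return mean_pairwise_distance([generator(z) for z in zs])


def collapsed_generator(z: Vector) -> Vector:
    """Hypothetical collapsed generator: ignores z entirely."""
    return [0.5, 0.5, 0.5]


def responsive_generator(z: Vector) -> Vector:
    """Hypothetical generator whose output still depends on z."""
    return [math.tanh(v) for v in z]


collapsed_score = diversity(collapsed_generator, n=16, z_dim=3)
responsive_score = diversity(responsive_generator, n=16, z_dim=3)
narrow_score = diversity(responsive_generator, n=16, z_dim=3, std=0.02)
```

A score near zero for $z \sim N(0,1)$ points to collapse, and a much smaller score at std = 0.02 than at std = 1 shows how strongly the latent scale alone changes the spread of the outputs.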

