[2019 CVPR] Patch-based Discriminative Feature Learning for Unsupervised Person Re-identification

Table of content (short-version) [paper] [github]

Summary
References

Summary

Patch-based + feature-learning + unsupervised 방식
- Part-based랑 다른 점은 영상을 patch개념으로 비교
- PatchNet: patch 추출한 뒤에 unsupervised 방식으로 밀고 당기는 네트워크

[Patch-based 컨셉]

전체 프레임워크
- backbone에서 feature를 뽑은 다음에 sampler를 통해 patch generation.
- 패치마다 feature가 추출
- Patch에 적용되는 PEDAL loss와 patch feature를 합친 image feature에 적용되는 IPFL loss 존재
- PEDAL loss: 패치 단위로 비슷한 patch는 당기고 다른 patch는 미는 방식
- IPFL loss: 영상 단위로 real sample과 surrogate positive sample은 당기고 hard negative sample은 미는 방식

[전체 프레임워크]

Patch generation network
- Spatial transformation network를 이용해서 feature-level에서 sampling
- Horizontal initialization한 뒤에 나중에 적응형
Patch-based discriminative feature learning
- 아래 그림에서 처럼 image-level로 하면 한계점이 존재
- 목표: patch 자체가 얼마나 discriminative 한지 판단하고 patch 마다 다른 특성이 들어가야 해서 다른 후처리를 적용
- Loss: 패치 단위로 비슷한 patch는 당기고 다른 patch는 미는 방식
  - Patch feature memory L2 loss 이용해서 m번째 patch 들끼리 비교
  - 줄을 세운 다음에 L2 distance가 가장 작은 K를 뽑음
  - Feature랑 dimension 같은 memory bank w를 이용 (memory bank는 영상/patch 마다 있는 것)
  - Memory bank 사용하는 이유: Mini-batch마다 PEDAL loss를 주는게 아니고, 안정성있게 training set 전체를 보는 memory를 사용하기 위해서

[Patch-based discriminative learning]

Image-level patch feature learning
- Image-level 가이드를 주는 loss이며 cyclic ranking은 triplet loss랑 비슷한데, P랑 N 어떻게 뽑는지가 핵심
- P는 원본 이미지를 transform해서, N은 re-ranking이랑 비슷하게

[Cyclic ranking]

Implementation
- 영상당 패치를 4/6/8개 사용
- Batch는 40개의 real sample 40개의 positive sample
- Labeled MSMT17에서 PatchNet pretrained

References

[1] Yang, Qize, et al. “Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.