Python Image Recognition Notes (24): DETR (End-to-End Object Detection with Transformers)

Yanwei Liu
Nov 30, 2020

Code

Notebook

detr_demo.ipynb
- shows how to define the model, load pretrained weights, and visualize bounding-box and class predictions.
detr_hands_on.ipynb
- uses the pre-trained models provided by the DETR authors to make predictions.
- visualizes the model's attention to give insight into how it sees the images.
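Drawing predicted boxes on an image usually needs a coordinate conversion first. The sketch below assumes the model returns each box as normalized (center x, center y, width, height) values in [0, 1], which is how DETR's box output is generally described. The function name and the pure-Python form are illustrative and not copied from the notebooks.

```python
def box_cxcywh_to_xyxy(box, img_w, img_h):
    """Convert one normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    Assumption: all four input values are fractions of the image size.
    """
    cx, cy, w, h = box
    x0 = (cx - 0.5 * w) * img_w  # left edge in pixels
    y0 = (cy - 0.5 * h) * img_h  # top edge in pixels
    x1 = (cx + 0.5 * w) * img_w  # right edge in pixels
    y1 = (cy + 0.5 * h) * img_h  # bottom edge in pixels
    return (x0, y0, x1, y1)


# A box centered in a 640x480 image, covering half of each dimension.
print(box_cxcywh_to_xyxy((0.5, 0.5, 0.5, 0.5), 640, 480))
```

The resulting corner coordinates can be passed directly to a plotting call that draws rectangles.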

Paper

Preparing the COCO dataset

The dataset can be downloaded with get_coco.sh, but you must arrange the data directories yourself in the following layout:

path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
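Before starting a long training run, it can help to confirm that the directory layout above is in place. The following is a minimal sketch using only the standard library. The helper name `missing_coco_dirs` is hypothetical and not part of the DETR repository.

```python
from pathlib import Path

# Sub-directories expected under the COCO root, per the layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(coco_root):
    """Return the expected sub-directories that do not exist under coco_root."""
    root = Path(coco_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


# Replace the placeholder with your own dataset path.
missing = missing_coco_dirs("path/to/coco")
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("COCO layout looks complete.")
```

An empty result only means the three folders exist. It does not check that the images or annotation files inside them are complete.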

Usage

This post assumes you have already installed PyTorch, TorchVision, Cython, and SciPy, and that you only run object detection, not segmentation.

git clone https://github.com/facebookresearch/detr.git
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
cd detr

# The number 4 is the number of GPUs to use; set /path/to/coco to the path of your own COCO dataset.
# Training
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --coco_path /path/to/coco

# The authors report that each epoch takes about 28 minutes, so the default 300 epochs need roughly 6 days of training on 8 V100 GPUs.

# Evaluate AP
python main.py --batch_size 2 --no_aux_loss --eval --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --coco_path /path/to/coco
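The six-day estimate follows directly from the quoted 28 minutes per epoch and the default 300 epochs. The short calculation below reproduces it, assuming every epoch takes the same time and ignoring evaluation and checkpointing overhead. The variable names are illustrative.

```python
MINUTES_PER_EPOCH = 28  # per-epoch time quoted for 8 V100 GPUs
EPOCHS = 300            # default number of training epochs

total_minutes = MINUTES_PER_EPOCH * EPOCHS  # 8400 minutes
total_hours = total_minutes / 60            # 140 hours
total_days = total_hours / 24               # just under 6 days

print(f"{total_hours:.0f} hours, about {total_days:.1f} days")
```

With fewer GPUs, as in the 4-GPU command in this post, the per-epoch time will differ, so measure one epoch on your own hardware and substitute that value.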
Commands for evaluating different models
