Python Image Recognition Notes (24): DETR (End-to-End Object Detection with Transformers)

- How to define the model, load pretrained weights, and visualize bounding box and class predictions.
- Use the officially provided pre-trained models to make predictions.
- Visualize the model's attention to gain insight into how it sees the images.
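
To make the "visualize bounding box and class predictions" step more concrete, here is a minimal pure-Python sketch of the post-processing it typically relies on. It assumes that each query yields a list of class logits whose last entry is a "no object" class, plus a box in normalized (cx, cy, w, h) form, as described in the DETR paper. The function names `softmax`, `cxcywh_to_xyxy`, and `keep_confident`, the 0.7 threshold, and the sample numbers are illustrative choices, not part of the DETR code base.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cxcywh_to_xyxy(box, img_w, img_h):
    """Map a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


def keep_confident(all_logits, all_boxes, img_w, img_h, threshold=0.7):
    """Return (class_index, score, pixel_box) for confident queries.

    Assumption: the last logit of each query is the "no object" class,
    so it is excluded when picking the best class.
    """
    results = []
    for logits, box in zip(all_logits, all_boxes):
        probs = softmax(logits)[:-1]  # drop the "no object" entry
        score = max(probs)
        if score > threshold:
            pixel_box = cxcywh_to_xyxy(box, img_w, img_h)
            results.append((probs.index(score), score, pixel_box))
    return results


# Two hypothetical queries over 2 real classes + "no object".
logits = [[0.1, 5.0, 0.2], [0.0, 0.1, 4.0]]
boxes = [(0.5, 0.5, 0.2, 0.4), (0.1, 0.1, 0.1, 0.1)]
detections = keep_confident(logits, boxes, img_w=640, img_h=480)
```

Only the first query survives the threshold here, because the second one puts most of its probability on "no object". The resulting pixel boxes are what a plotting library would draw on top of the image.
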

The COCO dataset directory (/path/to/coco) is expected to contain the following sub-directories:

  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
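
Before starting a long training run, it can help to confirm that the dataset directory really has this layout. The following is a small sketch using only the standard library. The helper name `missing_coco_dirs` is hypothetical, and `/path/to/coco` is the placeholder path used in this article.

```python
from pathlib import Path

# Sub-directories listed in the layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(coco_root):
    """Return the expected sub-directories that are absent under coco_root."""
    root = Path(coco_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Replace the placeholder with your own COCO dataset path.
    missing = missing_coco_dirs("/path/to/coco")
    print("missing:", missing if missing else "nothing")
```

An empty list means all three sub-directories exist. The check does not verify the contents of the annotation files or images.
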

This article assumes that you have already installed PyTorch, TorchVision, Cython, and SciPy, and that you only run object detection, not segmentation.

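git clone <DETR repository URL>
pip install -U 'git+<package URL>'
cd detr

# The 4 below is the number of GPUs to use; set /path/to/coco to the path of your own COCO dataset.
# Training
python -m torch.distributed.launch --nproc_per_node=4 --use_env <script> --coco_path /path/to/coco
# The authors report about 28 minutes per epoch; with the default 300 epochs on 8 V100 GPUs, training takes roughly 6 days.

# Evaluate AP
python <script> --batch_size 2 --no_aux_loss --eval --resume <checkpoint> --coco_path /path/to/coco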
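
The "roughly 6 days" figure follows directly from the per-epoch time quoted above. As a quick sanity check, the arithmetic can be sketched as follows. The function name `total_training_days` is illustrative, and the 28-minute figure applies to the 8 V100 setup mentioned above, so a run on 4 GPUs will likely take longer.

```python
def total_training_days(minutes_per_epoch, epochs):
    """Total wall-clock training time in days."""
    return minutes_per_epoch * epochs / 60 / 24


days = total_training_days(minutes_per_epoch=28, epochs=300)
print(f"{days:.2f} days")  # 5.83 days
```
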



