PyTorch Troubleshooting in Practice: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 1]], which is output 0 of TBackward, is at version 4; expected version 3 instead.

1. Segmentation fault (core dumped)

# Test one GPU at a time, in order: 0, 1, 2, then 3
CUDA_VISIBLE_DEVICES=0 python
import torch; torch.cuda.current_device()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.rand(10).to(device)
# Result: GPU 3 triggers "Segmentation fault (core dumped)", so I stopped using GPU 3
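The manual check above can also be scripted. The following is a minimal sketch, assuming four GPUs indexed 0 to 3 as in this post; the helper names `probe_gpu` and `describe` are hypothetical. It runs the same three statements in a child process per GPU, so a crash kills only the child. On POSIX systems `subprocess` reports a negative return code when the child is killed by a signal such as SIGSEGV.

```python
import os
import signal
import subprocess
import sys

# The same statements as the interactive session above.
PROBE = (
    "import torch\n"
    "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n"
    "torch.rand(10).to(device)\n"
)


def probe_gpu(index, snippet=PROBE):
    """Run `snippet` in a child Python process that can only see GPU `index`."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(index))
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode


def describe(returncode):
    """Turn a child return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with code {returncode}"


if __name__ == "__main__":
    for index in range(4):  # assumption: four GPUs, as in this post
        print(f"GPU {index}: {describe(probe_gpu(index))}")
```

A GPU that reports `killed by SIGSEGV` can then be excluded by leaving its index out of `CUDA_VISIBLE_DEVICES`.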
https://github.com/pytorch/pytorch/issues/926#issuecomment-284239111
sudo apt update
sudo apt install gdb
gdb python
r main.py --cfg ./config/Phison/feat_uniform_resnet18.yaml
where
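The `where` command in gdb prints the native (C/C++) backtrace. As a complement, and not something the original session used, Python's standard-library `faulthandler` module can dump the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. A minimal sketch; the helper name `enable_crash_traces` is hypothetical.

```python
import faulthandler
import sys


def enable_crash_traces(stream=sys.stderr):
    """Ask Python to dump tracebacks of all threads to `stream` on a fatal signal.

    `stream` must be a real file object (it needs a file descriptor).
    """
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling `enable_crash_traces()` at the top of `main.py` would be one way to use it. Running `python -X faulthandler main.py --cfg ./config/Phison/feat_uniform_resnet18.yaml` enables the same handler without editing the script.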

2. The official repo specifies PyTorch 1.4, but I trained with 1.7.1

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 1]], which is output 0 of TBackward, is at version 4; expected version 3 instead.
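One commonly reported way for this error to appear only on newer versions: since PyTorch 1.5, the in-place parameter updates made by `optimizer.step()` are caught by autograd's version check, so calling `backward()` on a graph built before the step raises instead of silently using the updated weights. Whether that is what happened in this project is an assumption; the sketch below only reproduces the pattern on a toy `Linear(6, 1)` layer, and the function names are hypothetical. The exact wording after "inplace operation" varies by version.

```python
import torch


def step_then_backward():
    """Reuse a graph after optimizer.step(); returns the error text, or None."""
    torch.manual_seed(0)
    layer = torch.nn.Linear(6, 1)
    optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
    h = torch.randn(4, 6, requires_grad=True)  # stands in for an upstream activation

    out = layer(h)                         # the graph saves the current weight
    out.sum().backward(retain_graph=True)
    optimizer.step()                       # updates the weight in place
    try:
        out.mean().backward()              # needs the weight as it was before the step
    except RuntimeError as err:
        return str(err)
    return None


def backward_then_step():
    """Same losses, but every backward pass finishes before the step."""
    torch.manual_seed(0)
    layer = torch.nn.Linear(6, 1)
    optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
    h = torch.randn(4, 6, requires_grad=True)

    out = layer(h)
    (out.sum() + out.mean()).backward()
    optimizer.step()
    return tuple(layer.weight.grad.shape)
```

Reordering the code so that all backward passes run before any `step()` is an alternative to pinning the older version, at the cost of touching the training loop.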
pip install torch==1.4.0+cu92 torchvision==0.5.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html
pip install higher==0.2.1
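After reinstalling, it can be worth confirming that the environment really holds the pinned versions before starting a long training run. A minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are hypothetical, and the pins are the ones from the commands above.

```python
from importlib import metadata

# Versions taken from the pip commands above.
PINS = {"torch": "1.4.0", "torchvision": "0.5.0", "higher": "0.2.1"}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned):
    """Compare versions while ignoring a local build suffix such as '+cu92'."""
    if installed is None:
        return False
    return installed.split("+")[0] == pinned.split("+")[0]


if __name__ == "__main__":
    for package, pinned in PINS.items():
        found = installed_version(package)
        status = "ok" if matches_pin(found, pinned) else "MISMATCH"
        print(f"{package}: installed={found} pinned={pinned} {status}")
```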
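Excerpted from "113. [torch] Avoiding in-place operations in backpropagation" (113. 【torch】反向传播弃inplace操作):
1) Downgrade torch to 0.3.0 (did not work)
2) Wherever inplace is True, change it to False, e.g. drop()
3) Remove all in-place operations
4) Replace operators such as "-=" and "+=", writing the result to a new tensor instead of updating a in place:
a -= c
==>
b = a.clone() # copy the tensor
a = b - c
Also avoid cases such as calling a.operate(**) without assigning the result.
Author: 十里江城
Link: https://www.jianshu.com/p/9fb0e354c278
Source: Jianshu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, please credit the source.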
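The "-=" advice quoted above can be seen on a three-element tensor. This is a minimal sketch, not code from the project in this post, and the function names are hypothetical. `exp()` keeps its output for the backward pass, so modifying that output in place invalidates it, while the out-of-place form leaves it untouched.

```python
import torch


def backward_with_inplace():
    """Modify a tensor that autograd saved for backward; returns the error text."""
    a = torch.ones(3, requires_grad=True)
    b = a.exp()   # exp() saves its result for the backward pass
    b -= 1        # in-place update bumps the saved tensor's version counter
    try:
        b.sum().backward()
    except RuntimeError as err:
        return str(err)
    return None


def backward_out_of_place():
    """Same computation with an out-of-place subtraction; returns a.grad."""
    a = torch.ones(3, requires_grad=True)
    b = a.exp()
    b = b - 1     # creates a new tensor; the saved exp() output is unchanged
    b.sum().backward()
    return a.grad
```

The first function returns a message beginning with "one of the variables needed for gradient computation has been modified by an inplace operation". The second returns the gradient of `exp(a) - 1` at `a = 1`, which is `exp(1)` for each element.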


Machine Learning | Deep Learning | https://linktr.ee/yanwei
