How to Make PyTorch Training Results Reproducible

Yanwei Liu
Jul 16, 2021

Training the same model several times can give different results from run to run. This article lists the settings needed to make training results reproducible.

1. Set the seed before training the model

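import random

import numpy as np
import torch


def set_seed(seed=42, loader=None):
    # Seed the Python, NumPy, and PyTorch (CPU) random number generators
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed the current GPU and every other visible GPU
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    # Disable cuDNN autotuning and request deterministic cuDNN algorithms
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # Reseed the DataLoader sampler's generator, if the loader has one
    try:
        loader.sampler.generator.manual_seed(seed)
    except AttributeError:
        pass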
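
One quick way to check that the seeding works is to seed twice with the same value and compare the numbers drawn afterwards. The snippet below is a minimal sketch, not part of the original recipe: it uses a simplified, CPU-only copy of set_seed, and the helper name draw_samples is hypothetical.

```python
import random

import numpy as np
import torch


def set_seed(seed=42):
    # Simplified CPU-only variant of the set_seed function above
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def draw_samples(seed):
    # Hypothetical helper: seed everything, then draw from each library
    set_seed(seed)
    return random.random(), np.random.rand(3), torch.rand(3)


a = draw_samples(42)
b = draw_samples(42)
c = draw_samples(7)

# Same seed: all three libraries should reproduce the same values
same = a[0] == b[0] and np.array_equal(a[1], b[1]) and torch.equal(a[2], b[2])
# Different seed: the PyTorch draw should differ
differs = not torch.equal(a[2], c[2])
print(same, differs)
```

If `same` comes out false in a real project, some random number generator is probably being used before set_seed is called, or one of the libraries was left unseeded.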

2. DataLoader

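import random

import numpy as np
import torch
from torch.utils.data import DataLoader


def seed_worker(worker_id):
    # Derive a per-worker seed from PyTorch's base seed for NumPy and random
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


# Generator that fixes the shuffling order and the workers' base seed
g = torch.Generator()
g.manual_seed(0)

# train_dataset, batch_size, and num_workers are defined elsewhere
train_loader = DataLoader(
    train_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    worker_init_fn=seed_worker,
    generator=g,
)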
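
To see the effect of the generator argument, the following sketch builds the same shuffled DataLoader twice and compares the batch order. It rests on assumptions that are not in the original: a toy TensorDataset of ten integers, shuffle=True, the default num_workers=0 (so no worker_init_fn is needed), and a hypothetical helper named epoch_order.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def epoch_order(seed):
    # Hypothetical helper: return the batches of one epoch as lists of ints
    dataset = TensorDataset(torch.arange(10))  # toy data for illustration
    g = torch.Generator()
    g.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=g)
    return [batch[0].tolist() for batch in loader]


order_a = epoch_order(0)
order_b = epoch_order(0)

# The same generator seed should yield the same shuffling order
print(order_a == order_b)
```

Note that reusing one generator across epochs advances its state, so each epoch is shuffled differently while the overall sequence of epochs stays the same from run to run.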

References

Reproducibility — PyTorch 1.9.0 documentation

Random Seeds and Reproducibility. | by Daniel Godoy | May, 2022 | Towards Data Science
