How to make PyTorch training results reproducible?
Jul 16, 2021
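When training a model, repeated runs of the same code sometimes produce noticeably different results. This post lists the settings that make training results reproducible.
1. Set the seed before training the model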
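import random

import numpy as np
import torch


def set_seed(seed=42, loader=None):
    # Seed the Python, NumPy, and PyTorch random number generators.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    # Disable cuDNN autotuning and request deterministic cuDNN kernels.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # If a DataLoader is passed, reseed its sampler's generator when it has one.
    try:
        loader.sampler.generator.manual_seed(seed)
    except AttributeError:
        pass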
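A quick way to check that seeding behaves as expected is to reseed and compare the random draws. The following is a minimal sketch, not part of the original post: the helper name draw_samples is hypothetical, and it only covers the CPU generators of Python, NumPy, and PyTorch.

```python
import random

import numpy as np
import torch


def draw_samples(seed):
    # Hypothetical helper: reseed all three generators, then draw one value from each.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    return random.random(), float(np.random.rand()), torch.rand(3).tolist()


first = draw_samples(42)
second = draw_samples(42)
print(first == second)  # prints True: the same seed gives the same draws
```

If the comparison ever fails, some source of randomness is probably not covered by the seeds set here.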
2. DataLoader
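import random

import numpy as np
import torch
from torch.utils.data import DataLoader


def seed_worker(worker_id):
    # Derive each worker's NumPy and Python seeds from the seed PyTorch assigned to it.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


# A dedicated generator fixes the random state the DataLoader draws from.
g = torch.Generator()
g.manual_seed(0)

# train_dataset, batch_size, and num_workers are defined elsewhere.
DataLoader(
    train_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    worker_init_fn=seed_worker,
    generator=g,
)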
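The effect of the generator argument is easiest to see with shuffling turned on. The sketch below is an illustration under assumptions, not part of the original post: the toy TensorDataset, the helper name epoch_order, and shuffle=True are all introduced here, and it runs in the main process (num_workers left at its default of 0) to stay self-contained.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def epoch_order(seed):
    # Hypothetical toy dataset holding the integers 0..9.
    dataset = TensorDataset(torch.arange(10))
    g = torch.Generator()
    g.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=g)
    # Flatten the batches into the order in which samples were served.
    return [int(x) for (batch,) in loader for x in batch]


order_a = epoch_order(0)
order_b = epoch_order(0)
print(order_a == order_b)  # prints True: the same generator seed gives the same shuffle order
```

Each call builds a fresh generator with the same seed, so the first epoch's sample order matches across calls.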
References
Reproducibility — PyTorch 1.9.0 documentation
Random Seeds and Reproducibility. | by Daniel Godoy | May, 2022 | Towards Data Science