TL;DR:

Use both. They do different things and have different scopes.

- `with torch.no_grad()` disables gradient tracking in autograd.
- `model.eval()` changes the `forward()` behaviour of the module it is called on, e.g. it disables dropout and makes batch norm use its running (population) statistics.
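Put together, a minimal sketch of what this looks like at inference time (the `Sequential` model and random `inputs` below are made-up stand-ins for your own module and data):

```python
import torch

# Hypothetical two-layer model with dropout, just to make the sketch
# self-contained; substitute your own module and data.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(8, 2),
)
inputs = torch.randn(3, 4)

model.eval()                  # switch Dropout/BatchNorm to evaluation behaviour
with torch.no_grad():         # stop autograd from recording the computation
    outputs = model(inputs)

print(outputs.requires_grad)  # False: no graph was built for this computation
```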
`with torch.no_grad()`

The `torch.autograd.no_grad` documentation says:

> Context-manager that disabled [sic] gradient calculation.
>
> Disabling gradient calculation is useful for inference, when you are sure that you will not call `Tensor.backward()`. It will reduce memory consumption for computations that would otherwise have `requires_grad=True`. In this mode, the result of every computation will have `requires_grad=False`, even when the inputs have `requires_grad=True`.
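A short self-contained demonstration of that last sentence:

```python
import torch

x = torch.ones(3, requires_grad=True)

y = x * 2
print(y.requires_grad)  # True: autograd recorded this multiplication

with torch.no_grad():
    z = x * 2
print(z.requires_grad)  # False: nothing was recorded inside the context
```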
`model.eval()`

The `nn.Module.eval` documentation says:

> Sets the module in evaluation mode.
>
> This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. `Dropout`, `BatchNorm`, etc.
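For instance, `Dropout` behaves differently under the two modes (a minimal sketch with an arbitrary `p=0.5` layer and an all-ones input):

```python
import torch

drop = torch.nn.Dropout(p=0.5)
x = torch.ones(1, 6)

drop.train()    # training mode: zeroes each element with probability p,
print(drop(x))  # and scales survivors by 1/(1-p), so entries are 0. or 2.

drop.eval()     # evaluation mode: dropout becomes the identity
print(drop(x))  # tensor([[1., 1., 1., 1., 1., 1.]])
```

Note that `eval()` only changes module behaviour; it does not stop autograd from building a graph, which is why you still want `torch.no_grad()` as well.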
The creator of PyTorch has said the documentation should be updated to suggest using both, and I have raised a pull request for that.