Why model.enable_input_require_grads()?


What happens when using LoRA?

It starts with the error

```
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
```

which appears when some of the parameters are trainable while the others (in particular those at the very beginning of the model, such as the input embeddings) are frozen.

What PEFT's get_peft_model() does is freeze the model's pre-trained parameters and add LoRA adapters on top; by default, the embedding layer and the causal language model head are left frozen as well. That is essentially the root cause.
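As a quick sanity check, here is a minimal sketch (the "gpt2" checkpoint and the target_modules choice are just placeholders) showing that the embedding really is frozen after wrapping:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Any causal LM works here; "gpt2" is just a small placeholder checkpoint.
model = AutoModelForCausalLM.from_pretrained("gpt2")
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused QKV projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# The pre-trained weights, including the input embeddings, stay frozen:
print(model.get_input_embeddings().weight.requires_grad)  # False
model.print_trainable_parameters()  # only the LoRA weights are trainable
```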

Since the embedding is frozen, the hidden states it produces do not require grad, so PyTorch's computational graph lacks the gradient path it needs to complete backpropagation. In practice this bites most often when gradient checkpointing is enabled, because the checkpointed blocks then receive inputs with no grad_fn.

See the GitHub issue: https://github.com/huggingface/peft/issues/137
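Under the hood, transformers implements the fix with a forward hook on the input embedding layer that forces its output to require grad. Here is a minimal sketch of that idea; the function name is mine, not a library API:

```python
import torch

def make_embedding_output_require_grad(model: torch.nn.Module) -> None:
    """Sketch of the trick behind enable_input_require_grads(): hook the
    input embeddings so their *output* requires grad, giving autograd a
    starting point even though the embedding weights stay frozen."""
    def hook(module, inputs, output):
        output.requires_grad_(True)

    model.get_input_embeddings().register_forward_hook(hook)
```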

Solution

The solution is simple: Hugging Face already provides a method that makes the inputs require gradients.

```python
model.enable_input_require_grads()
```

Call it before training.
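Putting it together, a typical LoRA setup might look like the sketch below; the gradient-checkpointing line reflects the common scenario that triggers the error, and the checkpoint name is again a placeholder:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")
model.gradient_checkpointing_enable()  # the common trigger for the RuntimeError
model.enable_input_require_grads()     # the fix: call it before training starts
model = get_peft_model(
    model, LoraConfig(target_modules=["c_attn"], task_type="CAUSAL_LM")
)

# ... then hand `model` to your Trainer / training loop as usual.
```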

