Low-rank adaptation (LoRA, Hu et al. 2021).

Low-rank adaptation is a parameter-efficient fine-tuning strategy for large pretrained models. It works by keeping each pretrained full-rank weight matrix frozen and adding a learnable update expressed as the product of two low-rank matrices, so only a small number of new parameters are trained.
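The idea can be sketched in plain NumPy (a minimal illustration, not the Penzai implementation; all names here are hypothetical). The frozen weight W is augmented with a rank-r update A @ B, and initializing B to zeros means the adapted layer initially reproduces the pretrained layer exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 8, 2

# Frozen pretrained weight: never updated during fine-tuning.
W = rng.normal(size=(d_in, d_out))

# Trainable low-rank factors. A is small random, B starts at zero,
# so W + A @ B == W at initialization.
A = rng.normal(size=(d_in, rank)) * 0.01
B = np.zeros((rank, d_out))

def lora_linear(x):
    # Pretrained output plus the low-rank correction x @ (A @ B),
    # computed as (x @ A) @ B to stay cheap.
    return x @ W + (x @ A) @ B

x = rng.normal(size=(4, d_in))
# With B zero, the LoRA layer matches the frozen layer exactly.
assert np.allclose(lora_linear(x), x @ W)
```

Only A and B (here d_in * rank + rank * d_out = 48 values) are trained, versus d_in * d_out = 128 for the full matrix; the savings grow with layer size.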

This implementation is based on the tutorial “LoRA From Scratch”. Because of the design conventions of Penzai neural networks, it is straightforward to substitute LoRA blocks into any model that uses the pz.nn.Linear primitive layer.

See https://arxiv.org/abs/2106.09685 for details on LoRA.

A LoRA parameter-efficient adaptation block, replacing a Linear layer.


loraify_linears_in_selection(selection, rank)

Replaces all pz.nn.Linear layers inside the selected part of a model with LoRA blocks of the given rank.