model_rewiring
Helper classes for rewiring, ablating, and intervening on model activations.
These helpers are intended to be inserted into a model to enable analysis of the causal impact of different model components. For instance, they can be used to ablate attention heads, to implement activation patching, or to linearize parts of a model for easier comparisons.
For an example of how to use these components, see the induction heads tutorial notebook.
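As a rough illustration of what activation patching means, the sketch below uses a toy two-stage function in plain Python. It does not use this module's classes. Every name in it (`hidden`, `readout`, `run`, `patch`) is hypothetical and chosen only for the example. The idea is to run the model on a clean and a corrupted input, copy one intermediate activation from the clean run into the corrupted run, and see how much of the clean output is recovered.

```python
# Toy illustration of activation patching. All names are hypothetical
# and are not part of model_rewiring.

def hidden(x):
    """First stage: produces the intermediate activation."""
    return [2.0 * v + 1.0 for v in x]


def readout(h):
    """Second stage: consumes the activation and returns a scalar."""
    return sum(v * v for v in h)


def run(x, patch=None):
    """Run the toy model, optionally overwriting activation entries.

    `patch` maps an index in the hidden activation to a replacement value.
    """
    h = hidden(x)
    if patch:
        h = [patch.get(i, v) for i, v in enumerate(h)]
    return readout(h)


clean = [1.0, 2.0, 3.0]
corrupted = [1.0, 0.0, 3.0]  # differs from `clean` only at position 1

clean_h = hidden(clean)
clean_out = run(clean)
corrupted_out = run(corrupted)

# Patch position 1 of the corrupted run with the clean activation.
patched_out = run(corrupted, patch={1: clean_h[1]})

# Patch each position in turn and record the change in the output.
effect = {
    i: run(corrupted, patch={i: clean_h[i]}) - corrupted_out
    for i in range(len(clean_h))
}

print(clean_out, corrupted_out, patched_out)
print(effect)
```

In this toy case only position 1 has a nonzero effect, because it is the only place where the two inputs differ. Patching it restores the clean output exactly.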
Classes
A connection between two parallel computations.
Layer that redirects masked-out heads to attend to the
Linearizes and evaluates a model around two adjusted inputs.
Rewires computation across parallel model runs along a worlds axis.