KnockOutAttentionHeads
- class penzai.deprecated.v1.toolshed.model_rewiring.KnockOutAttentionHeads
Bases: Layer

Layer that redirects masked-out heads to attend to the <BOS> token.

This layer can be inserted into a transformer model's attention layer immediately after the softmax operation, in order to ablate a subset of the attention heads. It assumes that a reasonable "default" behavior for the head is to attend to the <BOS> token, which is common for many attention heads. (This ablation may be less effective for heads that never attend to BOS.)

- Variables:
head_mask (pz.nx.NamedArray) – NamedArray with 1s for heads we want to keep, and 0s for heads that should be rewritten to point to BOS. Values between 0 and 1 will smoothly interpolate between them.
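
The interpolation described for head_mask can be illustrated with a minimal sketch in plain JAX. This is not the penzai implementation: the function name knock_out_heads, the positional array layout [heads, q_seq, kv_seq], and the choice of key/value position 0 as the <BOS> token are all assumptions made for illustration, and the real layer operates on named axes instead.

```python
import jax.numpy as jnp


def knock_out_heads(attn_weights, head_mask):
    """Blend post-softmax attention weights toward a one-hot <BOS> pattern.

    Assumed layout (hypothetical, not taken from the penzai source):
      attn_weights: [heads, q_seq, kv_seq], each row sums to 1
      head_mask:    [heads], 1.0 = keep head, 0.0 = redirect to <BOS>
    <BOS> is assumed to sit at key/value position 0.
    """
    kv_len = attn_weights.shape[-1]
    # One-hot distribution that puts all attention on position 0.
    bos_only = jnp.zeros((kv_len,), attn_weights.dtype).at[0].set(1.0)
    # Broadcast the per-head mask over the query and key/value axes.
    mask = head_mask[:, None, None]
    # Convex combination: mask=1 keeps the head, mask=0 replaces it.
    return mask * attn_weights + (1.0 - mask) * bos_only


# Two heads, three positions, uniform attention everywhere.
attn = jnp.full((2, 3, 3), 1.0 / 3.0)
mask = jnp.array([1.0, 0.0])  # keep head 0, knock out head 1
ablated = knock_out_heads(attn, mask)
```

Because the result is a convex combination of two distributions over the key/value axis, each row of the output still sums to 1 for any mask value between 0 and 1.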
Methods
__init__(head_mask)
__call__(attn_weights)
Attributes
head_mask
Inherited Methods
attributes_dict() – Constructs a dictionary with all of the fields in the class.
from_attributes(**field_values) – Directly instantiates a struct given all of its fields.
input_structure() – Returns the input structure of this layer.
key_for_field(field_name) – Generates a JAX PyTree key for a given field name.
output_structure() – Returns the output structure of this layer.
select() – Wraps this struct in a selection, enabling functional-style mutations.
tree_flatten() – Flattens this tree node.
tree_flatten_with_keys() – Flattens this tree node with keys.
tree_unflatten(aux_data, children) – Unflattens this tree node.
treescope_color() – Computes a CSS color to display for this object in treescope.