GemmaAttention

class penzai.example_models.gemma.model_core.GemmaAttention

Bases: Attention

Gemma-specific configuration of the self-attention layer.

GemmaAttention has the same runtime behavior as the base pz.nn.Attention combinator, but adds a classmethod that constructs the layer according to the Gemma architecture.
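As a minimal sketch of how construction via that classmethod might look: the configuration values below are hypothetical, and the exact set of GemmaTransformerConfig fields shown is an assumption rather than a verified listing.

    import jax.numpy as jnp

    from penzai.example_models.gemma import model_core

    # Hypothetical small configuration. The field names below are assumptions
    # about GemmaTransformerConfig and may not match the real dataclass exactly.
    config = model_core.GemmaTransformerConfig(
        num_heads=4,
        embedding_dim=256,
        projection_dim=64,
        single_kv_head=False,
        mlp_hidden_dim=1024,
        num_decoder_blocks=2,
        vocab_size=1000,
        parameter_dtype=jnp.float32,
        activation_dtype=jnp.float32,
    )

    # Build the attention block. Its parameters are declared but uninitialized.
    attn = model_core.GemmaAttention.from_config(config)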

Methods

__init__(input_to_query, input_to_key, ...)

from_config(config)
    Builds a GemmaAttention block from a configuration.

Attributes

input_to_query

input_to_key

input_to_value

query_key_to_attn

attn_value_to_output
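Taken together, these five sublayers carry the standard attention dataflow: the input is projected into queries, keys, and values; queries and keys are combined into attention weights; and the weights are applied to the values to produce the output. A rough sketch of that flow, based only on the attribute names (the 2-tuple calling convention shown for the last two sublayers is an assumption about the base combinator, not a verified implementation):

    def attention_dataflow(attn, x):
        # Project the input into query, key, and value representations.
        query = attn.input_to_query(x)
        key = attn.input_to_key(x)
        value = attn.input_to_value(x)
        # Combine queries and keys into attention weights, then apply the
        # weights to the values. Passing 2-tuples here is an assumption
        # about the base combinator's convention.
        weights = attn.query_key_to_attn((query, key))
        return attn.attn_value_to_output((weights, value))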

Inherited Methods


attributes_dict()
    Constructs a dictionary with all of the fields in the class.

from_attributes(**field_values)
    Directly instantiates a struct given all of its fields.

input_structure()
    Returns the input structure of this layer.

key_for_field(field_name)
    Generates a JAX PyTree key for a given field name.

output_structure()
    Returns the output structure of this layer.

select()
    Wraps this struct in a selection, enabling functional-style mutations.

tree_flatten()
    Flattens this tree node.

tree_flatten_with_keys()
    Flattens this tree node with keys.

tree_unflatten(aux_data, children)
    Unflattens this tree node.

treescope_color()
    Computes a CSS color to display for this object in treescope.

__call__(x)
    Runs the attention computation.

classmethod from_config(config: GemmaTransformerConfig) → GemmaAttention

Builds a GemmaAttention block from a configuration.

Parameters:
    config – The configuration of the Gemma model.

Returns:
    A GemmaAttention block with uninitialized parameters.
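Because the returned block is uninitialized, its parameters must be filled in before the layer can be called. A minimal sketch of that follow-up step, assuming penzai exposes an initialize_parameters helper under pz.nn (the helper's name and signature are assumptions):

    import jax
    from penzai import pz

    # Hypothetical follow-up: replace the uninitialized parameters with
    # randomly initialized values. The helper name is an assumption about
    # penzai's parameter-initialization API.
    attn = model_core.GemmaAttention.from_config(config)
    initialized_attn = pz.nn.initialize_parameters(attn, jax.random.PRNGKey(0))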