GemmaTransformerBlock

class penzai.example_models.gemma.model_core.GemmaTransformerBlock

Bases: Sequential

Main decoder block for the Gemma transformer architecture.

GemmaTransformerBlock is a tagged alias of pz.nn.Sequential: it runs its sublayers in sequence and adds no behavior of its own. It has a distinct type so that selectors can identify it easily, and it can be constructed from a GemmaTransformerConfig.
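The "tagged alias" idea can be sketched in plain Python. This is only an illustration under stated assumptions: `Sequential`, `TransformerBlock`, and `find_instances` below are hypothetical stand-ins written for this example, not penzai's classes or its selector API. The sketch shows how a subclass that adds no behavior can still be located by type inside a larger model.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Sequential:
    """Hypothetical stand-in for pz.nn.Sequential: runs sublayers in order."""
    sublayers: List[Callable[[Any], Any]]

    def __call__(self, value: Any) -> Any:
        for layer in self.sublayers:
            value = layer(value)
        return value


@dataclass
class TransformerBlock(Sequential):
    """Tagged alias: identical behavior to Sequential, but a distinct type."""


def find_instances(tree: Any, cls: type) -> list:
    """Collects every node of type `cls` (a rough analogue of selecting by type)."""
    found = []
    if isinstance(tree, cls):
        found.append(tree)
    if isinstance(tree, Sequential):
        for layer in tree.sublayers:
            found.extend(find_instances(layer, cls))
    return found


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# A toy "model": two tagged blocks followed by a plain layer.
model = Sequential([
    TransformerBlock([add_one, double]),
    TransformerBlock([double]),
    add_one,
])

blocks = find_instances(model, TransformerBlock)  # finds only the tagged blocks
result = model(3)  # the blocks themselves run exactly like Sequential
```

The outer `Sequential` is not matched by the search because only the subclass carries the tag, which is the reason for giving the block its own type.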

Methods

__init__(sublayers)

from_config(config)

Builds a GemmaTransformerBlock from a configuration.

Attributes

sublayers

Inherited Methods

attributes_dict()

Constructs a dictionary with all of the fields in the class.

from_attributes(**field_values)

Directly instantiates a struct given all of its fields.

input_structure()

Returns the input structure of this layer.

key_for_field(field_name)

Generates a JAX PyTree key for a given field name.

output_structure()

Returns the output structure of this layer.

select()

Wraps this struct in a selection, enabling functional-style mutations.

tree_flatten()

Flattens this tree node.

tree_flatten_with_keys()

Flattens this tree node with keys.

tree_unflatten(aux_data, children)

Unflattens this tree node.

treescope_color()

__call__(value)

Runs each of the sublayers in sequence.

classmethod from_config(config: GemmaTransformerConfig) → GemmaTransformerBlock

Builds a GemmaTransformerBlock from a configuration.

Parameters:

config – The configuration of the Gemma model.

Returns:

A GemmaTransformerBlock with uninitialized parameters.
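As a rough illustration of what "built from a configuration, with uninitialized parameters" can look like, the following plain-Python sketch derives only parameter shapes from a config and leaves the values unset. Everything here is an assumption made for the example: `ToyConfig`, `UninitializedLinear`, and `ToyBlock` are hypothetical names, and the real GemmaTransformerConfig fields and the sublayers that from_config actually creates are not shown on this page.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class ToyConfig:
    """Hypothetical config; the real GemmaTransformerConfig has its own fields."""
    embedding_dim: int
    hidden_dim: int


@dataclass
class UninitializedLinear:
    """Placeholder layer: records its weight shape but holds no values yet."""
    shape: Tuple[int, int]
    weights: Optional[Any] = None


@dataclass
class ToyBlock:
    sublayers: List[Any]

    @classmethod
    def from_config(cls, config: ToyConfig) -> "ToyBlock":
        # Only shapes are derived from the config; no parameter values are created.
        return cls(sublayers=[
            UninitializedLinear((config.embedding_dim, config.hidden_dim)),
            UninitializedLinear((config.hidden_dim, config.embedding_dim)),
        ])


block = ToyBlock.from_config(ToyConfig(embedding_dim=8, hidden_dim=32))
shapes = [layer.shape for layer in block.sublayers]
```

In this sketch the structure of the block is fixed by the config alone, and a separate initialization step would be needed before the block could be run on data.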