gpt_neox_from_huggingface_model

penzai.models.transformer.variants.gpt_neox.gpt_neox_from_huggingface_model(model: GPTNeoXForCausalLM, upcast_activations_to_float32: bool = False, use_layer_stack: bool = False) → model_parts.TransformerLM

Converts a GPT-NeoX model to a Penzai model.

This function converts GPT-NeoX models from their HuggingFace implementations to Penzai.

Note: Checkpoint conversion is implemented only for the most common set of hyperparameters for GPT-NeoX models, such as those used by GPT-NeoX-20B and the Pythia scaling suite.

Parameters:
  • model – The HuggingFace GPT-NeoX model.

  • upcast_activations_to_float32 – Whether to cast activations to float32 when the model runs. This allows analyzing activations at higher precision without consuming additional memory for parameters.

  • use_layer_stack – Whether to use a layer stack for the decoder blocks.

Returns:

A Transformer model containing the loaded parameters.
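
Example:

The following is a minimal usage sketch, not part of the original reference. It assumes that the `transformers` and `penzai` packages are installed. The helper name `load_gpt_neox_as_penzai` and the default checkpoint ID `EleutherAI/pythia-70m-deduped` are illustrative choices rather than anything this function requires. Imports are deferred into the helper so that defining it does not download or load anything.

```python
def load_gpt_neox_as_penzai(
    checkpoint: str = "EleutherAI/pythia-70m-deduped",  # illustrative checkpoint ID
    *,
    upcast_activations_to_float32: bool = False,
    use_layer_stack: bool = False,
):
    """Hypothetical helper: load a HuggingFace GPT-NeoX checkpoint and convert it."""
    # Deferred imports: nothing heavy is loaded until the helper is called.
    import transformers
    from penzai.models.transformer import variants

    # Load the HuggingFace implementation of the model (downloads weights
    # on first use if they are not already cached).
    hf_model = transformers.GPTNeoXForCausalLM.from_pretrained(checkpoint)

    # Convert to a Penzai model, forwarding the two optional flags
    # documented above.
    return variants.gpt_neox.gpt_neox_from_huggingface_model(
        hf_model,
        upcast_activations_to_float32=upcast_activations_to_float32,
        use_layer_stack=use_layer_stack,
    )
```

Calling `load_gpt_neox_as_penzai()` should return the converted model with default settings; passing `upcast_activations_to_float32=True` requests float32 activations while leaving the parameters in the precision they were loaded with.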