gpt_neox_from_huggingface_model
- penzai.experimental.v2.models.transformer.variants.gpt_neox.gpt_neox_from_huggingface_model(model: GPTNeoXForCausalLM, upcast_activations_to_float32: bool = False, use_layer_stack: bool = False) -> model_parts.TransformerLM
Converts a GPT-NeoX model to a Penzai model.
This function converts GPT-NeoX models from their HuggingFace implementations to Penzai.
Note: Checkpoint conversion is only implemented for the most common set of hyperparameters for GPT-NeoX models, including GPT-NeoX-20B and the Pythia scaling suite.
- Parameters:
model – The HuggingFace GPT-NeoX model (a GPTNeoXForCausalLM instance).
upcast_activations_to_float32 – Whether to cast activations to float32 when the model runs. This allows analyzing activations at higher precision without consuming additional memory for parameters.
use_layer_stack – Whether to use a layer stack for the decoder blocks.
- Returns:
A Penzai TransformerLM model containing the loaded parameters.
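A minimal usage sketch follows, assuming that both penzai and the HuggingFace transformers library (with its PyTorch backend) are installed. The helper name load_pythia_as_penzai and the default checkpoint name "EleutherAI/pythia-70m" are illustrative assumptions, not part of the Penzai API; the import path is taken from the signature above.

```python
def load_pythia_as_penzai(
    model_name: str = "EleutherAI/pythia-70m",  # assumed example checkpoint
    upcast_activations_to_float32: bool = False,
    use_layer_stack: bool = False,
):
    """Hypothetical helper: load a HuggingFace GPT-NeoX checkpoint and convert it."""
    # Imported lazily so that defining this helper does not itself require
    # the heavy dependencies to be importable.
    import transformers
    from penzai.experimental.v2.models.transformer.variants import gpt_neox

    # Load the HuggingFace implementation of the model.
    hf_model = transformers.GPTNeoXForCausalLM.from_pretrained(model_name)

    # Convert it to a Penzai model, forwarding the two optional flags
    # documented above.
    return gpt_neox.gpt_neox_from_huggingface_model(
        hf_model,
        upcast_activations_to_float32=upcast_activations_to_float32,
        use_layer_stack=use_layer_stack,
    )
```

Calling load_pythia_as_penzai() downloads the checkpoint on first use, so it needs network access. Passing upcast_activations_to_float32=True requests float32 activations at run time, as described in the parameter list.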