simple_decoding_loop

A simple decoding loop for the Gemma model.

This module can be used to sample from Gemma in decoding mode, and it also serves as a starting point for more sophisticated sampling algorithms.

Classes

SamplingState

State that manages the set of decoded tokens during sampling.

Functions

advance_one_token(model, state, next_token)

Advances a sampling state by one token.

prefill(model, initial_cache_state, prompt, ...)

Prefills the key-value caches based on a prompt.

temperature_sample_pyloop(model, ...[, ...])

Runs temperature sampling in a Python for loop.
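
The overall flow that the functions above describe (prefill from a prompt, then advance one sampled token at a time) can be illustrated with a minimal, self-contained sketch. It does not use this module's API: the signatures above are elided, so they are not reproduced here. `ToyState`, `toy_logits`, `sample_with_temperature`, and `toy_temperature_sample` are hypothetical names introduced only for illustration, and the "model" is a hand-written scoring function with no key-value cache.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical stand-in for a sampling state: the tokens decoded so far."""
    tokens: list = field(default_factory=list)


def toy_logits(tokens, vocab_size=5):
    """Hypothetical 'model': favors the token after the last one, modulo vocab_size."""
    last = tokens[-1] if tokens else 0
    return [3.0 if t == (last + 1) % vocab_size else 0.0 for t in range(vocab_size)]


def sample_with_temperature(logits, temperature, rng):
    """Draws one token index from softmax(logits / temperature)."""
    if temperature == 0.0:
        # Assumption of this sketch: zero temperature means greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def toy_temperature_sample(prompt, max_new_tokens, temperature, seed=0):
    """Records the prompt, then appends one sampled token per loop iteration."""
    rng = random.Random(seed)
    state = ToyState(tokens=list(prompt))  # stands in for the prefill step
    for _ in range(max_new_tokens):
        logits = toy_logits(state.tokens)
        next_token = sample_with_temperature(logits, temperature, rng)
        state.tokens.append(next_token)  # stands in for advancing by one token
    return state.tokens
```

With a temperature of 0.0 the toy loop is deterministic: `toy_temperature_sample([0], 4, 0.0)` returns `[0, 1, 2, 3, 4]`. Higher temperatures flatten the distribution, so less likely tokens are drawn more often.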