simple_decoding_loop
A simple decoding loop for the Gemma model.
This can be used to sample from Gemma in decoding mode, and can also be used as a starting point for more sophisticated sampling algorithms.
Classes
State that manages the set of decoded tokens during sampling.
Functions
Advances a sampling state by one token.
Prefills the key-value caches based on a prompt.
Runs temperature sampling in a Python for loop.
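
The three steps listed above (prefill from a prompt, advance the state by one token, repeat in a Python for loop) can be illustrated with a minimal sketch. It uses only the Python standard library and a toy stand-in for the model. Every name here (`toy_logits`, `ToyState`, `toy_prefill`, `toy_advance_one_token`, `toy_temperature_sample`) is hypothetical and is not this module's API. The toy state also carries no key-value cache, which a real implementation would populate during prefill and update on every step.

```python
import math
import random
from dataclasses import dataclass


def toy_logits(tokens, vocab_size=5):
    """Hypothetical stand-in for a model: favors (last_token + 1) % vocab_size."""
    logits = [0.0] * vocab_size
    logits[(tokens[-1] + 1) % vocab_size] = 4.0
    return logits


@dataclass
class ToyState:
    """Hypothetical sampling state: tokens so far, next-token logits, and an RNG."""
    tokens: list
    next_logits: list
    rng: random.Random


def toy_prefill(prompt, seed=0):
    # A real prefill would also fill key-value caches; this toy version only
    # computes the logits for the token that follows the prompt.
    return ToyState(list(prompt), toy_logits(prompt), random.Random(seed))


def toy_advance_one_token(state, temperature=1.0):
    # Divide logits by the temperature, then sample from the softmax.
    scaled = [logit / temperature for logit in state.next_logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    token = state.rng.choices(range(len(weights)), weights=weights)[0]
    tokens = state.tokens + [token]
    return ToyState(tokens, toy_logits(tokens), state.rng)


def toy_temperature_sample(prompt, max_new_tokens, temperature=1.0, seed=0):
    # Prefill once, then advance one token at a time in a plain Python loop.
    state = toy_prefill(prompt, seed)
    for _ in range(max_new_tokens):
        state = toy_advance_one_token(state, temperature)
    return state.tokens
```

Lower temperatures concentrate probability on the highest-logit token, and higher temperatures flatten the distribution. Keeping the per-token step as its own function makes it straightforward to swap in a different sampling rule, which is the sense in which a loop like this can serve as a starting point for more sophisticated algorithms.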