
@copybara-service

Allows non-owned arguments for attention methods.

* Adds and uses a new `AttentionActivationPtrs` that holds non-owning `MatPtr`s, acting as a view into `AttentionActivations` (see the sketch after this list).
* Updates `QBatch` to hold non-owning `MatPtr`s to the KV caches.
* Enables the `MatPtrT` default constructor for simpler initializations.
* Pulls out and passes `LayerWeightsPtrs::query_norm_scale` directly. While `LayerWeightsPtrs` already held non-owning `MatPtr`s, this change avoids having to find and construct several empty weight tensors just to construct one `query_norm_scale` tensor.
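
A minimal C++ sketch of the non-owning view pattern this change describes. The types below are simplified stand-ins, not the real gemma.cpp definitions: the actual `MatPtrT`, `AttentionActivations`, and attention signatures carry more state and differ in detail.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for gemma.cpp's MatPtrT: a non-owning, trivially
// copyable view over a row-major matrix owned elsewhere. The default
// constructor (the feature this change enables on the real type) leaves
// the view empty, which simplifies member initialization.
template <typename T>
struct MatPtrT {
  MatPtrT() = default;  // empty view
  MatPtrT(T* data, size_t rows, size_t cols)
      : data(data), rows(rows), cols(cols) {}
  T* data = nullptr;
  size_t rows = 0, cols = 0;
};

// Owning storage, analogous to AttentionActivations: allocates the buffers.
struct AttentionActivations {
  std::vector<float> q_storage;
  MatPtrT<float> q;
  AttentionActivations(size_t rows, size_t cols)
      : q_storage(rows * cols), q(q_storage.data(), rows, cols) {}
};

// Non-owning view, analogous to AttentionActivationPtrs: holds only
// MatPtrs, so attention methods can accept activations they do not own.
struct AttentionActivationPtrs {
  MatPtrT<float> q;  // default-constructible thanks to MatPtrT() = default

  explicit AttentionActivationPtrs(const AttentionActivations& a) : q(a.q) {}
};

// An attention method now takes the view plus query_norm_scale directly,
// rather than the full LayerWeightsPtrs bundle it previously needed just
// to reach that one tensor.
void Attention(const AttentionActivationPtrs& acts,
               const MatPtrT<float>& query_norm_scale) {
  // ... operate on acts.q and query_norm_scale without owning either ...
}

int main() {
  AttentionActivations owned(/*rows=*/4, /*cols=*/8);
  MatPtrT<float> query_norm_scale;  // empty view is fine for this sketch
  Attention(AttentionActivationPtrs(owned), query_norm_scale);
}
```

Because the view holds only pointers, it is cheap to construct and copy, and callers can hand attention methods buffers they do not own, which is the point of the change.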

PiperOrigin-RevId: 823702392