@casteryh commented Nov 10, 2025

Summary

This change adds optional parameters to Generator.update_weights() to support in-flight weight updates without waiting for pending requests to complete.

Changes

  • Added wait_for_pending and reset_cache parameters to Generator.update_weights()
  • Made the blocking logic (waiting for pending requests) conditional on wait_for_pending
  • Made KV cache reset conditional based on reset_cache
  • Enhanced documentation with usage examples

Test Plan

  • Verified that the Python source compiles without syntax errors
  • Default behavior unchanged (backwards compatible)
  • New behavior available via explicit parameter flags

Summary:
This change adds optional parameters to Generator.update_weights() to support
in-flight weight updates without waiting for pending requests to complete.

The original blocking behavior is preserved as default, with new opt-in
parameters:
- wait_for_pending (default=True): When False, updates weights immediately
  without draining the request queue
- reset_cache (default=True): When False, preserves KV cache during updates

This enables faster weight updates during training. The cost is that
in-flight requests may switch weights mid-generation, so a single response
can be produced by more than one weight version.
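
The two flags described above can be illustrated with a minimal sketch. This is not the code from this PR: ToyGenerator, its fields, and _drain_pending are hypothetical stand-ins, and only the parameter names wait_for_pending and reset_cache and their defaults come from the summary.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGenerator:
    """Hypothetical stand-in for Generator, used only to show the flag logic."""

    weights_version: int = 0
    pending_requests: list = field(default_factory=list)
    kv_cache: dict = field(default_factory=dict)

    def _drain_pending(self) -> None:
        # Stand-in for blocking until in-flight requests have completed.
        self.pending_requests.clear()

    def update_weights(
        self,
        version: int,
        wait_for_pending: bool = True,
        reset_cache: bool = True,
    ) -> None:
        # Default path: drain the request queue before touching the weights.
        if wait_for_pending:
            self._drain_pending()
        self.weights_version = version
        # Default path: drop cache entries computed under the old weights.
        if reset_cache:
            self.kv_cache.clear()


# Opt-in in-flight update: pending requests and the KV cache are left alone.
gen = ToyGenerator(pending_requests=["req-1"], kv_cache={"req-1": "cached"})
gen.update_weights(1, wait_for_pending=False, reset_cache=False)

# Default call: the queue is drained and the cache is cleared.
gen.update_weights(2)
```

In this sketch, calling update_weights() with no flags takes the blocking path, which mirrors the backwards-compatible default described above.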

Test Plan:
- Verified Python syntax compiles successfully
- Default behavior unchanged (backwards compatible)
- New behavior available via explicit parameter flags
@meta-cla bot added the CLA Signed label Nov 10, 2025
Summary:
Added two new configuration fields to the Generator dataclass:
- wait_for_pending_on_update (default=True): Controls whether weight updates
  wait for pending requests to complete
- reset_cache_on_update (default=True): Controls whether to reset KV cache
  after weight updates

The update_weights() method now uses these config values as defaults, but
still allows per-call overrides via optional parameters.

This enables users to configure the behavior globally via config files while
maintaining flexibility for per-call customization.
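
One common way to combine config-level defaults with per-call overrides is a None sentinel, sketched below. The field names wait_for_pending_on_update and reset_cache_on_update come from the summary above; ToyGeneratorConfig and resolve_update_flags are hypothetical names for illustration, and the actual method may resolve the values differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyGeneratorConfig:
    """Hypothetical config holder; only the two field names are from the PR."""

    wait_for_pending_on_update: bool = True
    reset_cache_on_update: bool = True

    def resolve_update_flags(
        self,
        wait_for_pending: Optional[bool] = None,
        reset_cache: Optional[bool] = None,
    ) -> Tuple[bool, bool]:
        # None means "not specified for this call", so fall back to the config.
        if wait_for_pending is None:
            wait_for_pending = self.wait_for_pending_on_update
        if reset_cache is None:
            reset_cache = self.reset_cache_on_update
        return wait_for_pending, reset_cache


# Globally opt in to non-blocking updates, then override for a single call.
cfg = ToyGeneratorConfig(wait_for_pending_on_update=False)
config_defaults = cfg.resolve_update_flags()
one_call_override = cfg.resolve_update_flags(wait_for_pending=True)
```

Using None as the sentinel keeps an explicit False distinguishable from "not specified", which a plain boolean default could not express.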

Test Plan:
- Verified Python syntax compiles successfully
- Default behavior unchanged (backwards compatible)
- Config values can be set when instantiating Generator
- Per-call overrides still work as before
@casteryh changed the title from "[wip][do not review] Add switchable in-flight weight updates to Generator" to "[wip][do not review] enable pipeline rl" Nov 10, 2025