Add MaskedRepetitionContext for VLM Image Token Exclusion
Overview
This PR introduces MaskedRepetitionContext, a new LogitProcessor that extends the existing repetition penalty functionality to support excluding specific tokens (such as image tokens in Vision-Language Models) from repetition penalties.
Problem
In Vision-Language Models (VLMs), image patch tokens often need to repeat naturally to represent visual content. The existing RepetitionContext applies penalties to all repeated tokens, which can degrade VLM performance by incorrectly penalizing legitimate image token repetitions.
Solution
MaskedRepetitionContext accepts a boolean mask array that identifies which tokens should be excluded from the repetition penalty calculation.
Basic Usage Example
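As a rough, standalone illustration of the masking idea, the sketch below applies a repetition penalty to plain Swift arrays while skipping masked history entries. It is not the PR's API: the function name applyMaskedPenalty, its parameters, and the assumption that the mask is aligned position by position with the token history are all hypothetical, and [Float] stands in for the library's tensor type. The penalty rule used (divide positive logits, multiply negative ones) is the conventional one and is assumed here.

```swift
// Standalone sketch: plain Swift arrays stand in for the library's tensor type.
// `applyMaskedPenalty` is a hypothetical name, not an API from this PR.
func applyMaskedPenalty(
    logits: [Float],
    history: [Int],
    excluded: [Bool],
    penalty: Float
) -> [Float] {
    // Assumption: one mask entry per history position (true = exclude from penalty).
    precondition(history.count == excluded.count, "mask must match history length")
    var result = logits

    // Collect distinct unmasked token ids so each logit is penalized at most once.
    var penalized = Set<Int>()
    for (token, isExcluded) in zip(history, excluded) where !isExcluded {
        penalized.insert(token)
    }

    for token in penalized where logits.indices.contains(token) {
        let value = logits[token]
        // Conventional rule: shrink positive logits, push negative ones further down.
        result[token] = value < 0 ? value * penalty : value / penalty
    }
    return result
}

let logits: [Float] = [2.0, -1.0, 4.0, 0.5]
// Token 2 plays the role of an image token and is masked out.
let history = [0, 2, 2, 1]
let excluded = [false, true, true, false]
let adjusted = applyMaskedPenalty(
    logits: logits, history: history, excluded: excluded, penalty: 2.0)
print(adjusted)  // [1.0, -2.0, 4.0, 0.5]
```

Token 2 repeats in the history but keeps its original logit because both of its occurrences are masked, while tokens 0 and 1 are penalized as usual.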
Integration with TokenIterator
Files Changed
MaskedRepetitionContext implementation
Tests/MLXLMTests/RepetitionPenaltyTests.swift: Comprehensive test suite
mlx-swift-examples.xcodeproj/project.pbxproj: Added test file to build system
Key Features
✅ Backward Compatible: Implements the same LogitProcessor interface as RepetitionContext
✅ Flexible Masking: Supports any token types that should be excluded from the penalty
✅ Efficient Implementation: Uses circular buffer with O(1) operations
✅ VLM Optimized: Designed specifically for Vision-Language Model requirements
✅ Comprehensive Testing: Full test coverage including edge cases
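To make the circular-buffer point concrete, here is a minimal sketch of a fixed-size window that stores each token together with its mask bit and overwrites the oldest entry in constant time. The type MaskedTokenWindow and its members are hypothetical names chosen for illustration; the PR's actual storage layout may differ.

```swift
// Hypothetical fixed-size window of (token, excluded) pairs; not the PR's type.
struct MaskedTokenWindow {
    private var tokens: [Int]
    private var excluded: [Bool]
    private var next = 0          // slot the next append writes to
    private(set) var count = 0    // number of valid slots, at most the capacity

    init(capacity: Int) {
        precondition(capacity > 0, "capacity must be positive")
        tokens = Array(repeating: 0, count: capacity)
        excluded = Array(repeating: false, count: capacity)
    }

    // O(1): once the window is full, the oldest slot is overwritten.
    mutating func append(token: Int, isExcluded: Bool) {
        tokens[next] = token
        excluded[next] = isExcluded
        next = (next + 1) % tokens.count
        count = min(count + 1, tokens.count)
    }

    // Token ids currently in the window that are still subject to the penalty.
    var penalizedTokens: Set<Int> {
        var result = Set<Int>()
        for slot in 0..<count where !excluded[slot] {
            result.insert(tokens[slot])
        }
        return result
    }
}

var window = MaskedTokenWindow(capacity: 3)
window.append(token: 7, isExcluded: false)
window.append(token: 9, isExcluded: true)   // e.g. an image token
window.append(token: 4, isExcluded: false)
window.append(token: 5, isExcluded: false)  // overwrites token 7
print(window.penalizedTokens.sorted())      // [4, 5]
```

Keeping the mask bit in the same slot as its token means an evicted token's mask is evicted with it, so the two arrays never drift out of alignment.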
Testing
Running the Tests
To run the comprehensive test suite for repetition penalty functionality:
What We're Testing
The test suite (RepetitionPenaltyTests.swift) validates:
testBasicRepetitionContext: Verifies existing RepetitionContext functionality remains intact
testMaskedRepetitionContextBasic: Tests basic masking behavior - masked tokens are excluded from penalty
testMaskedRepetitionContextAllMasked: Edge case where all tokens are masked (no penalties applied)
testMaskedRepetitionContextDuringGeneration: Complex scenario simulating actual generation with mixed masked/unmasked tokens
testMaskedRepetitionContextCircularBuffer: Validates circular buffer behavior when context window is exceeded
testMaskedRepetitionContextFallbackBehavior: Tests backward compatibility when no mask is provided
testMaskedRepetitionContextPreconditions: Validates error handling and input validation
testComparisonBetweenProcessors: Direct comparison between RepetitionContext and MaskedRepetitionContext behavior
Test Coverage Highlights
Benefits for VLMs
Breaking Changes
None. This is a purely additive feature that maintains full backward compatibility with existing RepetitionContext usage.