fix(bug): Prevent node duplication during leftover node allocation #209
📋 PR Title Format
The PR title should follow the format:
Where:
- type is one of: feat, fix, docs, refactor, perf, test, chore.
- scope is optional and describes the part of the codebase affected (e.g., auth, ui, api).
- concise message is a short description of the change (max 50 chars).

📝 Change Type
Please select the type of change this PR introduces (choose one or more):
💡 Description
I discovered a bug where self.nodes contains duplicate node entries after bootstrap() completes. This occurs when some nodes remain unallocated after the initial pipeline construction.
I am new to this, so please let me know whether this behavior is intended, whether I am misunderstanding it, or whether it is indeed a bug.
The following test consistently produces 3 nodes instead of the expected 2:
Expected self.nodes:
[Node(node_id='a100-0'...), Node(node_id='5090-1'...)]
but the list ends up with three entries, one of them a duplicate.

During global_allocation(), when nodes remain unallocated after pipeline construction:
1. allocate_left_over_nodes() iterates through the unallocated nodes
2. It calls join(node) → declare(node) for each node
3. declare() checks: if node.node_id not in self.node_id_to_node
4. Because node_id_to_node is initialized as an empty dict in __init__, this check always passes for these nodes
5. self.nodes.append(node) adds the unallocated node again, creating a duplicate

The issue is that node_id_to_node is never pre-populated with the initial nodes passed to the allocator, so declare() cannot detect that the nodes already exist in self.nodes. This happens with both allocator strategies whenever unallocated nodes are left over.

Key Changes
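The failure mode and the proposed fix can be illustrated with a minimal sketch. This is not the project's actual code: `SketchAllocator`, the `prepopulate` flag, and the one-field `Node` are hypothetical stand-ins, and the sketch assumes that `declare()` records the node in `node_id_to_node` after appending it and that `5090-1` is the node left unallocated.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Simplified stand-in; the real Node carries more fields.
    node_id: str


class SketchAllocator:
    """Hypothetical, stripped-down stand-in for BaseLayerAllocator."""

    def __init__(self, nodes, prepopulate):
        self.nodes = list(nodes)
        # Behavior described in this PR: the dict starts out empty.
        # Proposed fix: index the initial nodes by node_id up front.
        self.node_id_to_node = (
            {node.node_id: node for node in self.nodes} if prepopulate else {}
        )

    def declare(self, node):
        # Register the node only if its id is not already known.
        if node.node_id not in self.node_id_to_node:
            self.nodes.append(node)
            # Assumption: the real declare() also records the node here.
            self.node_id_to_node[node.node_id] = node

    def join(self, node):
        self.declare(node)

    def allocate_left_over_nodes(self, unallocated):
        for node in unallocated:
            self.join(node)


def node_ids_after_leftover_allocation(prepopulate):
    a100, rtx5090 = Node("a100-0"), Node("5090-1")
    allocator = SketchAllocator([a100, rtx5090], prepopulate=prepopulate)
    # Assumption: "5090-1" was left unallocated after pipeline construction.
    allocator.allocate_left_over_nodes([rtx5090])
    return [node.node_id for node in allocator.nodes]


print(node_ids_after_leftover_allocation(prepopulate=False))
# ['a100-0', '5090-1', '5090-1']
print(node_ids_after_leftover_allocation(prepopulate=True))
# ['a100-0', '5090-1']
```

With the empty dict, the leftover node passes the membership check and is appended a second time; with the pre-populated dict, `declare()` recognizes it and leaves `self.nodes` unchanged.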
- Pre-populate the node_id_to_node dict in BaseLayerAllocator.__init__ to prevent declare() from re-adding existing nodes

🔗 Related Issues
List any issues this PR closes or relates to:
Closes #123

✅ Checklist
Please ensure the following points are addressed before merging: