@owena11 owena11 commented Oct 17, 2025

This is a proposed fix for #10862 to allow consistent usage between sel and drop_sel for MultiIndexes. The current implementation seems to convert the labels to an array to handle the edge case of labels of type xr.DataArray. This change makes that edge case more explicit and delegates the responsibility for converting to an array, if needed, down to the indexer.

Existing indexes already do this, so the change shouldn't have any performance implications.

Most labels passed into `drop_sel` can be handled by the underlying
libraries, which will convert them to an array as the current implementation
does. xr.DataArray is a special case: it is supported as a set of labels
but doesn't interact well with pandas' conversion to an array.
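
To illustrate the delegation idea, here is a minimal sketch using pandas and NumPy directly. It is not the diff in this PR: the index, its level names, and the variable names are made up for the example, and it only shows why handing labels to the index untouched can behave differently from casting them first.

```python
import numpy as np
import pandas as pd

# A small two-level index, standing in for a MultiIndex coordinate
# (the level names are hypothetical).
index = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["letter", "number"]
)

# One full MultiIndex key, written as a tuple.
labels = [("a", 1)]

# Casting the labels eagerly loses the tuple structure: the single key
# becomes a 2-D array of strings instead of one (str, int) label.
eager = np.asarray(labels)
print(eager.shape, eager.dtype.kind)

# Passing the labels through unchanged lets the index interpret them
# itself and drop exactly the one matching entry.
remaining = index.drop(labels)
print(list(remaining))
```

In this sketch the only conversion happens inside `MultiIndex.drop`, which is the sense in which the responsibility is delegated to the index.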

welcome bot commented Oct 17, 2025

Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
If you have questions, some answers may be found in our contributing guidelines.

@owena11 owena11 changed the title from "Fix drop_sel MultiIndex" to "Fix drop_sel for a MultiIndex" Oct 17, 2025
Data variables:
A (x, y) int64 32B 0 2 3 5
"""
from xarray.core.dataarray import DataArray
Author

I've imported DataArray here as that seems to fit with the style in other parts of the codebase (here); however, I'm not entirely sure why this is done.

@max-sixty
Collaborator

I don't know this area that well, though it does seem to solve the immediate problem.

Would anyone have alternative approaches that get at the root of the issue?

Otherwise I would suggest merging this, maybe with a comment explaining, and then nothing prevents refactoring later...

(adding @benbovy, the indexing Czar, though others will also know more)

@owena11
Author

owena11 commented Oct 20, 2025

> I don't know this area that well,

I have to admit the same! To try to provide more context, I've dived down the git blame rabbit hole to explain the original reason for the cast to an array type within the code.

It was introduced in PR #3177. That PR introduces a whole bunch of type hints for the old drop method (before the split into drop_vars / drop_sel), which referenced indexes: 'OrderedDict[Any, pd.Index]'. At that time, Index.drop in pandas didn't have a type signature apart from its docstring, unless the types were stored elsewhere. So the best theory I have is that it was related to matching the types from somewhere in the chain.

Development

Successfully merging this pull request may close these issues.

Differing behaviour between sel and drop_sel for MultiIndexes