Skip to content

Conversation

@slinder1
Copy link

@slinder1 slinder1 commented Nov 7, 2025

The existing check for this case only comes after a derefence of what can be an iterator sentinel (leading to an assert).

This may not be purely NFC in that it also avoids queuing the effectively-empty region for rescheduling, but AFAICT this should be purely an optimization.

Testing this seems difficult, as the high-level scheduler avoids scheduling these "empty" regions. This means a reproducer has to depend on behavior of the scheduler passes before PreRARematStage in order to craft a region which triggers the bug.

Since this is a release blocker I am posting a PR now, as both Shore Shen and I have manually verified that this resolves the particular crash from SWDEV-564142 but I am still working on making a reasonable test.

(cherry picked from commit 004cfea)

The existing check for this case only comes after a derefence of what
can be an iterator sentinel (leading to an assert).

This may not be purely NFC in that it also avoids queuing the
effectively-empty region for rescheduling, but AFAICT this should be
purely an optimization.

Testing this seems difficult, as the high-level scheduler avoids
scheduling these "empty" regions. This means a reproducer has to depend
on behavior of the scheduler passes before PreRARematStage in order to
craft a region which triggers the bug.

Since this is a release blocker I am posting a PR now, as both Shore
Shen and I have manually verified that this resolves the particular
crash from SWDEV-564142 but I am still working on making a reasonable
test.

(cherry picked from commit 004cfea)
@z1-cciauto
Copy link
Collaborator

@slinder1 slinder1 changed the title [AMDGPU] Handle empty-except-for-DI regions in PreRARematerialize [DoNotMerge][AMDGPU] Handle empty-except-for-DI regions in PreRARematerialize Nov 7, 2025
@slinder1 slinder1 changed the title [DoNotMerge][AMDGPU] Handle empty-except-for-DI regions in PreRARematerialize [AMDGPU] Handle empty-except-for-DI regions in PreRARematerialize Nov 7, 2025
@bcahoon bcahoon self-requested a review November 7, 2025 22:14
@arsenm
Copy link

arsenm commented Nov 8, 2025

Testcase?

@arsenm
Copy link

arsenm commented Nov 8, 2025

Why is this direct to mainline?

@ronlieb
Copy link
Collaborator

ronlieb commented Nov 9, 2025

!PSDB

@z1-cciauto
Copy link
Collaborator

@ronlieb ronlieb merged commit 5961ba0 into amd-mainline Nov 9, 2025
9 checks passed
@ronlieb ronlieb deleted the amd/dev/slinder1/amd-mainline/SWDEV-563725_rematerialize_debug_info branch November 9, 2025 18:08
@slinder1
Copy link
Author

Why is this direct to mainline?

Originally merged into amd-staging @ #516 and I intend to go upstream, but as you note I don't have a reasonable testcase yet.

The issue with building a test is that this affects a phase of the scheduler in a way that depends on the details of earlier phases. There is also a lot of IR to churn through trying to get to a reproducer, and naively running llc or llc -mcpu=gfx1030 -mtriple=amdgcn-amd-amdhsa -O3 -amdgpu-spill-cfi-saved-regs --frame-pointer=none on the intermediate IR doesn't reproduce the issue.

I'm still working on it, though

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants