8358725: RunThese30M: assert(nm->insts_contains_inclusive(original_pc)) failed: original PC must be in the main code section of the compiled method (or must be immediately following it) #27985
If we only print the stack up to the signal handler, and not all the way to the allocation, then won't the resulting stack trace be confusing for the reader?
Yeah and it really should not do that. :(
The stack trace starts at the most recent stack frames (the allocation) and works backwards through the callers (signal handler). So the signal handler frame looks like a special entry or "first" frame, similar to a thread start function or libc init code.
I agree, but I am not addressing that in this fix. There might be other legitimate reasons for wanting a stack trace during a signal handler, other than NMT allocation tracking, though off the top of my head I can't think of any :-)
Sorry, I'm confused about which part of the stack will be printed after this fix: the part preceding the signal handler, or the part after it.
After, temporally. The history before the signal handler happened is erased. This should match the meaning of os::is_first_C_frame().
// We are called from a signal handler, so stop the stack backtrace here.
// See os::is_first_C_frame() in os::get_native_stack().
os::FirstNativeFrameMark fnfm;
Won't this break stack-walking in hs_err file generation when we get a SEGV for example?
Yes, I suppose so. Good catch. We shouldn't consider the "first frame" marker if we are starting before it.
The hs_err stack trace only seems to report from before the signal handler, though I'm unclear if that is because the signal context sets the initial frame, or because we skip over things till we get to that point.
If we call report_and_die from the signal handler, we use the signal context as the starting point to walk the stack. So it shouldn't affect the hs_err stack trace unless we crashed in the signal handler itself somewhere before calling report_and_die (and that we don't hit the second time). I guess we still want to support that case.
The other case I was thinking of is when we call report_and_die due to hitting an assert and there is no context set, but then we crash before printing the stack. The second attempt to print the stack is then done within the signal handler, and VMError::_context is still nullptr (it is only set the first time), so we start walking from the current frame. I tested this case with this patch applied and we walk the full stack, including the signal handler, fine. I was confused at first, but then realized we execute the secondary signal handler, which doesn't have this mark.
I can confirm David's concern that if we get a SEGV, then the stack backtrace only contains a single frame. One solution would be to do something more like anchor frames, which are chained. Imagine wanting to start a stack walk in the middle of the stack between anchor frame A and anchor frame B, and stop at the anchor frame boundary. That is comparable to what error reporting is attempting to do by providing a saved context as a starting point.
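The chained-anchor idea can be sketched with a toy model. This is an illustration only, not HotSpot code and not the patch under review: `ToyFrame`, `ToyAnchor`, `boundary_for` and `toy_walk` are hypothetical names, frames are a plain vector ordered from most recent to oldest, and the stack is assumed to grow downward so older frames have higher `sp` values. The walk picks the nearest anchor that is not younger than its starting frame, so a walk that starts from a saved context older than the signal handler ignores the handler's anchor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy frame: a name and a stack pointer. Older frames have higher sp
// (assumption: the stack grows downward).
struct ToyFrame {
  std::string name;
  std::uintptr_t sp;
};

// Toy anchor: marks a frame boundary and links to the next older anchor.
struct ToyAnchor {
  std::uintptr_t sp;
  const ToyAnchor* older;
};

// Return the nearest anchor that is at or older than start_sp.
// Anchors younger than the starting frame are skipped.
static const ToyAnchor* boundary_for(const ToyAnchor* newest,
                                     std::uintptr_t start_sp) {
  for (const ToyAnchor* a = newest; a != nullptr; a = a->older) {
    // The chain must be ordered from youngest to oldest.
    assert(a->older == nullptr || a->older->sp > a->sp);
    if (a->sp >= start_sp) return a;
  }
  return nullptr;
}

// Walk from stack[start] toward older frames and stop at the boundary.
// The boundary frame itself is included as the last entry.
static std::vector<std::string> toy_walk(const std::vector<ToyFrame>& stack,
                                         std::size_t start,
                                         const ToyAnchor* newest) {
  std::vector<std::string> out;
  if (start >= stack.size()) return out;
  const ToyAnchor* stop = boundary_for(newest, stack[start].sp);
  for (std::size_t i = start; i < stack.size(); ++i) {
    out.push_back(stack[i].name);
    if (stop != nullptr && stack[i].sp >= stop->sp) break;
  }
  return out;
}
```

In this model, a walk that starts at the allocation stops at the signal handler's anchor, while a walk that starts at the interrupted frame (the analogue of a saved context) runs on to the next older anchor instead of stopping after a single frame.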
The problem is that code called from a signal handler, like SharedRuntime::handle_unsafe_access(), can call os::malloc(), and when NMT is enabled, we try to get a stack backtrace. But os::get_native_stack() does not know how to walk through signal handler frames.
This fix introduces FirstNativeFrameMark, which the POSIX signal handler uses to mark the frame at which the POSIX version of os::get_native_stack() should stop.
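A minimal sketch of the scoped-mark idea follows, using a simulated stack rather than real frames. Everything here is an assumption made for illustration: `ToyFrame`, `ToyFirstFrameMark`, `toy_is_first_frame` and `toy_get_native_stack` are hypothetical stand-ins and do not reflect the actual FirstNativeFrameMark implementation, which works on real stack and frame pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: a "frame" is a name plus a link to its caller.
struct ToyFrame {
  std::string name;
  const ToyFrame* caller;  // nullptr at the bottom of the stack
};

// Hypothetical per-thread slot holding the frame at which a walk stops.
static thread_local const ToyFrame* g_first_frame_mark = nullptr;

// RAII mark: set on entry to the (simulated) signal handler and
// restored on exit, so nested marks unwind correctly.
class ToyFirstFrameMark {
  const ToyFrame* _saved;
 public:
  explicit ToyFirstFrameMark(const ToyFrame* handler_frame)
      : _saved(g_first_frame_mark) {
    assert(handler_frame != nullptr);
    g_first_frame_mark = handler_frame;
  }
  ~ToyFirstFrameMark() { g_first_frame_mark = _saved; }
  ToyFirstFrameMark(const ToyFirstFrameMark&) = delete;
  ToyFirstFrameMark& operator=(const ToyFirstFrameMark&) = delete;
};

// Stand-in for an is_first_C_frame()-style check: the bottom of the
// stack or the marked frame ends the walk.
static bool toy_is_first_frame(const ToyFrame* f) {
  return f->caller == nullptr || f == g_first_frame_mark;
}

// Walk from the most recent frame toward its callers, stopping at the
// first frame (inclusive) or after max_frames entries.
static std::vector<std::string> toy_get_native_stack(const ToyFrame* top,
                                                     std::size_t max_frames) {
  std::vector<std::string> out;
  for (const ToyFrame* f = top; f != nullptr && out.size() < max_frames;
       f = f->caller) {
    out.push_back(f->name);
    if (toy_is_first_frame(f)) break;
  }
  return out;
}
```

While a `ToyFirstFrameMark` is live, a walk that starts at the allocation ends at the handler frame; once the mark goes out of scope, the same walk reaches the bottom of the simulated stack.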
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27985/head:pull/27985
$ git checkout pull/27985
Update a local copy of the PR:
$ git checkout pull/27985
$ git pull https://git.openjdk.org/jdk.git pull/27985/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27985
View PR using the GUI difftool:
$ git pr show -t 27985
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27985.diff