-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NativeAOT] thread suspension on Unix #71187
Conversation
d5325b2
to
e050f9b
Compare
// enum used by the runtime. | ||
GCRefKind GetGcRefKind(ReturnKind returnKind) | ||
{ | ||
static_assert((GCRefKind)ReturnKind::RT_Scalar == GCRK_Scalar, "ReturnKind::RT_Scalar does not match GCRK_Scalar"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unix x64 ABI can return values in two registers, and so there are variants like RT_Scalar_Obj
.
Do we need to handle them here? Also, it may be interesting to look into why you have not seen this problem in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lab testing that is currently enabled for this on Unix is fairly limited, I believe, and I have a limited capacity to stress-test the change locally so we may miss some cases.
The static assert tests some cases, but GCRefKind
translation is basically one-to-one, not even sure if the assert helps much here.
I am not changing GC decoding, so it may just work if it is the same data just stored/read at different location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps more expanded libs testing will provide failures and repro cases.
I am not yet sure what I'd be adding here for multreg returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thread::InternalHijack
needs to be fixed to hande the new cases.
You can try to build a targeted repro that has a lot of methods that return struct { int a; object b; }
. Also, check the generated code to make sure that this struct is indeed returned in rax
and rdx
registers (object b
should be in rdx).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will take a look. Thanks!
I kind of assumed that since we can stackwalk in scenarios that already work, we may already be handling all kinds of cases, but this is a new scenario.
/azp run runtime-extra-platforms |
Azure Pipelines successfully started running 1 pipeline(s). |
I do not think they are related to the change. |
@@ -318,7 +320,7 @@ C_FUNC(\Name): | |||
DEFAULT_FRAME_SAVE_FLAGS = PTFF_SAVE_ALL_PRESERVED + PTFF_SAVE_RSP | |||
|
|||
.macro PUSH_COOP_PINVOKE_FRAME trashReg | |||
push_nonvol_reg rbp // push RBP frame | |||
push_nonvol_reg rbp // push RBP frame // TODO: do we need this? not on windows. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the added TODO: I believe this is needed for unwindability. I would think that on Windows, it is needed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These macros are used to call GC helpers, not the user code, so I assume unwindability here would be useful for crash dumps and such.
The Windows version seems to be using RSP-based frame style, which makes it simpler as RBP is just a nonvol register. I think the pattern is unwindable, at least as much as needed for debugging.
I will keep the TODO here. I will be doing some cleanup of assembly helpers. I think we have a good deal of redundancy and will take a look at this as a part of cleanup.
Ideally we should use the same code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant unwinding when debugging the code (and core dump too.)
This macro was actually changed to use RBP based frame by @jkotas in dotnet/corert#4812
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Thanks! Saves me time to figure if the alternative works with debuggers (I now know it does not).
I will change Windows to use the same code then (in a separate change). Even if Windows is resilient to this, it is nicer to have the same code where possible.
|
||
// if we are in an epilogue, there will be at least one instruction left. | ||
int trailingEpilogueInstructions = 1; | ||
uint8_t* NextByte = (uint8_t*)pvAddress; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit - can you please fix the naming convention of some of the locals (start with lowercase)?
Yes, that's #72352 |
ASSERT(returnKind <= GCRK_LastValid); | ||
return (GCRefKind)(returnKind & (GCRK_Object | GCRK_Byref)); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExtractReg1ReturnKind can be moved here as well.
{ | ||
reinterpret_cast<void*>(RhpGcProbeHijack) | ||
}; | ||
#else // TARGET_ARM64 || TARGET_UNIX | ||
EXTERN_C void FASTCALL RhpGcProbeHijackScalar(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to switch the other platforms to the one-and-only hijack target plan (in separate PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It occured to me that using multiple probe targets on win-x64 as a way to pass return kind to the probe is not very rewarding optimization.
Perf of suspension code would be gated by how many syscalls we make and how many iterative attempts we do, so benefits here would be fairly small in comparison.
// | ||
// Such an opcode is an unambiguous epilogue indication. | ||
// | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to check for a breakpoint and give up if you see one.
(It is annoying that setting a breakpoint introduces GC crash.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or print debug output if you run into a breakpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 'int 3' in regular code will be naturally handled since it is not recognized as a part of epilogue anyways.
Placing a breakpoint in epilogue might be rare and would be hard to handle reliably, since I would not know what it replaced.
I will have to think about this. I can treat int 3
as a ret
if I have seen pops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or just return -1 if i see int 3
and treat that as not-hijackable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo comments. Thank you!
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
/azp run runtime-extra-platforms |
Azure Pipelines successfully started running 1 pipeline(s). |
quic failures are #72429 |
Thanks!!! |
Implements return address hijacking and asynchronous suspension at safe points on Unix.
Contributes to: #67805