-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NativeAOT] Some cleanup of assembly helpers in hijack area. #72542
Conversation
@@ -320,7 +320,7 @@ C_FUNC(\Name): | |||
DEFAULT_FRAME_SAVE_FLAGS = PTFF_SAVE_ALL_PRESERVED + PTFF_SAVE_RSP | |||
|
|||
.macro PUSH_COOP_PINVOKE_FRAME trashReg | |||
push_nonvol_reg rbp // push RBP frame // TODO: do we need this? not on windows. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have checked and Windows debuggers do not have issues with unwinding without introducing rbp frame, so I did not change anything.
@@ -1413,7 +1352,7 @@ COOP_PINVOKE_HELPER(void, RhpReversePInvoke, (ReversePInvokeFrame * pFrame)) | |||
if (pCurThread->InlineTryFastReversePInvoke(pFrame)) | |||
return; | |||
|
|||
RhpReversePInvokeAttachOrTrapThread2(pFrame); | |||
pCurThread->ReversePInvokeAttachOrTrapThread(pFrame); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the C++ code generated for this going to be as good as before? It looks like this was carefully setup to allow the slow path to be tailcalled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The slow path would be very rare. It handles a case when a thread has not seen managed code yet.
I thought that the helper was added to call from assembly and now we don't
I will check if we are still tailcalling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c++ actually inlines RhpReversePInvokeAttachOrTrapThread2
after the change.
The impact on the fast part of the method is not too bad, but keeping the slow path separate still has some minor benefits like pushing fewer registers in prolog.
I will revert this part.
for (size_t i = 0; i < ARRAY_SIZE(NormalHijackTargets); i++) | ||
{ | ||
if (NormalHijackTargets[i] == address) | ||
if (&RhpGcProbeHijack == address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indentation looks off.
;; Build a map of symbols representing offsets into the transition frame (see PInvokeTransitionFrame in | ||
;; rhbinder.h) and keep these two in sync. | ||
map 0 | ||
field OFFSETOF__PInvokeTransitionFrame__m_PreservedRegs | ||
field 10 * 8 ; x19..x28 | ||
m_CallersSP field 8 ; SP at routine entry | ||
field 19 * 8 ; x0..x18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no point in saving/restoring volatile registers as at return site they contain garbage.
lr,fp are saved as a part of stack alloc/dealloc, so no need to save lr again.
NZCV was not actually stored and there is no need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need to save returns though as they may have live data.
jnz LOCAL_LABEL(DoRhpGcProbe) | ||
ret | ||
|
||
LOCAL_LABEL(DoRhpGcProbe): | ||
or ecx, DEFAULT_FRAME_SAVE_FLAGS + PTFF_SAVE_RAX + PTFF_SAVE_RDX | ||
jmp C_FUNC(RhpGcProbe) | ||
NESTED_END RhpGcProbeHijack, _TEXT | ||
|
||
NESTED_ENTRY RhpGcProbe, _TEXT, NoHandler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RhpGcProbe
does not match what the funciton does now that RhpTrapThreads
check was removed from it. Should it be called RhpWaitForGC
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RhpGcProbe was not the best name.
The actual purpose of the function is to set up the stackwalk-enabling frame around the RhpWaitForGC2
, which does the actual waiting. It is hard to capture that in couple words.
RhpWaitForGC
seems a bit better. The function does typically result in waiting for GC.
@@ -326,8 +286,8 @@ __PPF_ThreadReg SETS "r2" | |||
NESTED_ENTRY RhpGcProbeRare |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete RhpGcProbe
and rename RhpGcProbeRare
to RhpGcProbe
to match other platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not put much effort in keeping 32bit platforms in sync. I only did some easy opportunistic changes.
I think we may be iterating on 64bit implementation a bit more and keeping 32bit behind is not a big deal. When/if we get to 32bit, absorbing all the changes should not be hard. It might even be that porting 64bit code, that works and tested, would be a shorter path than adjusting existing 32bit code. Probably it will be a mix of both approaches.
I will check if these changes are easy to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have fixed the entry points so that all platforms look the same - one RhpGcProbeHijack
that checks for the trap flag, and if set, sets bits for saved registers and calls RhpWaitForGC
. It might even work when we get to 32bit.
@@ -198,7 +162,8 @@ RhpGcProbe proc | |||
SynchronousRendezVous: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete RhpTrapThreads check to match other platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo comments. Thank you!
(as much as can be done without trying to compile and test)
/azp run runtime-extra-platforms |
Azure Pipelines successfully started running 1 pipeline(s). |
Thanks!! |
Mostly removing stuff to simplify porting to unix-arm64
(re:[NativeAOT] thread suspension on Unix #71187 (comment))
RhpGcProbeHijack
to be similar between win-x64, unix-x64, win-arm64.Especially the win-arm64 implementation that had many differences.
There should be no new functionality or any behavior changes here. Just deleting/refactoring code.