Mitigating a FaultException in Mechanism.Encode. #1095

Open: wants to merge 1 commit into base: master

@sonicpro sonicpro commented Jun 7, 2024

In some circumstances, the Msg class throws a NetMQ.FaultException in the Mechanism.Encode method. The details are in #1094.

closes #1094

@follesoe

Any chance you could have a look at this PR, @drewnoakes? I know I am not entitled to nag or demand from open-source maintainers, but it would be great to get some updates on the NetMQ core. Right now, I am maintaining my own fork and building from source rather than using the NuGet package to get some of these fixes included.

@drewnoakes
Member

While this does look like it'd suppress the exception, I'm not sure this is an actual fix. It might just allow the process to continue having skipped data or in some other invalid state, whereafter debugging the problem would probably be harder.

From the linked issue:

I think on some circumstances the code in NetMQ.Core.Utils.YQueue can return a non-initialized message.

Indeed, looking at YQueue you can see it's not null-annotated, and there are a bunch of expectations around how the type is used. I wonder whether you'd be able to run a version of NetMQ with an implementation of YQueue that validates all nullness guarantees, to see if that's what's really going on. A process dump when the exception is thrown should provide insight into what state the application was in when the failure occurred.

https://stackoverflow.com/a/20238046/24874

A dump can be opened in Visual Studio to analyze the state of the process at the time of the crash. Instances of YQueue, YPipe and so on can be inspected to check whether they are in a bad state.

I'm sympathetic to the problem here and want to find a solution that addresses the problem fully. It's just that the problem isn't well understood unfortunately.
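
To illustrate the kind of validation suggested above, here is a minimal, self-contained sketch. It is not NetMQ's YQueue and does not use its API: `ValidatingQueue<T>`, `DemoMsg` and the `IsInitialised` check on it are hypothetical names made up for this example, and the wrapper sits on top of a plain `Queue<T>`. The idea is only to fail fast, at the point where an uninitialised item enters or leaves the queue, rather than later in Mechanism.Encode.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a message type; this is NOT NetMQ's Msg.
public struct DemoMsg
{
    public byte[] Data;

    // Assumption for this sketch: a message counts as initialised once it has a buffer.
    public bool IsInitialised => Data != null;
}

// Hypothetical wrapper that checks every item on the way in and on the way out.
public sealed class ValidatingQueue<T>
{
    private readonly Queue<T> _inner = new Queue<T>();
    private readonly Func<T, bool> _isValid;

    public ValidatingQueue(Func<T, bool> isValid)
    {
        _isValid = isValid ?? throw new ArgumentNullException(nameof(isValid));
    }

    public void Push(T item)
    {
        // Reject invalid items at the producer side, where the stack trace is most useful.
        if (!_isValid(item))
            throw new InvalidOperationException("Attempted to push an invalid item.");
        _inner.Enqueue(item);
    }

    public T Pop()
    {
        if (_inner.Count == 0)
            throw new InvalidOperationException("Pop called on an empty queue.");

        T item = _inner.Dequeue();

        // A failure here would mean the item was corrupted while it sat in the queue.
        if (!_isValid(item))
            throw new InvalidOperationException("Popped an invalid item.");
        return item;
    }
}

public static class Program
{
    public static void Main()
    {
        var queue = new ValidatingQueue<DemoMsg>(m => m.IsInitialised);

        queue.Push(new DemoMsg { Data = new byte[] { 1, 2, 3 } });
        Console.WriteLine(queue.Pop().Data.Length);

        try
        {
            queue.Push(default(DemoMsg)); // uninitialised message
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

In a real experiment the same checks would have to be placed inside the actual YQueue implementation, since a wrapper like this one cannot observe its internal chunk handling.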

@sonicpro
Author

Hello @drewnoakes, I am going to collect dump data when the program crashes. However, it is a Windows service. Will the Windows Error Reporting method from the link you posted (https://stackoverflow.com/questions/20237201/best-way-to-have-crash-dumps-generated-when-processes-crash/20238046#20238046) work? The post says that Windows collects the data only after you click "Close the program" in the error message that reports the crash.

Successfully merging this pull request may close these issues.

Process crash due to an unhandled exception in Mechanism.Encode
3 participants