-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
segfault: buffer.View possibly released twice resulting in nil chunk #10696
Comments
cc: @manninglucas |
Looks like a use-after-free to me. This could be coming from inside netstack or from your application which uses reference counted parts of the netstack API. Looking through the netstack code nothing sticks out to me. UDP isn't too complex and that part of the API is fuzzed by syzkaller so I would surprised if there was something obviously broken there. Do you know which net protocol this connection was using (ipv4 vs ipv6)? Also, your "steps to reproduce" doesn't give me enough information on how to build and reproduce this issue myself. Adding some detail there will make it easier for me to help you debug this issue. |
Also, how frequently is this crash happening? Is it occasional or every time you try to run this code? |
Quite possible. The two places we do use the netstack's refcounting APIs comes from code adopted from netstack's
UDP IPv4 (it looks like a QUIC connection requested by udp: b85e6888d0a12280 (proxy? Exit) 192.168.0.144:40058 -> 157.240.23.128:443 for uid 10268
Apologies. What our Android app does:
The nil ptr is hit by gonet.UDPConn.Read() called by (upload)
Rare. Around once a week (uptime). Like you point out, this crash could totally be due to our app's incorrect use of ref-counting APIs. |
Thanks for the extra info. I wasn't able to determine the root cause after looking through both the gVisor UDP code and your code for some time. I will be AFK until next week, but will look at it again when I get back. In the meantime, @kevinGC could you take a look and see if you can find any ref counting issue here? |
Took a look and, while I don't have anything definitive, I wonder whether it could be related to the use of goroutines that starts in
I think that the first goroutine spawned here now has a pointer to a Perhaps try |
Thank you.
Looks like IncRef was introduced in ~Feb 2023 to resolve #8448 (comment) I couldn't find a way to DecRef ForwardRequest's PacketBuffer since it isn't exported.
iirc, none of the other FOSS projects (ex1, ex2) we looked at (at the time) DecRef'd in TCP/UDP Forwarders. gVisor's tests don't either: gvisor/pkg/tcpip/adapters/gonet/gonet_test.go Lines 403 to 424 in e89e736
Could DecRef be defer'd in ForwarderRequest.CreateEndpoint instead (udp.Forwarder provides no other way to process the handled PacketBuffer anyway other than ForwarderRequest.CreateEndpoint)? |
Looking now, I may have been wrong in #8458. It should probably be up to the caller of In any case, a leak isn't causing the panic. |
@ignoramous #10958 may fix this issue. Let me know if you see any improvement once it's merged. |
Description
buffer.View.chunk
is nil'd inbuffer.View.Release()
:gvisor/pkg/buffer/view.go
Line 108 in 8db16e8
And then transport.udp.endpoint.go possibly
DecRef()
s an already released buffer:gvisor/pkg/tcpip/transport/udp/endpoint.go
Line 233 in 8db16e8
One possibility is, the buffer was racy released bytransport.udp.Close()
:rcvMu
is held, and so this edge case is unlikely.This crash was reported by our Android app (cgo): celzero/firestack#74
I don't understand this code well enough to propose a fix (at a loss how pkg
ref_template
even works; for ex, where/howchunkRefs
defined or init'd?):gvisor/pkg/buffer/chunk.go
Line 77 in 8db16e8
Steps to reproduce
with
io.Copy(dst, gonet.UDPConn)
runsc version
docker version (if using docker)
No response
uname
No response
kubectl (if using Kubernetes)
No response
repo state (if built from source)
No response
runsc debug logs (if available)
No response
The text was updated successfully, but these errors were encountered: