Black screen and broken pipe (os error 32) after few hours of logging #5143

Closed
5xtm opened this issue Feb 8, 2024 · 5 comments
Labels
🪳 bug Something isn't working 👀 needs triage This issue needs to be triaged by the Rerun team

Comments


5xtm commented Feb 8, 2024

Describe the bug
I have deployed Rerun in a container with Podman. The container runs inference over 8 cameras every 5 seconds and displays bounding boxes. After a few hours (3-4 hours), if I open the web viewer or a native viewer, the viewer stays black even though it is connected to the WebSocket address. The container is not deployed on my machine but on a remote server with limited bandwidth (max 30 Mb/s).

To Reproduce
Steps to reproduce the behavior:

  1. Deploy and run the container with Rerun and Python. Use:
import rerun as rr

rr.init(RERUN_RECORDING_ID, spawn=False)
rr.serve(open_browser=False, server_memory_limit="100MB")
while True:
    # cameras_ip and get_image come from the reporter's setup (not shown)
    for camera_ip in cameras_ip:
        image = get_image(camera_ip)
        rr.log(camera_ip, rr.Image(image).compress(jpeg_quality=20))
  2. Wait a few hours (3-4 hours).
  3. Open the web viewer, or launch the native viewer with rerun ws://... on a separate computer (a personal one in this case).
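
For a rough sense of scale, the figures in this report can be combined into a back-of-the-envelope estimate of how long a newly connected viewer might wait for the buffered history. This is only a sketch under stated assumptions: that the server holds a full `server_memory_limit` backlog, that it sends all of it to a new client, that "100MB" means 100 * 10**6 bytes, and that the whole 30 Mb/s link is available. The function name is hypothetical.

```python
def backlog_transfer_seconds(backlog_bytes: float, link_megabits_per_s: float) -> float:
    """Seconds needed to send `backlog_bytes` over a link of the given speed."""
    link_bytes_per_s = link_megabits_per_s * 1_000_000 / 8
    return backlog_bytes / link_bytes_per_s


# Assumption: "100MB" is 100 * 10**6 bytes (the server log reports it as 95.4 MiB).
BACKLOG_BYTES = 100 * 10**6
# "max 30 Mb/s" from the report, read here as megabits per second.
LINK_MBIT_PER_S = 30

seconds = backlog_transfer_seconds(BACKLOG_BYTES, LINK_MBIT_PER_S)
print(
    f"{BACKLOG_BYTES / 2**20:.1f} MiB backlog takes about "
    f"{seconds:.1f} s at {LINK_MBIT_PER_S} Mb/s"
)
```

Under these assumptions the wait is on the order of tens of seconds, so bandwidth alone would account for a short empty period after connecting but not for a viewer that stays black indefinitely.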

Expected behavior
The viewer should display the logs, and not stay empty
Screenshots
Screenshot 2024-02-08 at 15 38 33

Backtrace
Logs in the container :

$ podman logs 6d46456d980f | grep error
[2024-02-08T10:57:42Z INFO  re_ws_comms::server] Listening for WebSocket traffic on ws://localhost:9877. Connect with a Rerun Web Viewer.
[2024-02-08T10:57:42Z INFO  re_web_viewer_server] Started web server on http://localhost:9090
[2024-02-08T10:57:42Z INFO  re_sdk::web_viewer] Web server is running - view it at http://localhost:9090?url=ws://localhost:9877
[2024-02-08T11:11:46Z INFO  re_ws_comms::server] Memory limit (95.4 MiB) exceeded. Dropping old log messages from the server. Clients connecting after this will not see the full history.
[2024-02-08T13:59:13Z ERROR re_ws_comms::server] Error processing connection: IO error: Broken pipe (os error 32)
[2024-02-08T14:03:01Z ERROR re_ws_comms::server] Error processing connection: IO error: Broken pipe (os error 32)
$ rerun ws://XXXX:9877
[2024-02-08T14:13:02Z INFO  re_data_source::web_sockets] Connecting to WebSocket server at "ws://XXXX:9877"…
[2024-02-08T14:13:02Z INFO  re_ws_comms::client] Connecting to "ws://XXXX:9877"…
[2024-02-08T14:13:02Z INFO  re_ws_comms::client] Connection established
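
`Broken pipe (os error 32)` is the operating system's EPIPE error: a write to a connection whose other end has already been closed. A minimal sketch of how that error arises, using only the Python standard library and a local socket pair instead of Rerun's WebSocket server. It assumes Linux or macOS, the function name is hypothetical, and it is not a claim about what actually happened inside the container.

```python
import errno
import socket


def write_after_peer_close() -> int:
    """Return the errno raised when sending to a peer that has already closed."""
    server_side, client_side = socket.socketpair()
    try:
        client_side.close()  # the "viewer" goes away
        try:
            server_side.sendall(b"log message")  # the "server" keeps writing
        except OSError as exc:
            return exc.errno
        return 0  # no error was observed
    finally:
        server_side.close()


code = write_after_peer_close()
print(code, errno.errorcode.get(code))
```

On Linux and macOS this is expected to report EPIPE, the same error number as in the server log. Read this way, the two ERROR lines would mean the server was still writing to a client connection that had been closed on the other side.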

Desktop (please complete the following information):
Container:

  • Linux user 4.18.0-477.15.1.el8_8.x86_64 SMP Fri Jun 2 08:27:19 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

Local:

  • Darwin user 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun 8 22:22:20 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6000 arm64

Rerun version
Container:

  • rerun-sdk==0.12.1

Local:

  • rerun_py 0.12.0 [rustc 1.74.0 (79e9716c9 2023-11-13), LLVM 17.0.4] aarch64-apple-darwin release-0.12.0 afa112e, built 2024-01-09T17:57:15Z

Additional info:
At the beginning, everything works fine.
5xtm added the 👀 needs triage and 🪳 bug labels Feb 8, 2024
Author

5xtm commented Feb 9, 2024

Kind of similar to this issue #1329

Member

Wumpf commented Apr 23, 2024

Apologies, looks like this fell through the cracks!
@5xtm in what way do you think explicit removal of data from the store would help with this?

I don't have a good idea yet of how to investigate the broken pipe. Does this happen regularly for you, and are you aware of a more minimal repro?

Member

Wumpf commented Apr 23, 2024

Member

Wumpf commented Oct 11, 2024

@5xtm have you tried a newer version of the viewer in the meantime?

Member

Wumpf commented Oct 22, 2024

We have fixed a number of memory issues since this was reported. I'm closing this since we don't have an actionable repro case right now. Please re-open if needed!

Wumpf closed this as not planned (won't fix, can't repro, duplicate, stale) Oct 22, 2024
2 participants