Do not finish execution when gRPC stream closed #89

Open
sfairat15 opened this issue Apr 27, 2023 · 13 comments
Labels
help wanted, kind/feature, lifecycle/rotten

Comments

@sfairat15

Motivation

We run Falco in our k8s cluster. When we update the Falco config (customRules, for example), all of its DaemonSet pods are recreated.

At the same time, all falco-exporter containers start restarting, with "gRPC stream closed" as the reason.

These mass container restarts trigger the "Too many container restarts" alert in our monitoring.

Feature

Could you add retry logic that reconnects to the gRPC server instead of exiting the application? Thanks.

Additional context

Part of our falco-exporter container setup:

  containers:
  - args:
    - /usr/bin/falco-exporter
    - --client-socket=unix:///run/falco/falco.sock
    - --timeout=5m
    - --listen-address=0.0.0.0:9376
    image: docker.io/falcosecurity/falco-exporter:0.8.2

Falco-exporter log:

> kubectl logs --previous falco-exporter-z64vp
2023/04/12 07:56:43 connecting to gRPC server at unix:///run/falco/falco.sock (timeout 5m0s)
2023/04/12 07:56:43 listening on http://0.0.0.0:9376/metrics
2023/04/12 07:56:46 connected to gRPC server, subscribing events stream
2023/04/12 07:56:46 ready
2023/04/27 11:23:48 gRPC stream closed
@sfairat15 added the kind/feature (New feature or request) label on Apr 27, 2023
@leogr
Member

leogr commented May 8, 2023

Hey @sfairat15

Interesting. We can let falco-exporter try to reconnect by itself (i.e., without exiting) using the already implemented connection backoff mechanism. Would that be enough?

PS
I'm not sure whether this could create side effects during normal shutdown operations (likely not, but I need to check) 🤔

@poiana

poiana commented Aug 6, 2023

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Member

leogr commented Aug 10, 2023

/remove-lifecycle stale
/help

@poiana

poiana commented Aug 10, 2023

@leogr:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/remove-lifecycle stale
/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@poiana added the help wanted (Extra attention is needed) label and removed the lifecycle/stale label on Aug 10, 2023
@poiana

poiana commented Nov 8, 2023

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Member

leogr commented Nov 8, 2023

/remove-lifecycle stale

@poiana

poiana commented Feb 6, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@leogr
Member

leogr commented Feb 8, 2024

/remove-lifecycle stale

@poiana

poiana commented May 8, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@poiana

poiana commented Jun 7, 2024

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

@leogr
Member

leogr commented Jun 11, 2024

/remove-lifecycle rotten

@poiana

poiana commented Sep 9, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@poiana

poiana commented Oct 9, 2024

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

Projects
None yet
Development

No branches or pull requests

3 participants