feat: identify and expose when connections are being closed or crashing constantly #101

viniarck · 2023-02-15T18:10:45Z

Problem:

Network operators who deploy Kytos-ng in production and use of_core need to be able to identify (and hook into external healthcheck mechanisms) when OpenFlow connections aren't stable, either because of packet/handshake problems or generalized crashes. Our Python runtime shouldn't struggle to handle connections as long as their number is reasonable; if it does, of_core should expose that this is happening (maybe through an endpoint) so that it can be used externally to spin up and switch over to a different kytosd instance. This can help with recoverable errors.
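A minimal sketch of how connection instability could be tracked before being exposed, assuming a sliding time window over connection-close events. The class name `ConnectionFlapTracker`, its thresholds, and the shape of the returned status are all hypothetical and not part of of_core:

```python
import time
from collections import deque


class ConnectionFlapTracker:
    """Counts connection-close events inside a sliding time window.

    Hypothetical helper: names and default thresholds are illustrative only.
    """

    def __init__(self, max_events=10, window_seconds=60.0, clock=time.monotonic):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the window can be tested
        self._events = deque()

    def record_close(self):
        """Register that a connection was closed or crashed just now."""
        self._events.append(self._clock())
        self._prune()

    def _prune(self):
        """Drop events that are older than the window."""
        cutoff = self._clock() - self.window_seconds
        while self._events and self._events[0] < cutoff:
            self._events.popleft()

    def status(self):
        """Return a dict that an endpoint could serialize as JSON."""
        self._prune()
        count = len(self._events)
        return {
            "closed_in_window": count,
            "window_seconds": self.window_seconds,
            "unstable": count > self.max_events,
        }
```

An external healthcheck could poll whatever endpoint returns `status()` and trigger a switchover when `unstable` stays true; what counts as a reasonable threshold would still need to be agreed on.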

Other than that, outside of the code-related implementation, network operators should also have alerts for how many errors or tracebacks have happened over time. We can have this readily available on ES with Kibana; alerts are a premium ES feature, but the data is there, so a script could also poll or query it.
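A rough sketch of such a polling script, assuming logs are indexed with an `@timestamp` field and a `message` field that contains the traceback text. The field names, the index name passed in, and the threshold are assumptions; only the Elasticsearch `_count` endpoint and the query DSL shape are standard:

```python
import json
import urllib.request


def build_traceback_count_query(minutes=5):
    """Build a query that matches traceback log lines from the last N minutes.

    Assumes `@timestamp` and `message` fields (hypothetical mapping).
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                    {"match_phrase": {"message": "Traceback"}},
                ]
            }
        }
    }


def count_tracebacks(base_url, index, minutes=5):
    """POST the query to the _count API and return the number of hits."""
    body = json.dumps(build_traceback_count_query(minutes)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index}/_count",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["count"]


def should_alert(count, threshold):
    """Alert once the traceback count reaches the threshold."""
    return count >= threshold
```

A cron job could call `count_tracebacks(...)` and notify operators when `should_alert` returns true, which covers the alerting gap without the premium feature.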

cc'ing @italovalcy for his info

This issue still needs further discussion, but overall that's the problem we need to solve.


italovalcy · 2023-02-16T17:19:31Z

I agree, @viniarck. This feature can be part of a watchdog NApp or something like it, which consolidates all validations (not only of_core) and translates them into an operational status (which could indicate success, failure, or partial failure, including failure in non-critical components, and so on).
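One way the consolidation could look, as a sketch only: each validation reports whether it passed and whether its component is critical, and the watchdog folds them into a single status. The `Status` enum, the `aggregate_status` function, and the check names are hypothetical:

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    PARTIAL_FAILURE = "partial_failure"
    FAILURE = "failure"


def aggregate_status(checks):
    """Fold individual validations into one operational status.

    `checks` is an iterable of (name, ok, critical) tuples.
    Any failed critical check yields FAILURE; failures limited to
    non-critical components yield PARTIAL_FAILURE.
    """
    failed = [(name, critical) for name, ok, critical in checks if not ok]
    if any(critical for _, critical in failed):
        return Status.FAILURE
    if failed:
        return Status.PARTIAL_FAILURE
    return Status.SUCCESS
```

For example, `aggregate_status([("of_core_connections", True, True), ("stats", False, False)])` evaluates to `Status.PARTIAL_FAILURE`, since only a non-critical check failed.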

viniarck added the enhancement (New feature or request) and future_release (Planned for the next release) labels Feb 15, 2023

viniarck changed the title ~~feat: Identify when connections are being closed or crashing constantly~~ feat: identify when connections are being closed or crashing constantly Feb 15, 2023

viniarck changed the title ~~feat: identify when connections are being closed or crashing constantly~~ feat: identify and expose when connections are being closed or crashing constantly Feb 15, 2023

viniarck removed the future_release (Planned for the next release) label Feb 28, 2024
