hop client connections on unreliable networks #140
Comments
We are discussing this today/tomorrow (apologies for late reply!)
This certainly seems desirable, yeah. I think that you might be able to achieve what you want today with something like this:

    import time
    import random
    import adc.errors
    from hop import Stream

    def consume_stream_with_reconnections(stream_addr, handler, max_errors=5):
        error_count = 0
        while True:
            try:
                stream = Stream()
                with stream.open(stream_addr, "r") as s:
                    for message in s:
                        handler(message)
                        error_count = 0
                return
            except adc.errors.KafkaException as e:
                error_count += 1
                if error_count >= max_errors:
                    raise(e)
                # sleep with exponential backoff and a bit of jitter.
                time.sleep((1.5**error_count) * (1 + random.random())/2)

You'd need to make some adjustments for auth, of course, but I think the basic idea is there.

This is kind of tricky stuff, so it would be nice to find a way to encapsulate it. Can you help me brainstorm on what a good API for this might be, @mlinvill? It might be appealing to say that this should all happen magically inside

One downside of the code I have written here is that it will do a full reconnection, including setting up topic subscriptions, on every error. That may be overkill for some really little transient errors which don't require tearing down and reconnecting. In fact, all those reconnections may impose extra load on the Kafka brokers. I'm not sure how to quantify that problem, and my gut is that it's fine to ignore it for now.
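To get a feel for how long the snippet above would keep retrying, the backoff expression can be pulled out and evaluated on its own. This is only a sketch: `backoff_delay` and `delay_bounds` are hypothetical helper names, not part of hop or adc, and the formula is copied from the `time.sleep(...)` call above.

```python
import random


def backoff_delay(error_count, base=1.5, rng=random.random):
    """Delay in seconds before the retry that follows error number `error_count`."""
    # Same expression as the time.sleep(...) call in the snippet above:
    # exponential growth in the error count, scaled by a jitter factor
    # drawn from [0.5, 1.0).
    return (base ** error_count) * (1 + rng()) / 2


def delay_bounds(error_count, base=1.5):
    """Smallest possible delay and exclusive upper bound for a given error count."""
    ceiling = base ** error_count
    return ceiling / 2, ceiling


if __name__ == "__main__":
    # With max_errors=5 the snippet sleeps after errors 1 through 4
    # and re-raises on the fifth consecutive error.
    for n in range(1, 5):
        low, high = delay_bounds(n)
        print(f"after error {n}: sleep between {low:.2f}s and {high:.2f}s")
```

With the default `max_errors=5`, that is four sleeps summing to somewhere between roughly 6 and 12 seconds before the exception is re-raised, assuming no message is handled successfully in between (a handled message resets the counter).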
If you install

    except adc.errors.KafkaException as e:
        if not e.retriable:
            raise(e)
        [... etc ...]
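Putting the `retriable` check together with the retry counter might look roughly like the following. To keep the sketch runnable without a broker, `FakeKafkaError` is a hypothetical stand-in for `adc.errors.KafkaException`, and `call_with_retries` is an invented helper name; only the `retriable` attribute is taken from the fragment above.

```python
import time


class FakeKafkaError(Exception):
    """Hypothetical stand-in for adc.errors.KafkaException (assumption)."""

    def __init__(self, message, retriable):
        super().__init__(message)
        self.retriable = retriable


def call_with_retries(operation, max_errors=5, sleep=time.sleep,
                      error_type=FakeKafkaError):
    """Call `operation` until it succeeds, retrying only retriable errors."""
    error_count = 0
    while True:
        try:
            return operation()
        except error_type as e:
            # Errors that are not marked retriable are re-raised at once,
            # since reconnecting would not be expected to help.
            if not getattr(e, "retriable", False):
                raise
            error_count += 1
            if error_count >= max_errors:
                raise
            # Plain exponential backoff; jitter is omitted for brevity.
            sleep(1.5 ** error_count)
```

In real use, `error_type` would be swapped for the actual exception class and `operation` for whatever opens and consumes the stream.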
I'm happy to help with this in any way I am able. It was probably obvious from the way I submitted the issue that I have little knowledge of the hop-client code. I had envisioned perhaps an argument to the Stream instance (or open) to trigger this behavior of trying to reconnect (on network errors) for some sane length of time. Doing this in the iterator from the client perspective is pretty dreamy. I hadn't appreciated the Kafka (re)subscribing overhead, however. Perhaps it's most sane to implement this feature in the ADC Consumer and Producer objects? I didn't write the hop-SNalert-app client, so I'll spend some more time in the code and perhaps implement your suggestion as a test case to inform this discussion.
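One way the "keep reconnecting for some sane length of time, inside the iterator" idea could look from the client side is a generator that wraps whatever opens the stream. This is a sketch under assumptions, not hop's API: `iterate_with_reconnect`, `open_stream`, `max_downtime`, and `pause` are all hypothetical names, and `ConnectionError` stands in for whichever exception the real client raises on network failure.

```python
import time


def iterate_with_reconnect(open_stream, retry_on=(ConnectionError,),
                           max_downtime=60.0, pause=1.0,
                           clock=time.monotonic, sleep=time.sleep):
    """Yield messages, reopening the stream on errors for a bounded time.

    `open_stream` is a zero-argument callable returning a context manager
    that yields an iterable of messages (hypothetical interface).
    """
    down_since = None  # time of the first error in the current outage
    while True:
        try:
            with open_stream() as s:
                for message in s:
                    down_since = None  # a message arrived, outage is over
                    yield message
            return  # the stream ended normally
        except retry_on:
            now = clock()
            if down_since is None:
                down_since = now
            # Give up once the outage has lasted longer than the budget.
            if now - down_since >= max_downtime:
                raise
            sleep(pause)
```

A caller would then write `for message in iterate_with_reconnect(lambda: stream.open(stream_addr, "r")): ...`. Like the earlier snippet, this reopens the stream on every error, so it does nothing about the resubscription overhead, and what happens to messages in flight during an outage depends on offset handling that the sketch does not address.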
After some investigation, @spenczar's
I think it's quite possible to encapsulate this feature into
Sorry to ping such an old issue! @cnweaver has there been any progress on implementing this feature into I'm looking to implement this in our production code and thought I should check in before doing so.
Description
I have noticed that if the network goes away while the hop client is connected to the server, no attempt is made to reconnect. Being able to reconnect upon disconnect/timeout might be a very nice configurable feature for production services which use the hop client. Even throwing a specific named exception might be good enough for client code to respond appropriately and attempt to reconnect.
Example
Here is a representative debug/stack trace from a very old client version: