Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix cannot reach chhost bug #121

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sunny19930321
Copy link

@sunny19930321 sunny19930321 commented Jun 11, 2021

1、print http transport error log information
2、close chproxy KeepAlive. The CH socket is closed over time, but the CHProxy connection is reused, causing the connection to reset by peer
fix this problem:#92

@sunny19930321 sunny19930321 changed the title fix: fix cannot reach chhost bug fix cannot reach chhost bug Jun 11, 2021
proxy.go Outdated
Comment on lines 52 to 65
Transport: &http.Transport {
DisableKeepAlives: true,
Proxy: http.ProxyFromEnvironment,
DialContext: (&net.Dialer{
Timeout: 30 * time.Second,
KeepAlive: 30 * time.Second,
DualStack: true,
}).DialContext,
ForceAttemptHTTP2: true,
MaxIdleConns: 100,
IdleConnTimeout: 90 * time.Second,
TLSHandshakeTimeout: 10 * time.Second,
ExpectContinueTimeout: 1 * time.Second,
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for your PR.

First, how to properly check that there is no regression in chproxy bringing this feature ? Do. you see any way to properly test that ?

Also, I have the feeling that we won't have the annoying error message in the source application but way more connections made over time, one query generating one connection. This could be very impactful on heavy loaded systems.

Couldn't we achieve the same behavior with a custom retry implemented ?

Copy link
Author

@sunny19930321 sunny19930321 Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your advice,this test is not very easy to test, because this problem is not easy to repeat. We tested it through pressure test.We also tried to use retry, but we couldn't guarantee idempotency.Proxy requests one connection for each connection, observe that the system is not loaded high, we can also configure to close this KeepAlive

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

observe that the system is not loaded high is not a fact but subject to appreciation.

I'am not in favor of adding to CHProxy something that fits you, but could potentially impair a majority.

For the record, One of my production cluster is actually running between 50 an 100 QPS. The keep alive thing might have a disastrous effect on both microservices / chproxy and clickhouse.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,I first submit the cannot reach chhost error log PR, which provides details about the connection CH exception, and then I will see if I can choose to close KeepAlive by adding configuration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants