
Set worker_processes to a static number #140

Open
wants to merge 1 commit into master

Conversation

dkeightley

On clusters with large nodes, the auto value for worker_processes can scale too high; combined with the thread count per worker (32), this can consume a large number of file handles.

4 was determined to be a reasonable default for this use case.

Related: rancher/rancher#27693
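
For context, the change amounts to pinning the worker_processes directive in the nginx.conf used by the nginx-proxy container instead of leaving it on auto. Shown here as an illustrative diff; the surrounding rke-tools template is not reproduced in this thread:

# /etc/nginx/nginx.conf inside the nginx-proxy container
-worker_processes auto;
+worker_processes 4;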

@cbron cbron requested review from superseb and kinarashah and removed request for cbron November 29, 2021 15:04
@superseb (Contributor) left a comment

The linked issue describes "allow user to configure", but this PR sets a new default for everyone. Why was it decided to change the default rather than make it configurable for the exception (which apparently is a node with a lot of (v)CPUs)? Why was 4 determined to be a reasonable default? And why won't this affect upgraded clusters that move from auto to 4?

@dkeightley (Author)

Hi @superseb, the issue is related (this isn't a direct fix), but these are the reasons I could see to set a default:

  • It's a proxy deployed by RKE that users are largely unaware of
  • It shouldn't need to be tuned like ingress-nginx, for example
  • It has a relatively consistent workload profile
  • It is a starting point and can be made configurable in the future

nginx is capable of handling many requests with a small number of workers when acting as a reverse proxy; for the purpose of proxying kubelet -> kube-apiserver traffic from a single node, 4 was determined to be adequate. Totally open to input here; the intention is not to choose a particular value, but to avoid nginx-proxy inadvertently consuming large amounts of PIDs, file handles, etc. without a way to avoid it.
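
As a quick way to see the footprint in question on a worker node, the checks below mirror the ones used later in this thread; the file-descriptor count is an extra, rough check that assumes a busybox-style shell in the container and an exec user allowed to read /proc:

# configured value and number of spawned workers
docker exec nginx-proxy grep worker_processes /etc/nginx/nginx.conf
docker exec nginx-proxy ps aux | grep -c "nginx: worker process"
# rough count of file descriptors held by all nginx processes
docker exec nginx-proxy sh -c 'for p in $(pidof nginx); do ls /proc/$p/fd; done | wc -l'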

@dkeightley dkeightley requested a review from superseb July 19, 2022 00:21
@superseb (Contributor)

@dkeightley Given the issue (which has not seen any activity so far), it is about limiting nginx-proxy worker_processes so it doesn't configure 100 processes when 100 (v)CPUs/cores are found. If we set it to 4, what are the performance implications on 2 and 4 core machines?

@superseb (Contributor) left a comment

See ^^. And rebase and push so the build can complete.

@dkeightley (Author)

Thanks @superseb, the performance impact should be minimal; on small machines this is only a modest increase in worker processes, compared to the large reduction on nodes with high core counts.

If you think it's a better fit for nginx-proxy, worker_processes could also be set lower (1-2): there are typically only 1-3 control plane nodes, and the connectivity is used primarily by the kubelet, so the use case is relatively static and could be handled by a single worker.

@superseb (Contributor)

I guess you are right, but it's very simple to test. Has this change been tested, or is it based on assumption/guessing?

The only downside to setting it statically is that we are changing it for all installs at once with no way to change it back to the old behavior.

Please rebase instead of adding a merge commit (or, as it is now, squash the commits).

@dkeightley (Author)

Testing was performed using derekdemo/rke-tools:worker_processes, an image built from commit 8816bdd.

...
Successfully tagged rancher/rke-tools:8816bdd

Tested on a node with 2 CPUs:

# cat /proc/cpuinfo | grep processor | wc -l
2

Worker node with default rke-tools image:

# docker ps | grep nginx-proxy
ca6a69f06d84   rancher/rke-tools:v0.1.80             "nginx-proxy CP_HOST…"   30 minutes ago   Up 30 minutes             nginx-proxy
# docker exec nginx-proxy ps aux | grep worker
   12 nginx     0:00 nginx: worker process
   13 nginx     0:00 nginx: worker process
# docker exec nginx-proxy grep worker_processes /etc/nginx/nginx.conf
worker_processes auto;

Worker node with updated image:

# docker ps | grep nginx-proxy
9d8ebc974a85   derekdemo/rke-tools:worker_processes   "nginx-proxy CP_HOST…"   20 seconds ago   Up 18 seconds             nginx-proxy
# docker exec nginx-proxy ps aux | grep worker
   13 nginx     0:00 nginx: worker process
   14 nginx     0:00 nginx: worker process
   15 nginx     0:00 nginx: worker process
   16 nginx     0:00 nginx: worker process
# docker exec nginx-proxy grep worker_processes /etc/nginx/nginx.conf
worker_processes 4;

No issues were observed in the kubelet logs with the new container image, and the node was active/Ready from the Kubernetes perspective.

@superseb (Contributor) left a comment

This will at least need some basic load testing, but I'm going to assume that the downside (having too many workers) is worse than the upside.

@superseb superseb requested review from a team and removed request for kinarashah August 30, 2022 09:59
@superseb (Contributor) left a comment

I'm fine with the change as it seems to resolve an issue for support. I don't think the testing was sufficient, but if QA is going to test it in bigger clusters/with more load, it should be okay.

@a-blender

Good to merge with:

  • Test template added to the issue with what's already been tested
  • @sowmyav27 QA should do basic load balancer testing and tests with a bigger cluster/larger workload to see the performance implications of worker_processes set to 4.

@dkeightley (Author)

Thanks @annablender, the testing so far has been with a drop-in replacement of the proposed rke-tools container to confirm:

A load test would be worthwhile; however, worker_processes is not expected to have ever been a performance bottleneck, given the small use case and the fact that typical node sizes already create around 4 workers (auto detects the CPU count and sets 1 worker per CPU).

Just for clarity, in this scenario rke-tools is used only to run the nginx-proxy container. This provides a reverse proxy (nginx) on each worker node [1], listening on 127.0.0.1:6443 and load balancing requests back to the control plane nodes (proxy_pass).

              worker node              ||       control plane node
kubelet -> nginx-proxy (proxy_pass)   ----->    kube-apiserver

The change only affects the kubelet's connectivity to the kube-apiserver.

Setting worker_processes to 4 exceeds the typical number of control plane nodes (2-3). Even with a spare worker, each worker can process many requests; in this case, 1024 simultaneous connections per worker (up to ~4096 in total):

# docker exec -it nginx-proxy grep worker /etc/nginx/nginx.conf
worker_processes 4;
  worker_connections 1024;

[1] Nodes that have both the controlplane and worker roles, or all roles, don't run an nginx-proxy container; the kubelet instead connects on 127.0.0.1:6443 directly to the kube-apiserver container (which binds to 6443/TCP) on the node it resides on.
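
For readers unfamiliar with the setup, a minimal sketch of what such a proxy configuration could look like, assuming a TCP stream proxy as the proxy_pass and 127.0.0.1:6443 details above suggest; the upstream addresses are placeholders, not the actual rke-tools template:

worker_processes 4;
events {
  worker_connections 1024;
}
stream {
  upstream kube_apiserver {
    server 172.16.0.10:6443;  # control plane node 1 (placeholder address)
    server 172.16.0.11:6443;  # control plane node 2 (placeholder address)
  }
  server {
    # local endpoint the kubelet connects to on each worker node
    listen 127.0.0.1:6443;
    proxy_pass kube_apiserver;
  }
}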

@dkeightley (Author)

Anything else we need to do to move forward with QA?
