RKE2 cluster high disk IO and causing disk thrashing #4747
Replies: 3 comments 3 replies
-
That is not what's happening. What you're seeing is a normal level of constant writing by etcd. Even without a workload deployed, core Kubernetes components are constantly health-checking and renewing leases, and each of those writes must be synced to disk by etcd. It is normally recommended that you deploy etcd on an SSD, and if you are running multiple VMs on a single host, avoid using the same backing disk for all the nodes. If you are trying to reduce overhead to the point where just running etcd is too much to bear, you might look at using a single-server k3s cluster with SQLite.
-
I was also surprised by what I see on an almost empty cluster: Harvester plus a small RKE2 cluster writes about 3.3 MB/s. I'm new to K8s, and it was a shock that just running Kubernetes can wear out an SSD in 2-3 years; once other apps add real workload, the lifetime will be even shorter.
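A quick back-of-the-envelope check on that 2-3 year figure (the 300 TBW endurance rating below is an assumption for a typical consumer SSD, not a number from this thread; check your drive's datasheet):

```python
# Rough SSD-lifetime estimate from a sustained etcd write rate.
# The 300 TBW endurance rating is an assumed figure for a typical
# consumer SSD, used here only to sanity-check the 2-3 year claim.

write_rate_mb_s = 3.3                # observed sustained write rate
seconds_per_year = 365 * 24 * 3600   # 31,536,000 s

tb_written_per_year = write_rate_mb_s * seconds_per_year / 1e6  # MB -> TB
assumed_endurance_tbw = 300          # hypothetical drive rating

years_to_wear_out = assumed_endurance_tbw / tb_written_per_year

print(f"~{tb_written_per_year:.0f} TB/year -> ~{years_to_wear_out:.1f} years")
# ≈ 104 TB/year, so roughly 2.9 years -- consistent with the 2-3 year estimate
```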
-
What is your snapshot count? 10,000?
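For context, etcd writes an on-disk snapshot every `snapshot-count` committed transactions, and RKE2 lets you pass flags through to etcd via `etcd-arg` in its config file. A minimal sketch, assuming you want to inspect or tune this (the value shown is illustrative, not a recommendation):

```yaml
# /etc/rancher/rke2/config.yaml
# Pass a flag through to etcd via RKE2's etcd-arg option.
# snapshot-count sets how many applied transactions etcd batches
# before writing a snapshot to disk; the value below is illustrative.
etcd-arg:
  - "snapshot-count=100000"
```

Note that the steady fsync traffic from etcd's write-ahead log happens on every write regardless of snapshot settings, so tuning this mainly smooths out periodic snapshot bursts rather than eliminating the baseline IO.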
-
Hello everyone,
Maybe someone has had this issue. RKE2 is constantly saving the state of the cluster under /var/lib/rancher/rke2/server/db/etcd/config, and this is causing disk thrashing on my underlying disks. Is there any way to reduce how often RKE2 writes to disk, or to lower the logging level of the whole cluster? For now I only have the cluster running and nothing more, yet RKE2 keeps writing to the disks at varying intervals. I would appreciate it if someone could help me with tuning this. I have attached pictures of disk IO activity from the virtual machine where my RKE2 cluster is running and also from the underlying TrueNAS server. As for disks, the underlying drives are HDDs.
Thanks in advance