Update docs for custom runtimes and registry rewrites #348

Merged 2 commits on Oct 29, 2024
Changes from all commits
36 changes: 21 additions & 15 deletions docs/advanced.md
@@ -151,28 +151,34 @@ You can extend the K3s base template instead of copy-pasting the complete stock
BinaryName = "/usr/bin/custom-container-runtime"

```
## Alternative Container Runtime Support

K3s will automatically detect alternative container runtimes if they are present when K3s starts. Supported container runtimes are:
```
crun, lunatic, nvidia, nvidia-cdi, nvidia-experimental, slight, spin, wasmedge, wasmer, wasmtime, wws
```
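
For example, a workload can opt into one of these runtimes through its RuntimeClass. The following is a minimal sketch, assuming `crun` is installed and detected on the node and that a RuntimeClass named `crun` exists, as shipped by recent K3s releases (see the Version Gate below):

```yaml
# Hypothetical example: run an ordinary container under crun instead of runc.
# Assumes crun is installed on the node and a RuntimeClass named "crun" exists.
apiVersion: v1
kind: Pod
metadata:
  name: crun-test
spec:
  runtimeClassName: crun
  containers:
    - name: pause
      image: rancher/mirrored-pause:3.6
```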

NVIDIA GPUs require installation of the NVIDIA Container Runtime in order to schedule and run accelerated workloads in Pods. To use NVIDIA GPUs with K3s, perform the following steps:

1. Install the nvidia-container package repository on the node by following the instructions at:
https://nvidia.github.io/libnvidia-container/
1. Install the nvidia container runtime packages. For example:
`apt install -y nvidia-container-runtime cuda-drivers-fabricmanager-515 nvidia-headless-515-server`
1. [Install K3s](./installation), or restart it if already installed.
1. Confirm that the nvidia container runtime has been found by k3s:
`grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml`

If these steps are followed properly, K3s will automatically add NVIDIA runtimes to the containerd configuration, depending on what runtime executables are found.

:::info Version Gate
The `--default-runtime` flag and built-in RuntimeClass resources are available as of the December 2023 releases: v1.29.0+k3s1, v1.28.5+k3s1, v1.27.9+k3s1, v1.26.12+k3s1.
Prior to these releases, you must deploy your own RuntimeClass resources for any runtimes you want to reference in Pod specs.
:::

K3s includes Kubernetes RuntimeClass definitions for all supported alternative runtimes. You can select one of these to replace `runc` as the default runtime on a node by setting the `--default-runtime` value via the k3s CLI or config file.
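
For example, a node could use the NVIDIA runtime by default. This is a sketch assuming the standard K3s config file location and the usual mapping of CLI flags to config file keys:

```yaml
# /etc/rancher/k3s/config.yaml (assumed standard location)
# Equivalent to starting K3s with: --default-runtime=nvidia
default-runtime: nvidia
```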

If you have not changed the default runtime on your GPU nodes, you must explicitly request the NVIDIA runtime by setting `runtimeClassName: nvidia` in the Pod spec:
```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
---
apiVersion: v1
kind: Pod
metadata:
31 changes: 27 additions & 4 deletions docs/installation/private-registry.md
@@ -15,6 +15,7 @@ please ensure you also create the `registries.yaml` file on each server as well.

Containerd has an implicit "default endpoint" for all registries.
The default endpoint is always tried as a last resort, even if there are other endpoints listed for that registry in `registries.yaml`.
Rewrites are not applied to pulls against the default endpoint.
For example, when pulling `registry.example.com:5000/rancher/mirrored-pause:3.6`, containerd will use a default endpoint of `https://registry.example.com:5000/v2`.
* The default endpoint for `docker.io` is `https://index.docker.io/v2`.
* The default endpoint for all other registries is `https://<REGISTRY>/v2`, where `<REGISTRY>` is the registry hostname and optional port.
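
As an illustration, a minimal `registries.yaml` along the following lines (the mirror hostname is made up) would cause pulls for `registry.example.com:5000` to try the listed mirror endpoint first, with the implicit default endpoint still tried as a last resort unless fallback has been disabled:

```yaml
# Hypothetical mirror configuration.
# Pulls for registry.example.com:5000 try the listed endpoint first; the
# implicit default endpoint https://registry.example.com:5000/v2 is still
# tried last unless default endpoint fallback has been disabled.
mirrors:
  "registry.example.com:5000":
    endpoint:
      - "https://mirror.example.org:5000"
```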
@@ -89,12 +90,13 @@ Then pulling `docker.io/rancher/mirrored-pause:3.6` will transparently pull the

#### Rewrites

Each mirror can have a set of rewrites, which use regular expressions to match and transform the name of an image when it is pulled from a mirror.
This is useful if the organization/project structure in the private registry is different from that of the registry it is mirroring.
Rewrites match and transform only the image name, NOT the tag.

For example, the following configuration would transparently pull the image `docker.io/rancher/mirrored-pause:3.6` as `registry.example.com:5000/mirrorproject/rancher-images/mirrored-pause:3.6`:

```yaml
mirrors:
  docker.io:
    endpoint:
@@ -103,8 +105,29 @@ mirrors:
      "^rancher/(.*)": "mirrorproject/rancher-images/$1"
```

:::info Version Gate
Rewrites are no longer applied to the [Default Endpoint](#default-endpoint-fallback) as of the January 2024 releases: v1.26.13+k3s1, v1.27.10+k3s1, v1.28.6+k3s1, v1.29.1+k3s1.
Prior to these releases, rewrites were also applied to the default endpoint, which would prevent K3s from pulling from the upstream registry if the image could not be pulled from a mirror endpoint, and the image was not available under the modified name in the upstream.
:::

If you want to apply rewrites when pulling directly from a registry (that is, when it is not being used as a mirror for a different upstream registry), you must provide a mirror endpoint that does not match the default endpoint.
Mirror endpoints in `registries.yaml` that match the default endpoint are ignored; the default endpoint is always tried last with no rewrites, if fallback has not been disabled.

For example, if you have a registry at `https://registry.example.com/`, and want to apply rewrites when explicitly pulling `registry.example.com/rancher/mirrored-pause:3.6`, you can add a mirror endpoint with the port listed.
Because the mirror endpoint does not match the default endpoint - **`"https://registry.example.com:443/v2" != "https://registry.example.com/v2"`** - the endpoint is accepted as a mirror and rewrites are applied, despite it being effectively the same as the default.

```yaml
mirrors:
  registry.example.com:
    endpoint:
      - "https://registry.example.com:443"
    rewrites:
      "^rancher/(.*)": "mirrorproject/rancher-images/$1"
```


Note that when using mirrors and rewrites, images will still be stored under the original name.
For example, `crictl image ls` will show `docker.io/rancher/mirrored-pause:3.6` as available on the node, even if the image was pulled from a mirror with a different name.

### Configs
