Image automation pushes alot of commits (multi clusters) #4941

Open
1 task done
ehrnst opened this issue Aug 21, 2024 · 6 comments

@ehrnst

ehrnst commented Aug 21, 2024

Describe the bug

We run clusters in multiple regions, with both dev and prod clusters present in each region.
Application deployments should be identical in the region 1 dev cluster and the region 2 dev cluster, which is why ImageUpdateAutomation is set up.

When new images are built, the update commit is pushed back to the origin GitHub repo. However, the clusters appear to be playing ping-pong: every minute we get a new commit.

What could potentially cause this?

For this application there is one ImageRepository and one GitRepository in the flux-system namespace. The ImageUpdateAutomation sits in the application/environment namespace.

Steps to reproduce

Not sure.

Expected behavior

Only one commit with the latest image is pushed to Git, with no flip-flopping between the latest and the previous image.

Screenshots and recordings

No response

OS / Distro

Ubuntu

Flux version

2.1.2

Flux check

flux 2.1.2 <2.3.0 (new version is available, please upgrade)
✔ Kubernetes 1.29.4 >=1.25.0-0
► checking controllers
✔ fluxconfig-agent: deployment ready
► mcr.microsoft.com/azurek8sflux/fluxconfig-agent:1.11.1
► mcr.microsoft.com/azurek8sflux/fluent-bit-mdm:1.11.1
✔ fluxconfig-controller: deployment ready
► mcr.microsoft.com/azurek8sflux/fluxconfig-controller:1.11.1
► mcr.microsoft.com/azurek8sflux/fluent-bit-mdm:1.11.1
✔ helm-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/helm-controller:v1.0.1
✔ image-automation-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/image-automation-controller:v0.38.0
✔ image-reflector-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/image-reflector-controller:v0.32.0
✔ kustomize-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/kustomize-controller:v1.3.0
✔ notification-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/notification-controller:v1.3.0
✔ source-controller: deployment ready
► mcr.microsoft.com/oss/fluxcd/source-controller:v1.3.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta3
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ fluxconfigs.clusterconfig.azure.com/v1alpha1
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1
✔ helmreleases.helm.toolkit.fluxcd.io/v2
✔ helmrepositories.source.toolkit.fluxcd.io/v1
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta3
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

GitHub

Container Registry provider

Azure Container Registry

Additional context

I know AKS (Azure) is not on the latest Flux version in all regions.

Code of Conduct

  • I agree to follow this project's Code of Conduct
@stefanprodan
Member

You should be running the image automation for a path on a single cluster.

@ehrnst
Author

ehrnst commented Aug 21, 2024

> You should be running the image automation for a path on a single cluster.

So if I understand correctly, we cannot have one overlay for dev that is deployed to two regions?
Meaning we have to create overlays per cluster and environment? For example, we have one cluster where the application is deployed to both dev and QA in different namespaces. The structure is currently:

├── deployment
│   ├── base
│   │   ├── **/*.yaml
│   ├── overlays
│   │   ├── dev
│   │   │   ├── **/*.yaml
│   │   ├── staging
│   │   │   ├── **/*.yaml
│   │   ├── prod

The image automation inside dev in this case looks like this:

apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: appname
spec:
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: appname
    namespace: flux-system
  git:
    checkout:
      ref:
        branch: dev
    commit:
      messageTemplate: |
        Automated image update

        [skip ci]

        Automation name: {{ .AutomationObject }}

        Files:
        {{ range $filename, $_ := .Updated.Files -}}
        - {{ $filename }}
        {{ end -}}

        Objects:
        {{ range $resource, $_ := .Updated.Objects -}}
        - {{ $resource.Kind }} {{ $resource.Name }}
        {{ end -}}

        Images:
        {{ range .Updated.Images -}}
        - {{.}}
        {{ end -}} 
      author:
        email: fluxcdbot@users.noreply.github.com
        name: fluxcdbot
    push:
      branch: dev
  update:
    path: ./deployment/overlays
    strategy: Setters

@makkes
Member

makkes commented Aug 21, 2024

You can't have two ImageUpdateAutomation resources running against the same path, it just doesn't make sense because they race against each other.

@ehrnst
Author

ehrnst commented Aug 22, 2024

> You can't have two ImageUpdateAutomation resources running against the same path, it just doesn't make sense because they race against each other.

I see. Earlier I found this issue, and I figured the statement "Flux will push a single commit no matter on how many clusters it runs, the fastest cluster will push the changes, then all others will see there is nothing to commit and do nothing." was the one saving me here. But as I have experienced, and as you say, there is no single commit: each cluster commits what it thinks is the latest. And if some clusters have not yet synced their Git repository, their commits will be stale. Do I understand that correctly?

@stefanprodan
Member

> So if I understand correctly, we cannot have one overlay for dev that is deployed to two regions?

Yes you can, just scale to zero the image automation controllers on all regions except for one.
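
A related way to get the same effect, sketched here as an assumption and not something proposed in this thread, is to leave the controllers running and suspend the automation object itself on every cluster except one. This swaps scaling the controller to zero for the `spec.suspend` field of ImageUpdateAutomation, set through a patch in the Flux Kustomization of each cluster that should not push. The Kustomization name `appname-dev`, its interval, and its path are hypothetical; only the `appname` GitRepository and ImageUpdateAutomation names come from the manifest earlier in the thread. It also assumes you author the Flux Kustomization yourself, which may not hold for configurations managed by the AKS Flux extension.

```yaml
# Hypothetical Flux Kustomization for every cluster EXCEPT the one that pushes commits.
# The name, interval, and path are illustrative assumptions.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: appname-dev
  namespace: flux-system
spec:
  interval: 10m
  prune: true
  sourceRef:
    kind: GitRepository
    name: appname
  path: ./deployment/overlays/dev
  patches:
    # Suspend the shared ImageUpdateAutomation on this cluster only,
    # so the manifest in Git stays identical for all regions.
    - target:
        kind: ImageUpdateAutomation
        name: appname
      patch: |
        - op: add
          path: /spec/suspend
          value: true
```

With a patch like this, every cluster still reconciles the same dev overlay, and only the cluster without the patch commits image updates back to the branch.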

@ehrnst
Author

ehrnst commented Aug 22, 2024

> So if I understand correctly, we cannot have one overlay for dev that is deployed to two regions?

> Yes you can, just scale to zero the image automation controllers on all regions except for one.

I think that will bite us some time down the road.
There will be a massive number of overlays per app here. We have to take into account blue/green clusters as well.

So this would be correct, with all the ImageUpdateAutomations exactly identical:

├── deployment
│   ├── base
│   │   ├── **/*.yaml
│   ├── overlays
│   │   ├── dev
│   │   │   ├── region 1
│   │   │   │   ├── blue
│   │   │   │   │   ├── imageAutomation.yaml
│   │   │   │   ├── green
│   │   │   │   │   ├── imageAutomation.yaml
│   │   │   ├── region 2
│   │   ├── staging
│   │   ├── prod
