Extending images with Marinara results in inefficient images #8

Open
lbussell opened this issue Jul 15, 2024 · 0 comments

Hello, I've been experimenting with using Marinara for creating Distroless .NET Containers on Azure Linux 3.0.

While using Marinara to create images works very well and results in a perfectly efficient image, using Marinara to extend an existing image does not.

I noticed a couple of behaviors:

  1. When using the recommended pattern from dockerfile-extend-image, the final layer is squashed. While this results in a perfectly efficient image, all layer history and environment variables are lost. This has several downsides, from increased build time to the inability to share the layer with another image.
  2. Without squashing the final layer, using Marinara to extend an existing image results in a considerable amount of wasted space in the image. Many files are partially or completely overwritten/duplicated in the overlay FS, which results in a very inefficient image.
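
To make the second point concrete, here is a minimal sketch of how re-adding unchanged files in a later layer inflates an image. It is a simplified model, not dive's exact calculation: layers are plain dictionaries, and the file sizes are made-up illustrative values rather than measurements from the images in this issue.

```python
from typing import Dict, List

# A layer is modeled as a mapping of file path -> size in bytes (hypothetical model).
Layer = Dict[str, int]


def wasted_bytes(layers: List[Layer]) -> int:
    """Sum the bytes of every file copy that a later layer shadows."""
    last_seen: Dict[str, int] = {}
    wasted = 0
    for layer in layers:
        for path, size in layer.items():
            if path in last_seen:
                # The older copy still ships in the image but is never visible.
                wasted += last_seen[path]
            last_seen[path] = size
    return wasted


def efficiency(layers: List[Layer]) -> float:
    """Fraction of stored bytes that remain visible in the final filesystem."""
    total = sum(sum(layer.values()) for layer in layers)
    return 1.0 if total == 0 else (total - wasted_bytes(layers)) / total


# Illustrative sizes only; not taken from the real images.
base_layer = {"/usr/lib/libc.so.6": 2_350_000, "/usr/lib/libz.so.1.3.1": 102_500}
# The extending layer re-adds both base files and contributes one small new file.
extend_layer = {**base_layer, "/usr/share/zoneinfo/UTC": 118}

print(wasted_bytes([base_layer, extend_layer]))
print(round(efficiency([base_layer, extend_layer]), 3))
```

In this toy case roughly half of the stored bytes are shadowed copies, which is the same shape of problem as the report further down.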

Here's an example where I created a .NET runtime-deps image with Marinara, then tried to extend it by installing the ca-certificates, icu, and tzdata packages.

Base image:

FROM azurelinuxpreview.azurecr.io/public/azurelinux/marinara:3.0 AS builder

RUN marinaracreate.py \
    --image-type "minimal-nonroot" \
    --azure-linux-version "3.0" \
    --location "/staging" \
    --add-packages "prebuilt-ca-certificates glibc libgcc libstdc++ openssl-libs zlib" \
    --packages-to-holdback "" \
    --user "app" \
    --user-uid "1654" \
    --user-gid "1654"

# .NET runtime-deps image
FROM scratch

ENV \
    # UID of the non-root user 'app'
    APP_UID=1654 \
    # Configure web servers to bind to port 8080 when present
    ASPNETCORE_HTTP_PORTS=8080 \
    # Enable detection of running in a container
    DOTNET_RUNNING_IN_CONTAINER=true \
    # Set the invariant mode since ICU package isn't included (see https://github.com/dotnet/announcements/issues/20)
    DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true

COPY --from=builder /staging/ /

# Workaround for https://github.com/moby/moby/issues/38710
COPY --from=builder --chown=1654:1654 /staging/home/ /home/

USER app

Extended image:

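# Tag of the base image built above; pass it with --build-arg MY_MARINARA_BASE_IMAGE=...
ARG MY_MARINARA_BASE_IMAGE
FROM $MY_MARINARA_BASE_IMAGE AS base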

FROM azurelinuxpreview.azurecr.io/public/azurelinux/marinara:3.0 AS builder

COPY --from=base /var/lib/rpmmanifest/ /tmp/rpmmanifest/

RUN marinaraextend.py \
    --azure-linux-version "3.0" \
    --location "/staging" \
    --add-packages "ca-certificates icu tzdata" \
    --packages-to-holdback "" \
    --existing-manifest-location "/tmp/rpmmanifest" \
    --new-manifest-location "/var/lib/rpmmanifest" \
    --user "app" \
    --user-uid "1654" \
    --user-gid "1654"

FROM base AS final

COPY --from=builder /staging/ /

# Workaround for https://github.com/moby/moby/issues/38710
COPY --from=builder --chown=1654:1654 /staging/home/ /home/

# Optional additional layer squash - gets rid of ENVs?
# FROM scratch
# COPY --from=final / /
# COPY --from=final --chown=1654:1654 /home/ /home/

USER $APP_UID
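
As a rough diagnostic for the extended image above, the following sketch lists files under a staging directory whose content is byte-for-byte identical to the same path in the base image's root filesystem. It assumes both trees have been exported to local directories (for example from the `base` and `builder` stages); the function names and directory layout are hypothetical, and this is not part of Marinara. My understanding is that `COPY` writes every copied file into the new layer even when an identical file already exists below it, so each path reported here would be stored twice.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large libraries are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged_files(base_root: Path, staging_root: Path) -> List[Path]:
    """Return relative paths whose content is identical in both trees."""
    duplicates: List[Path] = []
    for staged in sorted(staging_root.rglob("*")):
        # Only regular files carry meaningful size; skip directories and symlinks.
        if staged.is_symlink() or not staged.is_file():
            continue
        relative = staged.relative_to(staging_root)
        original = base_root / relative
        if original.is_symlink() or not original.is_file():
            continue
        same_size = staged.stat().st_size == original.stat().st_size
        if same_size and sha256_of(staged) == sha256_of(original):
            duplicates.append(relative)
    return duplicates
```

Calling `unchanged_files(Path("base-rootfs"), Path("staging"))` on exported copies of the two trees would give a list to compare against the dive report. Content equality is only part of the picture: a file whose ownership or mode differs is also stored again, and this sketch does not check metadata.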

Running the diagnostic tool dive on the second image shows a large amount of wasted space:

PS> .\dive.exe --ci marinaraextend
  Using default CI config
Image Source: docker://marinaraextend
Fetching image... (this can take a while for large images)
Analyzing image...
  efficiency: 77.7069 %
  wastedBytes: 47767848 bytes (48 MB)
  userWastedPercent: 56.5502 %
Inefficient Files:
Count  Wasted Space  File Path
    2         10 MB  /usr/lib/libcrypto.so.3.3.0
    2        5.4 MB  /usr/lib/libstdc++.so.6.0.32
    2        5.2 MB  /usr/lib/locale/en_US.utf8/LC_COLLATE
    2        4.7 MB  /usr/lib/libc.so.6
    2        2.7 MB  /usr/sbin/ldconfig
    2        2.1 MB  /usr/lib/libssl.so.3.3.0
    2        2.0 MB  /usr/lib/libmvec.so.1
    2        1.9 MB  /usr/lib/libm.so.6
    2        893 kB  /usr/share/i18n/charmaps/UTF-8.gz
    2        843 kB  /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt
    2        803 kB  /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
    2        715 kB  /usr/lib/locale/C.utf8/LC_CTYPE
    2        627 kB  /etc/pki/ca-trust/extracted/pem/email-ca-bundle.pem
    2        619 kB  /usr/bin/localedef
    2        586 kB  /etc/pki/ca-trust/extracted/java/cacerts
    2        584 kB  /etc/pki/ca-trust/extracted/edk2/cacerts.bin
    2        432 kB  /usr/lib/ld-linux-x86-64.so.2
    2        402 kB  /etc/pki/tls/cert.pem
    2        299 kB  /usr/lib/libgcc_s.so.1
    2        293 kB  /etc/pki/java/cacerts
    2        274 kB  /usr/lib/ossl-modules/legacy.so
    2        210 kB  /usr/share/gcc-13.2.0/python/libstdcxx/v6/printers.py
    2        205 kB  /usr/lib/libz.so.1.3.1
    2        200 kB  /usr/lib/libnsl.so.1
    2        132 kB  /usr/bin/iconv
    2        124 kB  /usr/lib/libresolv.so.2
    2        122 kB  /usr/sbin/zic
    2        112 kB  /usr/lib/libc_malloc_debug.so.0
    2        107 kB  /usr/lib/engines-3/loader_attic.so
<snip>

I omitted all overwritten files under 100 kB from the list above. It seems that most or all of the packages from the base image are being copied on top of it a second time in the new layer: dive reports about 48 MB of wasted bytes and a userWastedPercent of 56.6%.
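
For anyone who wants to work with the full, untruncated list, here is a small sketch that parses the "Inefficient Files" rows of a captured dive report. It assumes the three-column layout shown above and decimal units (the summary prints 47767848 bytes as 48 MB); the class and function names are my own, not part of dive.

```python
import re
from typing import List, NamedTuple


class InefficientFile(NamedTuple):
    count: int
    wasted_bytes: int
    path: str


# Decimal multipliers, assumed from "47767848 bytes (48 MB)" in the summary.
_UNITS = {"B": 1, "kB": 1_000, "MB": 1_000_000, "GB": 1_000_000_000}
# Matches rows such as "    2        5.4 MB  /usr/lib/libstdc++.so.6.0.32".
_ROW = re.compile(r"^\s*(\d+)\s+([\d.]+)\s*(B|kB|MB|GB)\s+(/\S.*)$")


def parse_inefficient_files(report: str) -> List[InefficientFile]:
    """Extract (count, wasted bytes, path) from each table row in the report."""
    rows: List[InefficientFile] = []
    for line in report.splitlines():
        match = _ROW.match(line)
        if match is None:
            continue  # header, summary, and "<snip>" lines are ignored
        count, value, unit, path = match.groups()
        size = round(float(value) * _UNITS[unit])
        rows.append(InefficientFile(int(count), size, path.strip()))
    return rows


sample = """\
Count  Wasted Space  File Path
    2         10 MB  /usr/lib/libcrypto.so.3.3.0
    2        5.4 MB  /usr/lib/libstdc++.so.6.0.32
    2        893 kB  /usr/share/i18n/charmaps/UTF-8.gz
<snip>
"""
files = parse_inefficient_files(sample)
print(sum(entry.wasted_bytes for entry in files))
```

Because dive rounds the displayed sizes, totals computed this way are approximate and will not match `wastedBytes` exactly.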

I would consider these downsides to be a deal-breaker when it comes to recommending Marinara to users who wish to add packages to existing Azure Linux distroless images.
 
Marinara should have a way to extend images by adding packages on an additional layer, resulting in an image that is reasonably size-efficient (95-99%), without resorting to squashing layers or using other workarounds to reduce the number of duplicated files.
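
To make the 95-99% target checkable, here is a minimal sketch that reads the summary lines of a captured dive report and compares the efficiency against a threshold. The metric names come from the output above; the function names and the 95% default are assumptions for illustration, and dive's own CI configuration may be a better fit where it is available.

```python
import re
from typing import Dict

# Matches summary lines such as "  efficiency: 77.7069 %".
_METRIC = re.compile(r"^\s*(efficiency|wastedBytes|userWastedPercent):\s*([\d.]+)")


def parse_summary(report: str) -> Dict[str, float]:
    """Collect the numeric summary metrics from a dive --ci report."""
    metrics: Dict[str, float] = {}
    for line in report.splitlines():
        match = _METRIC.match(line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


def meets_target(report: str, min_efficiency_percent: float = 95.0) -> bool:
    """Return True when the reported efficiency reaches the requested minimum."""
    metrics = parse_summary(report)
    if "efficiency" not in metrics:
        raise ValueError("no efficiency line found in report")
    return metrics["efficiency"] >= min_efficiency_percent


report = """\
  efficiency: 77.7069 %
  wastedBytes: 47767848 bytes (48 MB)
  userWastedPercent: 56.5502 %
"""
print(parse_summary(report))
print(meets_target(report))
```

With the numbers from this issue the check fails, since 77.7% is well below the 95% floor.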
