With many organizations transitioning to container-based workflows, managing the various versions of container images can become quite challenging. Even small organizations can accumulate hundreds of container images, ranging from one-off development tests and emergency fixes to core production images. This raises a critical question: How can we manage image sprawl while still rapidly iterating our images?
A common misconception is that using the "latest" tag ensures you’re pulling the most recent version of an image. Unfortunately, this assumption is incorrect. The "latest" tag is simply the default that Docker applies when an image is built or pushed without an explicit tag.
Read on to learn more about how to avoid this pitfall when using Docker and how to effectively manage your Docker images.
Using Tags
One way to address the issue of image sprawl is by using tags when creating an image. Tags serve as identifiers that help you and others remember what the image is intended for. A recommended approach is to always tag images with their semantic version (semver), which lets you know exactly which version you are deploying. While this sounds like a great approach, there is a caveat.
Unless your registry is configured for immutable tags, tags can be changed. For example, you could tag an image as `my-great-app:v1.0.0` and push it to the registry. However, nothing stops a colleague from pushing their updated version of the app with the same tag `v1.0.0`. Now, that tag points to their image, not yours. Adding the convenience tag "latest" complicates things even further.
Consider this Dockerfile example:
```dockerfile
FROM busybox:stable-glibc

# Create a script that outputs the version
RUN echo -e "#!/bin/sh\n" > /test.sh && \
    echo "echo \"This is version 1.0.0\"" >> /test.sh && \
    chmod +x /test.sh

# Set the entrypoint to run the script
ENTRYPOINT ["/bin/sh", "/test.sh"]
```

We build the above Dockerfile with `docker build -t tagexample:1.0.0 .` and run it.

```sh
$ docker run --rm tagexample:1.0.0
This is version 1.0.0
```

What if we run it without specifying a tag?

```sh
$ docker run --rm tagexample
Unable to find image 'tagexample:latest' locally
docker: Error response from daemon: pull access denied for tagexample, repository does not exist or may require 'docker login'.
See 'docker run --help'.
```

Now, we build it without specifying a tag and run it.

```sh
$ docker build -t tagexample .
$ docker run --rm tagexample
This is version 1.0.0
```

The "latest" tag is the default that Docker applies whenever no tag is specified, so it always points to the most recent build or push that did not specify a tag. In our first test, we had an image with a tag of `1.0.0`, but because nothing had been built or pushed without a tag, the "latest" tag did not point to an image. However, once we build or push an image without a tag, the "latest" tag is automatically applied to it.

Although it is tempting to always pull the "latest" tag, it’s rarely a good idea. The assumption that "latest" points to the most recent version of the image is flawed. For example, another developer can update the application to version `1.0.1`, build it with the tag `1.0.1`, and push it. This results in the following:

```sh
$ docker run --rm tagexample:1.0.1
This is version 1.0.1
$ docker run --rm tagexample:latest
This is version 1.0.0
```

If you assumed that "latest" pointed to the highest version, you’d now be running an outdated version of the image.

Another issue is that there is no mechanism to prevent someone from inadvertently pushing with the wrong tag. For example, we could create another update to our code, bringing it up to `1.0.2`. We update the code, build the image, and push it, but we forget to change the tag to reflect the new version. Although it’s a small oversight, this action results in the following:

```sh
$ docker run --rm tagexample:1.0.1
This is version 1.0.2
```
Unfortunately, this happens all too frequently.
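If you do need "the highest version" rather than whatever "latest" happens to point at, one option is to compare the version tags yourself. The following is a minimal Python sketch, assuming you already have the list of tags for a repository (how you obtain that list depends on your registry and is not shown). The function name `highest_version_tag` is hypothetical, and the sketch only understands plain `MAJOR.MINOR.PATCH` tags, not pre-release or build suffixes.

```python
import re

# Matches plain MAJOR.MINOR.PATCH tags, with an optional leading "v".
SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def highest_version_tag(tags):
    """Return the tag with the highest semantic version, or None.

    Tags that are not plain semver (such as "latest") are ignored.
    """
    versioned = []
    for tag in tags:
        match = SEMVER_TAG.match(tag)
        if match:
            # Compare numerically so that 1.0.10 sorts above 1.0.9.
            key = tuple(int(part) for part in match.groups())
            versioned.append((key, tag))
    if not versioned:
        return None
    return max(versioned)[1]


if __name__ == "__main__":
    # The tags from the example above: "latest" is skipped, 1.0.1 wins.
    print(highest_version_tag(["latest", "1.0.0", "1.0.1"]))
```

Note that this only tells you which tag claims the highest version. It cannot detect the mislabeled `1.0.1` image from the last example, because the tag itself is all it looks at.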
Using Labels
Since we can’t fully trust tags, how should we ensure that we can identify our images? This is where adding metadata to our images becomes essential.
The first attempt at using metadata to manage images was the `MAINTAINER` instruction, which sets the "Author" field in the generated image. However, this instruction has been deprecated in favor of the more powerful `LABEL` instruction, where the equivalent key is `org.opencontainers.image.authors`. Unlike `MAINTAINER`, the `LABEL` instruction allows you to set arbitrary key/value pairs that can be read with `docker inspect` and other tooling.
Unlike tags, labels become part of the image and, when implemented properly, can provide a much better way to determine the version of an image. To revisit our earlier example, let’s see how the use of a label would have made a difference.
To do this, we add the `LABEL` instruction to the Dockerfile, along with the key `version` and value `1.0.2`.
```dockerfile
FROM busybox:stable-glibc

LABEL version="1.0.2"

# Create a script that outputs the version
RUN echo -e "#!/bin/sh\n" > /test.sh && \
    echo "echo \"This is version 1.0.2\"" >> /test.sh && \
    chmod +x /test.sh

# Set the entrypoint to run the script
ENTRYPOINT ["/bin/sh", "/test.sh"]
```

Now, even if we make the same mistake of tagging the image as version `1.0.1`, we have a way to check which version we are using that does not involve running the container.

```sh
$ docker inspect --format='{{json .Config.Labels}}' tagexample:1.0.1
{"version":"1.0.2"}
```
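This check can also be automated, for example in a deployment script. The sketch below is one possible approach in Python, not an official tool: it takes the JSON text printed by the `docker inspect` command above and compares the label with the tag. The function name `label_matches_tag` is hypothetical, and the default `version` key simply mirrors the label used in this article's Dockerfile.

```python
import json


def label_matches_tag(labels_json, tag, key="version"):
    """Return True if the image's version label equals the tag it was pulled by.

    labels_json is the text printed by:
        docker inspect --format='{{json .Config.Labels}}' IMAGE
    """
    # An image without labels is reported as "null", which parses to None.
    labels = json.loads(labels_json) or {}
    return labels.get(key) == tag


if __name__ == "__main__":
    # The mislabeled image from the example: tagged 1.0.1, labeled 1.0.2.
    print(label_matches_tag('{"version":"1.0.2"}', "1.0.1"))
```

A pipeline could refuse to deploy whenever this returns `False`, which would have caught the forgotten tag change before it reached production.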
Best Practices
Although you can use any key/value as a `LABEL`, there are some recommendations. The OCI (Open Container Initiative) provides a set of suggested labels within the `org.opencontainers.image` namespace, as shown in the following table:
Label | Content
--- | ---
`org.opencontainers.image.created` | The date and time on which the image was built (string, RFC 3339 date-time).
`org.opencontainers.image.authors` | Contact details of the people or organization responsible for the image (freeform string).
`org.opencontainers.image.url` | URL to find more information on the image (string).
`org.opencontainers.image.documentation` | URL to get documentation on the image (string).
`org.opencontainers.image.source` | URL to the source code for building the image (string).
`org.opencontainers.image.version` | Version of the packaged software (string).
`org.opencontainers.image.revision` | Source control revision identifier for the image (string).
`org.opencontainers.image.vendor` | Name of the distributing entity, organization, or individual (string).
`org.opencontainers.image.licenses` | License(s) under which contained software is distributed (string, SPDX License List).
`org.opencontainers.image.ref.name` | Name of the reference for a target (string).
`org.opencontainers.image.title` | Human-readable title of the image (string).
`org.opencontainers.image.description` | Human-readable description of the software packaged in the image (string).
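Values such as the build time and the source revision change with every build, so they are often supplied at build time rather than hard-coded in the Dockerfile. As a rough sketch of that idea, the Python helper below turns a few of the keys from the table into `--label` arguments for `docker build`. The function name `oci_label_args` and the example values are assumptions for illustration only.

```python
from datetime import datetime, timezone


def oci_label_args(version, revision, source, created=None):
    """Build --label arguments for `docker build` from a few OCI keys."""
    if created is None:
        created = datetime.now(timezone.utc)
    labels = {
        # An aware UTC datetime formats as an RFC 3339 date-time.
        "org.opencontainers.image.created": created.isoformat(timespec="seconds"),
        "org.opencontainers.image.version": version,
        "org.opencontainers.image.revision": revision,
        "org.opencontainers.image.source": source,
    }
    args = []
    for key, value in labels.items():
        args += ["--label", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Hypothetical values; in practice these would come from your CI system.
    extra = oci_label_args("1.0.2", "abc1234", "https://example.com/myorg/tagexample")
    print(["docker", "build", *extra, "-t", "tagexample:1.0.2", "."])
```

The resulting list can be passed to `subprocess.run` as is. Building the command as a list rather than a single string avoids quoting problems if a label value contains spaces.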
Since `LABEL` accepts any key/value, it’s also possible to create custom labels. For example, labels specific to a team within a company could use the `com.myorg.myteam` namespace. Isolating these to a specific namespace ensures that they can easily be related back to the team that created the label.
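To illustrate why a dedicated namespace helps, here is a small Python sketch that picks out one team's labels from an image's full label set, such as the dictionary parsed from `docker inspect` output. The function name `labels_in_namespace` and the `com.myorg.myteam.oncall` key are hypothetical.

```python
def labels_in_namespace(labels, namespace):
    """Return the labels whose keys fall under the given dotted namespace."""
    # Append the dot so "com.myorg.myteam" does not match "com.myorg.myteamx".
    prefix = namespace.rstrip(".") + "."
    return {key: value for key, value in labels.items() if key.startswith(prefix)}


if __name__ == "__main__":
    image_labels = {
        "org.opencontainers.image.version": "1.0.2",
        "com.myorg.myteam.oncall": "team-a",
    }
    print(labels_in_namespace(image_labels, "com.myorg.myteam"))
```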
Final Thoughts
Image sprawl is a real problem for organizations. If not addressed, it can lead to confusion, rework, and potential production issues. By using tags and labels consistently, you can greatly reduce these issues and maintain a well-documented set of images that streamline workflows rather than complicate them.
Learn More
For further information on managing Docker images and best practices, visit the Docker Hub.