In the `.gitlab-ci.yml` you can set the `image` to any public Docker image available on https://hub.docker.com, and the CI jobs will run on it:
```yaml
build:
  image: node:lts # https://hub.docker.com/_/node
  script:
    - npm run build
```
However, the base image won't provide all the third-party libraries required to build the project, so you'll need to download them in the CI job:
```yaml
build:
  image: node:lts
  script:
    - npm ci # install the dependencies pinned in package-lock.json
    - npm run build
```
The catch here is that you need to download the dependencies in every single job, not only in the first one, because the Docker runner creates a fresh container for every job; for a medium- to large-scale project that adds about a minute per job.
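To make the cost concrete, here's a sketch of a typical pipeline (the job names and scripts are illustrative) where every job pays that price:

```yaml
# Illustrative pipeline: each job runs in a fresh container, so each one
# has to re-download the dependencies before doing its real work.
lint:
  image: node:lts
  script:
    - npm ci # ~1 minute
    - npm run lint

test:
  image: node:lts
  script:
    - npm ci # ~1 minute, again
    - npm test

build:
  image: node:lts
  script:
    - npm ci # ~1 minute, yet again
    - npm run build
```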
One common practice is to cache the dependencies with the `cache` keyword. But `cache` requires the CI runners to share an external storage service, such as AWS S3 or Google Cloud Storage (alternatively, you can set up a MinIO server as an S3 replacement); otherwise a job landing on a different runner instance won't be able to load the cached dependencies. Even then, uploading and downloading the cached files is not that fast (30 s to 1 min), not to mention that network timeouts can happen when the dependencies get too large. So `cache` doesn't really save much time.
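For reference, the `cache`-based approach looks something like this, keying the cache on the lock file so it's invalidated whenever the dependencies change:

```yaml
build:
  image: node:lts
  # node_modules is uploaded to / downloaded from the runners' shared storage;
  # the cache key changes whenever package-lock.json does.
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/
  script:
    - npm ci
    - npm run build
```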
Thankfully, the Container Registry feature has been available for self-hosted GitLab instances since 12.10, and it's a great way to cache the dependencies.
The idea is to build your own Docker image with the dependencies pre-installed and push it to the Container Registry. All the subsequent CI jobs then run on that image, which is much faster than downloading the dependencies every time; and the image itself is usually cached on the Docker runner, making it even cheaper to reuse.
A further improvement is to tag the dependency image with a checksum of the `package*.json` files, so that not only the subsequent jobs in the current pipeline get faster, but the jobs in other pipelines get the speed boost as well, as long as the dependencies are the same (namely, the checksum of the `package*.json` files stays unchanged).
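Concretely, the resulting image reference might look like this (the registry hostname and checksum here are made up for illustration):

```
registry.example.com/org-name/project-name/dependency:3f1a9c2b
```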
So the `npm ci` line in the earlier example can be replaced with `image: $IMAGE_DEPENDENCY`:
```yaml
build:
  image: $IMAGE_DEPENDENCY
  script:
    - npm run build
```
## Step 1. Dockerfile

Create a `Dockerfile` in your project root directory:
```dockerfile
FROM node:lts

WORKDIR /usr/src/app
COPY package*.json ./

# Dependencies are installed in /usr/src/app/node_modules
RUN npm ci --no-optional
RUN rm package*.json
```
Be aware that the GitLab runner erases the `CI_PROJECT_DIR` directory (`/builds/org-name/project-name` by default) before running a CI job, so you don't want to install your dependencies in that directory.
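If you want to sanity-check the image locally before wiring it into CI (an optional step; `deps-test` is just a throwaway tag), you can build it and list the installed modules:

```sh
docker build -t deps-test .
# The dependencies live outside the build directory, under /usr/src/app.
docker run --rm deps-test ls /usr/src/app/node_modules
```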
## Step 2. Dependency installation job

Define the job in `.gitlab-ci.yml` that builds the Docker image and exports the `IMAGE_DEPENDENCY` variable:
```yaml
install:
  image: docker:latest
  before_script:
    # https://docs.gitlab.com/ee/user/packages/container_registry/#build-and-push-by-using-gitlab-cicd
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    # The tag is based on the combined hash of Dockerfile, package.json and
    # package-lock.json.
    - CHECKSUM=$(sha256sum Dockerfile package.json package-lock.json | sha256sum | head -c 8)
    # https://docs.gitlab.com/ee/ci/variables/predefined_variables.html
    # CI_REGISTRY_IMAGE: predefined variable, the address of the project's
    # container registry.
    - IMAGE_DEPENDENCY=$CI_REGISTRY_IMAGE/dependency:$CHECKSUM
    - docker build --pull --tag $IMAGE_DEPENDENCY .
    - docker push $IMAGE_DEPENDENCY
    - echo "IMAGE_DEPENDENCY=$IMAGE_DEPENDENCY" > deploy.env
  artifacts:
    # https://docs.gitlab.com/ee/ci/yaml/#artifactsreportsdotenv
    # Mark deploy.env as a dotenv report to expose IMAGE_DEPENDENCY as an
    # environment variable to the subsequent jobs.
    reports:
      dotenv: deploy.env
```
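One optional refinement, not shown in the job above: check whether an image with this checksum already exists before building, so pipelines with unchanged dependencies skip the build and push entirely. A minimal sketch of the `script` section, assuming the runner can pull from the registry:

```yaml
# Drop-in replacement for the install job's script section above.
script:
  - CHECKSUM=$(sha256sum Dockerfile package.json package-lock.json | sha256sum | head -c 8)
  - IMAGE_DEPENDENCY=$CI_REGISTRY_IMAGE/dependency:$CHECKSUM
  # Reuse the already-pushed image when the checksum tag exists;
  # `docker pull` exits non-zero when the tag is missing.
  - |
    if docker pull $IMAGE_DEPENDENCY; then
      echo "$IMAGE_DEPENDENCY already exists, skipping build"
    else
      docker build --pull --tag $IMAGE_DEPENDENCY .
      docker push $IMAGE_DEPENDENCY
    fi
  - echo "IMAGE_DEPENDENCY=$IMAGE_DEPENDENCY" > deploy.env
```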
## Step 3. Subsequent jobs

Finally, define a global `image` value and a `before_script` that links the `node_modules` for all the subsequent jobs. Leave these two blocks at the top level of `.gitlab-ci.yml` (without indentation; they don't have to be at the beginning of the file) so they apply to all the jobs:
```yaml
image: $IMAGE_DEPENDENCY

before_script:
  # Node.js projects need the dependencies to be installed locally, so we just
  # soft-link the node_modules from the image into the project directory.
  # If your project doesn't need the dependencies installed locally, you can
  # skip this step; just make sure the dependencies are installed in the proper
  # location in the Dockerfile.
  - ln -s /usr/src/app/node_modules ./node_modules
```
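Note that a job-level `before_script` replaces the global one, so the `install` job above keeps its `docker login` and never runs the symlink step (which would fail inside the `docker:latest` image anyway).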
## Conclusion
With the help of the GitLab Container Registry, the `node_modules` dependencies are installed only once for as long as they stay untouched. That cuts the preparation phase down to about 5 seconds per job; if your pipeline has 10 jobs and your team runs it dozens of times daily, this can save the team a few hours of waiting for pipeline results every day. It also reduces the cost of running the GitLab CI runners by avoiding the overhead of downloading and installing the same dependencies repeatedly.