Decrease your React CI/CD build time by 50%

Hannah Patronoudis

Building web pages in 2020 is not an easy task. A simple interface with a few animations and HTTP requests can become immensely bloated in a matter of days. Our team at Craftworkz specializes in prototyping applications at a fast pace, which often translates into a “write code, check back later” mentality. One of our prototypes became so bloated that it led us to do some self-reflection. In this blog post, we discuss how to decrease your React CI/CD build time by 50%.

Context

The application itself started as a small proof of concept. It saw decent client onboarding, and our customer asked for a long-term development trajectory. Our team for the new development cycle consisted of two mid-level full-stack developers with only superficial knowledge of CI/CD pipelines. For that reason, we opted for GitLab CI/CD pipelines, which favor convenience over configuration.

We use a React application for our examples. Local development runs through react-app-rewired, a tool that lets you inject custom webpack configuration. More specifically, the application contains 23,955 lines of code, 30 devDependencies and 27 dependencies, and has a build time of 15.04 minutes.

GitLab pipelines

One of the key reasons we chose GitLab is its built-in CI/CD pipeline mechanism. You can start a pipeline for an application by adding a .gitlab-ci.yml file to the repository. GitLab will notice the addition and start the entire process on its own.
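
As a minimal sketch of what such a file can look like, the following .gitlab-ci.yml defines a single job. The job name build_app is our own illustrative choice, and the npm scripts are assumed to exist in package.json.

```yaml
# .gitlab-ci.yml, placed in the root of the repository.
# Docker image in which the job runs.
image: node:10.15.3-alpine

# A single job; "build_app" is a hypothetical name.
build_app:
  script:
    - npm install      # install the dependencies
    - npm run build    # assumes a "build" script in package.json
```

Once this file is committed and pushed, GitLab picks it up and runs the job for each subsequent push.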

Let us discuss a few key principles that play a role in the CI/CD optimization.

Stages

A stage is simply a way to group related jobs (sets of commands) together. Preferably, jobs are grouped into stages by functionality. For example, you would create a ‘Testing’ stage in which all Cypress tests are executed. Most applications apply the same principle to installing dependencies and building the application. A pipeline divided into stages would look something like this:

[Figure: a pipeline divided into stages]
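
In configuration terms, a pipeline split into stages could be sketched roughly as follows. The job names and the test:e2e script are assumptions made for illustration, not part of our project.

```yaml
image: node:10.15.3-alpine

# Stages run in the order listed; jobs in the same stage run in parallel.
stages:
  - install
  - test
  - build

install_dependencies:        # hypothetical job name
  stage: install
  script:
    - npm install
  artifacts:
    paths:
      - node_modules/        # hand the dependencies to later stages

end_to_end_tests:            # hypothetical job name
  stage: test
  script:
    - npm run test:e2e       # assumed script that starts the Cypress tests

build_app:                   # hypothetical job name
  stage: build
  script:
    - npm run build
```

A job in a later stage only starts once every job in the previous stage has succeeded.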

Job Artifacts

A job artifact is probably the most important concept for our optimization. When a job finishes, GitLab can collect a list of files and directories and attach them to that job as artifacts. By default, artifacts are uploaded only when a job succeeds and are kept for a limited time (30 days, unless your GitLab instance is configured otherwise). You can add the expire_in property if you need a different retention period.

The beauty of this mechanism is that artifacts can be shared across stages. Sharing files within a pipeline is a common feature in the CI/CD world, and it gives us the freedom to reuse files between jobs. GitLab takes it a step further with additional report types, such as:

  • artifacts:reports:junit
  • artifacts:reports:codequality
  • artifacts:reports:dependency_scanning
[Figure: example of how to use artifact properties]
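
A hedged sketch of how these artifact properties can be combined in one job. The job name, the test:ci script and the file paths are assumptions; the test runner is assumed to write a JUnit-style XML report to junit.xml.

```yaml
unit_tests:                  # hypothetical job name
  stage: test
  script:
    - npm run test:ci        # assumed to produce junit.xml and coverage/
  artifacts:
    when: always             # also upload when the tests fail
    expire_in: 2 weeks       # override the default retention period
    paths:
      - coverage/            # plain files passed on to later stages
    reports:
      junit: junit.xml       # parsed by GitLab and shown in the merge request
```

Files listed under paths are downloaded by jobs in later stages, while entries under reports are interpreted by GitLab itself.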

Time to speed up

Let us first recap and show you our CI/CD configuration before the optimization. Our pipeline contains 3 stages: install, build and deploy (the listing below covers the install and build stages).

    image: node:10.15.3-alpine

    stages:
      - install
      - build

    cache:
      paths:
        - build/
        - node_modules/

    npm_install:
      stage: install
      script:
        - npm install
      artifacts:
        untracked: true

    build_react_app:
      stage: build
      dependencies:
        - npm_install
      script:
        - npm run build
      artifacts:
        paths:
          - node_modules/
          - build/
      only:
        - master

The configuration above is our old setup, which resulted in a build time of 15 minutes. Let’s take a brief look at a few key points that made our pipeline so slow.

Cache paths

Caching is a really nice feature of GitLab. It is comparable to artifacts, except that artifacts pass files between the stages of a single pipeline, whereas a cache is meant to be reused across jobs and subsequent pipeline runs.

In the configuration above, we define a global cache that could be used in other stages, but we did not give it a key. Without a key, GitLab falls back to a single cache named default that every job and branch shares. As a result, our pipeline relied on artifacts alone as a means of optimization, while every job still downloaded and uploaded a copy of our node_modules and build folders that the pipeline did not really benefit from.
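
To make the difference concrete, here is a minimal sketch of a single job that uses both mechanisms, each for the purpose it suits best. The exact split is an illustration, not our production configuration.

```yaml
build_react_app:
  stage: build
  script:
    - npm install
    - npm run build
  # Cache: kept on the runner for later pipelines of the same branch.
  cache:
    key: $CI_COMMIT_REF_SLUG   # explicit key, one cache per branch
    paths:
      - node_modules/
  # Artifacts: uploaded to GitLab and passed to later stages of this pipeline.
  artifacts:
    paths:
      - build/
```

With an explicit key, a later pipeline on the same branch starts with the node_modules folder that the previous run left behind.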

Optimized version

Do you think that our solution has something to do with using caches at the right moment with the right configuration? Well... Then, you’re right!

To cache effectively, you need to think about what you would like to share across pipeline builds. In our case, we thought of node_modules: the dependencies are the one folder that stays fairly static across our branches and the project.

What if we use the GitLab caching mechanism to avoid executing our install stage every time a branch enters a pipeline?

Package lock

The knight in shining armor that enables this caching mechanism is our package-lock.json file. In short, a package lock is the blueprint of all used dependencies and their sub-dependencies. The package-lock.json is updated whenever a developer changes dependency versions or runs an npm install that alters the dependency tree. Projects with a wide range of dependencies may well see weekly package-lock updates.

The neat trick for CI/CD pipelines is to check whether package-lock.json has changed. If it has, we install the dependencies and cache a new version of our node_modules. If not, we can still use the cached version!

    image: node:10.15.3-alpine

    stages:
      - install
      - build

    npm_install:
      stage: install
      script:
        - npm install
      cache:
        key: $CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR
        paths:
          - node_modules/
      only:
        changes:
          - package-lock.json

    build_react_app:
      stage: build
      dependencies:
        - npm_install
      cache:
        key: $CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR
        paths:
          - node_modules/
        policy: pull
      script:
        - npm run build
      artifacts:
        paths:
          - build/
      only:
        - master

Cache key

An important aspect of the cache property in our npm_install job is the key: $CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR

CI_COMMIT_REF_SLUG is a shortened, URL-safe version of the branch or tag name for which the project is built, and CI_PROJECT_DIR is the full path where the repository is cloned. Because the key contains the branch slug, cached objects are branch-specific.
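
As an alternative sketch, and not the configuration we ran, GitLab 12.5 and later can derive the cache key from the lock file itself through cache:key:files, with cache:key:prefix keeping the cache branch-specific. It also avoids building a key from a path: GitLab's documentation states that a cache key must not contain a / character, so a key that includes CI_PROJECT_DIR is worth verifying on your own runner.

```yaml
npm_install:
  stage: install
  script:
    - npm install
  cache:
    key:
      # The key changes whenever a commit changes package-lock.json.
      files:
        - package-lock.json
      # Prefix with the branch slug so each branch keeps its own cache.
      prefix: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
```

With this variant, a changed lock file automatically leads to a fresh cache, without a hand-written key.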

expire_in

One small detail concerning caches is expiry. Artifacts (a way to pass files and folders between stages) have a property called expire_in. It allows you to specify how long artifacts should live before they expire and are deleted, counted from the moment they are uploaded and stored on GitLab. Unfortunately, expire_in does not apply to caches.

With development at a high pace, stale caches are normally not a problem for a repository, as you frequently update, add or remove dependencies and thereby refresh the cache. If that is not the case, a simple extra commit that touches package-lock.json would reinitialize your caches. Cool! However, an expire_in property on the caching mechanism would be even better.
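
One possible workaround, sketched here under our own assumptions, is to put a version marker in the cache key and bump it whenever the cache should be thrown away. CACHE_VERSION is a hypothetical variable name that we introduce for illustration; it is not a GitLab built-in.

```yaml
variables:
  CACHE_VERSION: "v1"    # hypothetical; change to "v2" to start with empty caches

npm_install:
  stage: install
  script:
    - npm install
  cache:
    # A new version value yields a new key, so the old cache is never read again.
    key: "$CI_COMMIT_REF_SLUG-$CACHE_VERSION"
    paths:
      - node_modules/
```

The old cache is not deleted by this change. It is simply no longer referenced, which has the same practical effect for the pipeline.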

Round up

Most of our time was spent developing our product. After fixing the bugs in the testing environment, our developers knew that they could (figuratively) take a walk around the block. As it took 15 minutes to know if the testing criteria matched again, they had plenty of time to sit and wait... However, not a single developer should have to wait such a long time to know if the code works again.

By adjusting a few parameters, we can easily relieve those developers' pains. Not all build pipelines take up 15 minutes, but in a general sense, most applications can reduce their pipeline build time by 4 or 5 minutes.

The most important lesson we learned is that caching branch-specific node_modules can improve your CI/CD build time drastically. In our example, we moved from a pipeline that ran for an average of 15 minutes to an average of 7.40 minutes.

Want to know more about Craftworkz’ prototyping applications and how we continue to improve them? Make sure to subscribe to our newsletter on our website to be kept up-to-date with the latest developments!
