Provisioning and third-party dependencies

Zulip is a large project, with well over 100 third-party dependencies, and managing them well is essential to the quality of the project. In this document, we discuss the various classes of dependencies that Zulip has, and how we manage them. Zulip's dependency management has some really nice properties:

The purpose of this document is to detail all of Zulip's third-party dependencies and how we manage their versions.

Provisioning

We use "provisioning" to refer to the process of installing and configuring the dependencies of a Zulip development environment. It's done using tools/provision, and the output is conveniently logged to var/log/provision.log to help with debugging. Provisioning makes heavy use of caching, and some of those caches can become corrupted if you mess around with files in your repository a lot. We have tools/provision --force to (still fairly quickly) rerun most steps that would otherwise have been skipped due to caching.

In the Vagrant development environment, vagrant provision will run the provision script; vagrant up will boot the machine, and will also run an initial provision the first time only.

PROVISION_VERSION

In version.py, we have a special parameter, PROVISION_VERSION, which is used to help ensure developers don't spend time debugging test/linter/etc. failures that were actually caused by the developer rebasing and forgetting to provision. PROVISION_VERSION has a format of x.y; when x doesn't match the value from the last time the user provisioned, or y is higher than the value from last time, most Zulip tools will crash early and ask the user to provision. This has empirically made a huge impact on how often developers spend time debugging a "weird failure" after rebasing that had an easy solution. (Of course, the other key part of achieving this is all the work that goes into making sure that provision reliably leaves the development environment in a good state.)

PROVISION_VERSION must be manually updated when making changes that require re-running provision, so don't forget about it!
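The comparison rule described above can be sketched in a few lines of Python. This is only an illustration of the stated x.y rule; the function name needs_provision and the use of plain strings are assumptions made for this example, not Zulip's actual implementation.

```python
def needs_provision(required: str, last_provisioned: str) -> bool:
    """Return True if the developer should re-run provision.

    Both arguments are "x.y" strings in the PROVISION_VERSION format.
    (Hypothetical helper for illustration only.)
    """
    req_x, req_y = (int(part) for part in required.split("."))
    last_x, last_y = (int(part) for part in last_provisioned.split("."))
    # Any change in x requires provisioning, in either direction.
    if req_x != last_x:
        return True
    # With the same x, only a higher y requires provisioning.
    return req_y > last_y


if __name__ == "__main__":
    print(needs_provision("1.10", "1.9"))
    print(needs_provision("1.3", "1.5"))
```

Comparing the parts as integers rather than as strings matters here: a plain string comparison would treat "1.10" as lower than "1.9".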

Philosophy on adding third-party dependencies

In the Zulip project, we take a pragmatic approach to third-party dependencies. Overall, if a third-party project does something well that Zulip needs to do (and has an appropriate license), we'd love to use it rather than reinventing the wheel. If the third-party project needs some small changes to work, we prefer to make those changes and contribute them upstream. When the upstream maintainer is slow to respond, we may use a fork of the dependency until the code is merged upstream; as a result, we usually have a few packages in requirements.txt that are installed from a GitHub URL.

What we look for in choosing dependencies is whether the project is well-maintained. A quick look at how the issue tracker is managed (or not) and at the test suite is usually enough to decide whether a project is going to be a high-maintenance dependency. That said, we do still take on some smaller dependencies that don't have a well-managed project, if we feel that using the project will still be a better investment than writing our own implementation of that project's functionality. We've adopted a few projects in the past that had a good codebase but whose maintainer no longer had time for them.

One case where we apply added scrutiny to third-party dependencies is JS libraries. They are a particularly important concern because we want to keep the Zulip web app's JS bundle small, so that Zulip continues to load quickly on systems with low network bandwidth. We scrutinize whether a large JS library's functionality justifies its size much more closely than we would for a Python dependency, since an extra 50KB of code usually doesn't matter in the backend, but does in JavaScript.

System packages

For the third-party services like PostgreSQL, Redis, Nginx, and RabbitMQ that are documented in the architecture overview, we rely on the versions of those packages provided alongside the Linux distribution on which Zulip is deployed. Because Zulip only supports Ubuntu in production, this usually means apt, though we do support other platforms in development. Since we don't control the versions of these dependencies, we avoid relying on specific versions of these packages wherever possible.

The exact lists of apt packages needed by Zulip are maintained in a few places:

* For production, in our Puppet configuration, puppet/zulip/, using the Package and SafePackage directives.
* For development, in SYSTEM_DEPENDENCIES in tools/lib/provision.py.
* The packages needed to build a Zulip virtualenv, in VENV_DEPENDENCIES in scripts/lib/setup_venv.py. These are separate from the rest because (1) we may need to install a virtualenv before running the more complex scripts that, in turn, install other dependencies, and (2) that list is shared between development and production.

We also rely on the PGroonga PPA for the PGroonga PostgreSQL extension, used by our full-text search.

Python packages

Zulip uses the version of Python itself provided by the host OS for the Zulip server. We currently support Python 3.6 and newer, with Ubuntu Bionic being the platform requiring 3.6 support. The comments in .github/workflows/zulip-ci.yml document the Python versions used by each supported platform.

We manage Python packages via the Python-standard requirements.txt system and virtualenvs, but there are a number of interesting details about how Zulip makes this system work well for us that are worth highlighting. The system is largely managed by the code in scripts/lib/setup_venv.py.

Upgrading packages

See the README file in the requirements/ directory to learn how to upgrade a single Python package.

JavaScript and other frontend packages

We use the same set of strategies described for Python dependencies for most of our JavaScript dependencies, so we won't repeat the reasoning here.

Node and Yarn

These are installed by scripts/lib/install-node (which in turn uses the standard third-party nvm installer to download node and pin its version) and scripts/lib/install-yarn.

ShellCheck and shfmt

In the development environment, the tools/setup/install-shellcheck and tools/setup/install-shfmt scripts download binaries for ShellCheck and shfmt from GitHub, check them against a known hash, and install them to /usr/local/bin. These tools are run as part of the linting system.
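The "check against a known hash" step can be illustrated with a short Python sketch. It assumes SHA-256 as the hash algorithm and uses a hypothetical helper name, verify_sha256; the actual install scripts may use a different algorithm or approach.

```python
import hashlib


def verify_sha256(data: bytes, expected_hex: str) -> None:
    """Raise ValueError unless data hashes to expected_hex.

    Hypothetical helper: the hash algorithm is an assumption for
    this example, not taken from the install scripts themselves.
    """
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(
            f"hash mismatch: expected {expected_hex}, got {actual}"
        )


if __name__ == "__main__":
    payload = b"pretend this is a downloaded binary"
    known_hash = hashlib.sha256(payload).hexdigest()
    verify_sha256(payload, known_hash)  # passes silently
    try:
        verify_sha256(payload + b"tampered", known_hash)
    except ValueError as error:
        print("rejected:", error)
```

Pinning the expected hash in the repository means a changed or tampered upstream binary fails installation instead of silently being installed.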

Puppet packages

Third-party Puppet modules are downloaded from the Puppet Forge into subdirectories under /srv/zulip-puppet-cache, hashed based on their versions; the latest is always symlinked as /srv/zulip-puppet-cache/current. zulip-puppet-apply installs these dependencies immediately before they are needed.
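As a rough sketch of this layout, the following Python derives a cache subdirectory name from a set of pinned versions and then repoints a current symlink at it. The function names, the choice of SHA-1, and the example module name are all assumptions for illustration; this is not the code zulip-puppet-apply runs.

```python
import hashlib
import os
import tempfile
from typing import Mapping


def cache_dir_for(cache_root: str, versions: Mapping[str, str]) -> str:
    """Name a cache subdirectory after a hash of the pinned versions.

    Hypothetical helper; sorting makes the result order-independent.
    """
    key = "\n".join(f"{name}=={ver}" for name, ver in sorted(versions.items()))
    digest = hashlib.sha1(key.encode()).hexdigest()
    return os.path.join(cache_root, digest)


def activate(cache_root: str, target_dir: str) -> None:
    """Point cache_root/current at target_dir, replacing any old link."""
    link = os.path.join(cache_root, "current")
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target_dir, tmp_link)
    # Renaming over the old link avoids a window with no "current".
    os.replace(tmp_link, link)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    new_dir = cache_dir_for(root, {"example-module": "1.2.3"})
    os.makedirs(new_dir)
    activate(root, new_dir)
    print(os.path.realpath(os.path.join(root, "current")))
```

Because the directory name depends only on the versions, changing any pinned version yields a fresh directory, while an unchanged set of versions reuses the existing one.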

Other third-party and generated files

In this section, we discuss the other third-party dependencies, generated code, and other files whose original primary source is not the Zulip server repository, and how we provision and otherwise maintain them.

Emoji

Zulip uses the iamcal emoji data package for its emoji data and sprite sheets. We download this dependency using npm, and then have a tool, tools/setup/build_emoji, which reformats the emoji data into the files under static/generated/emoji. Those files are in turn used by our Markdown processor and tools/update-prod-static to make Zulip's emoji work in the various environments where they need to be displayed.

Since processing emoji is a relatively expensive operation, as part of optimizing provisioning, we use the same caching strategy for the compiled emoji data as we use for virtualenvs and node_modules directories, with scripts/lib/clean_emoji_cache.py responsible for garbage-collection. This caching and garbage-collection is required because a correct emoji implementation involves over 1000 small image files and a few large ones. There is a more extended article on our emoji infrastructure.
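To make the garbage-collection idea concrete, here is a minimal Python sketch that removes cache subdirectories no longer referenced by any deployment. The function name prune_cache and the way "in use" directories are passed in are assumptions for this example; scripts/lib/clean_emoji_cache.py may determine what is in use quite differently.

```python
import os
import shutil
import tempfile
from typing import List, Set


def prune_cache(cache_root: str, in_use: Set[str]) -> List[str]:
    """Delete subdirectories of cache_root whose names are not in in_use.

    Hypothetical helper; returns the sorted names that were removed.
    """
    removed = []
    for name in sorted(os.listdir(cache_root)):
        path = os.path.join(cache_root, name)
        # Only real directories are cache entries; skip files and symlinks.
        if os.path.isdir(path) and not os.path.islink(path) and name not in in_use:
            shutil.rmtree(path)
            removed.append(name)
    return removed


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    for name in ("aaa111", "bbb222", "ccc333"):
        os.makedirs(os.path.join(root, name))
    print(prune_cache(root, in_use={"bbb222"}))
```

Without a step like this, every change to the cached inputs would leave another full copy of the generated files on disk.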

Translations data

Zulip's translations infrastructure generates several files from the source data, which we manage similarly to our emoji, but without the caching (and thus without the garbage-collection). New translations data is downloaded from Transifex and then compiled to generate both the production locale files and language data in locale/language*.json using manage.py compilemessages, which extends the default Django implementation of that tool.

Pygments data

The list of languages supported by our Markdown syntax highlighting comes from the pygments package. tools/setup/build_pygments_data is responsible for generating static/generated/pygments_data.json so that our JavaScript Markdown processor has access to the supported list.

Modifying provisioning

When making changes to Zulip's provisioning process or dependencies, usually one needs to think about making changes in 3 places: