Julia's package registration tooling

October 28, 2024

Tags: programming, Julia, packages

Julia has a large ecosystem of over 10,000 registered packages in its “General” open-source package registry.

Package registration is mostly automated, but it can be hard to understand how all the various bits fit together. There are some helpful resources and tutorials on how to create and register a package, so I won’t provide a step-by-step tutorial here. Instead, I aim to provide a point-in-time snapshot of many of the different pieces of tooling currently used in the Julia package registration system, and explain how they work together. I will also try to mention how they can be applied to alternative registries beyond General.

About me

I’ve been a maintainer of the General registry for several years, and have mostly worked on one piece of the system called RegistryCI. I also work at a company that maintains a private registry of Julia packages developed for our internal needs, using the same open source tooling as General uses.

Overview

The following diagram tries to show an overview of the whole registration process in a typical case.

graph
    Repo[Repo] --> Commit[Commit<br />c40790cc4d9ad67f074ff372421268bf6ac65f7d] --> R_comment[Package author comments<br/>@JuliaRegistrator register] -- github app webhook --> R_PR[Registrator.jl creates PR]
    R_PR --> G{Registry}
    G -- github actions PR trigger --> RCI1[RegistryCI.jl PR checks] -- commit status --> G

    G -- github actions cron --> RCI2[RegistryCI.jl auto merge] -- merge after waiting period --> G

    G -- github actions cron --> RCI3[RegistryCI.jl TagBot] -- Issue comment after merge --> Comment[TagBot Trigger Issue<br />#123]

    Repo-->Comment

    CustomPR[Package author uses<br/>LocalRegistry.jl to create PR]  --> G

    Comment -- issue-comment github action --> Repo_tag[v1.2.3 tag] -- tag github action --> Docs[Documenter docs release build]

Next, let’s discuss each piece separately.

General

General is the official registry of Julia packages. It is the default registry and provides open source packages via the package manager Pkg.jl to Julia users. There is a lot more one can say about General; in fact, Mosè Giordano and I have developed PackageAnalyzer.jl to inspect the packages contained in General in various ways, and gave JuliaCon 2021 and 2023 talks about the results. In this post, I will just discuss some of the infrastructure and automation of General, which is available to other registries as well.

It is structured as a git repo, where the main branch reflects the current state of the registry, listing all registered packages, their versions, hashes, and so forth.

It is updated with a pull request (PR) workflow, where changes to the registry are proposed via (mostly automated) pull requests. These undergo checks and verification via continuous integration (CI) testing, and in many cases are approved for automated merging after a waiting period (3 days for new packages, 10 minutes for new versions).

The General GitHub repo interacts with lots of other tools.

Registrator and LocalRegistry can interact with General by making PRs, while RegistryCI works via GitHub actions installed in General itself. At the time of writing, here’s what General’s .github directory looks like:

.github
├── CODEOWNERS
├── dependabot.yml
└── workflows
    ├── CompatHelper.yml
    ├── TagBotTriggers.yml
    ├── author_approval.yml
    ├── automerge.yml
    ├── automerge_staging.yml
    ├── breaking_change_feed.yml
    ├── feed.yml
    ├── feed_manual_prs.yml
    ├── registry-consistency-ci-cron.yml
    ├── registry-consistency-ci.yml
    ├── stale.yml
    └── update_manifests.yml

These workflows power much of the automation around General.

Other registries

Other registries should be structured the same way as General, with a top-level Registry.toml file with a unique UUID. Pkg.jl can be configured to pull packages from other registries with pkg> registry add ...url.... This will add them as a git registry, cloning the registry to your depot (e.g. ~/.julia/registries) and git pulling to update it.

Julia also has a package distribution system called PkgServer which can serve registries. I haven’t set up a PkgServer myself.

Registrator.jl

Registrator.jl is an application that can be deployed as a GitHub app to provide a registration “front end” for a registry. There is an instance maintained by JuliaHub on behalf of the Julia community, which uses a bot GitHub account by the name @JuliaRegistrator.

Registrator supports a GitHub-comment-based system and a web UI; the web UI can be used for GitLab-based packages as well.

Using the comment workflow, a package author starts registration of a new package or new version of a package by:

  1. Installing the GitHub app on their repo (once)
  2. Leaving a comment @JuliaRegistrator register on a commit they wish to register. This also supports subdir=... for packages in a monorepo.

Alternatively, they can use the web UI, which requires a JuliaHub account.

Either way, Registrator creates a pull request (PR) to General to initiate the registration process.
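As a rough illustration of the comment workflow, the sketch below (in Python) recognizes a trigger comment and pulls out an optional subdir argument. This is not Registrator's actual parser: the regular expression, the function name `parse_trigger`, and the returned dictionary shape are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical pattern: the bot mention, the word "register",
# then optional key=value arguments on the same line.
TRIGGER = re.compile(r"@JuliaRegistrator\s+register\b(?P<args>[^\n]*)")


def parse_trigger(comment: str) -> Optional[dict]:
    """Return the requested action, or None if the comment is not a trigger."""
    match = TRIGGER.search(comment)
    if match is None:
        return None
    args = {}
    for token in match.group("args").split():
        key, sep, value = token.partition("=")
        if sep:  # keep only well-formed key=value tokens
            args[key] = value
    return {"action": "register", "subdir": args.get("subdir")}


print(parse_trigger("@JuliaRegistrator register subdir=lib/Foo"))
```

A comment without the bot mention returns None, so ordinary discussion on a commit is ignored.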

The @JuliaRegistrator account is “blessed” in that its PRs are eligible for auto-merging. We will talk more about auto-merging when we get to RegistryCI.jl below.

Use with other registries

Registrator is open source and can be deployed for other registries; its documentation covers how. This needs a small always-on server, such as a t2 AWS EC2 instance. This is what my company does, which gives our registry an almost-identical workflow to registering open source packages (one just invokes a different GitHub bot than @JuliaRegistrator).

LocalRegistry.jl

LocalRegistry.jl is a Julia package that can be used instead of Registrator.jl, generally for alternative registries, though it can also be used for General (most useful when doing non-standard things such as bulk registering packages).

LocalRegistry provides a register Julia function that can be used to create a registration PR from a Julia script, as well as a create_registry function to help set up a new registry.

LocalRegistry.jl is not restricted to “local” registries; it can also be used for github-hosted registries like General. I believe the name comes from the fact that it tries to not make assumptions about where the registry is located, such as assuming it is on GitHub.

I have not used LocalRegistry myself.

RegistryCI.jl

RegistryCI.jl is a library that provides automations intended to be used in continuous integration by registries. It is mostly designed around the needs of General, but can be used by other registries, and is customizable to some extent.

RegistryCI provides four main capabilities:

  1. Registry consistency testing
  2. Checking registration PRs and marking them as approved-for-automerging
  3. Merging approved PRs
  4. Notifying tagbot-enabled package repositories that a new version of the package has been registered.

We’ll now briefly discuss each one.

Registry consistency testing

Registry consistency testing is a set of CI checks run on each PR that verify the registry is still in a consistent state post-PR. This does not check anything particular to a package being registered, but rather that the registry as a whole is in a valid, consistent state. It is run on all PRs to General, whether or not they are registering a package or version. These checks should pass every time; failure indicates some kind of corrupt PR.

Checking registration PRs and marking them as approved-for-automerging

RegistryCI checks new package and new version registrations against a sequence of guidelines. PRs from approved accounts (i.e. @JuliaRegistrator) which pass all of the guidelines are marked as approved-for-automerging via a GitHub commit status, i.e. the ❌ or ✅ which appears below a PR in GitHub. If some guidelines do not pass, the PR may still be merged manually.
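The shape of this check can be sketched as follows (in Python): run every guideline against the PR, collect the ones that are not met, and report a single commit status. The two package guidelines are illustrative stand-ins rather than RegistryCI's actual list, and `RegistrationPR` and `commit_status` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RegistrationPR:
    author: str
    package_name: str
    has_compat_bounds: bool


APPROVED_AUTHORS = {"JuliaRegistrator"}

# Each guideline is a description plus a predicate; these are illustrative only.
GUIDELINES: List[Tuple[str, Callable[[RegistrationPR], bool]]] = [
    ("PR author is an approved account", lambda pr: pr.author in APPROVED_AUTHORS),
    ("package name has at least 5 characters", lambda pr: len(pr.package_name) >= 5),
    ("dependencies have compat bounds", lambda pr: pr.has_compat_bounds),
]


def commit_status(pr: RegistrationPR) -> Tuple[str, List[str]]:
    """Return a commit status state and the list of guidelines that were not met."""
    unmet = [description for description, check in GUIDELINES if not check(pr)]
    return ("success" if not unmet else "failure", unmet)


print(commit_status(RegistrationPR("JuliaRegistrator", "Example", True)))
```

Returning the unmet guidelines alongside the status is what lets a bot explain to the author what to fix, while a human maintainer can still merge manually.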

Merging approved PRs

This is done separately from the per-PR checks, in order to be able to enforce a waiting period before merging PRs, to allow time for community review and comment.

The cron portion of RegistryCI’s AutoMerge functionality is designed to be run on a schedule [1], iterating through all open PRs to the registry and merging them if appropriate.
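A minimal sketch of that scheduled pass, in Python, might look like the following. It uses General's waiting periods (3 days for new packages, 10 minutes for new versions) and treats a PR as approved when its commit status is "success". The `OpenPR` type and `prs_ready_to_merge` function are hypothetical, and measuring the waiting period from the time the PR was opened is an assumption made here for simplicity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List

# Waiting periods used by General for the two kinds of registration PR.
WAITING_PERIOD = {
    "new_package": timedelta(days=3),
    "new_version": timedelta(minutes=10),
}


@dataclass
class OpenPR:
    number: int
    kind: str            # "new_package" or "new_version"
    opened_at: datetime  # assumption: the waiting period starts when the PR opens
    status: str          # commit status set by the per-PR checks


def prs_ready_to_merge(open_prs: List[OpenPR], now: datetime) -> List[int]:
    """Return the numbers of PRs that are approved and past their waiting period."""
    ready = []
    for pr in open_prs:
        if pr.status != "success":
            continue  # not approved for automerging
        if now - pr.opened_at >= WAITING_PERIOD[pr.kind]:
            ready.append(pr.number)
    return ready


now = datetime(2024, 10, 28, 12, 0, tzinfo=timezone.utc)
prs = [
    OpenPR(1, "new_package", now - timedelta(days=4), "success"),
    OpenPR(2, "new_package", now - timedelta(days=1), "success"),
    OpenPR(3, "new_version", now - timedelta(minutes=15), "success"),
    OpenPR(4, "new_version", now - timedelta(hours=1), "failure"),
]
print(prs_ready_to_merge(prs, now))
```

Because the pass is stateless and looks at every open PR each time, a missed or delayed run only postpones merges rather than losing them.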

Notifying tagbot-enabled package repositories that a new version of the package has been registered

The last function RegistryCI provides is a TagBot integration. Periodically, this GitHub action in General lists recently merged PRs. For each PR, if the package’s repository has a TagBot workflow, it posts a comment on a dedicated “TagBot trigger issue,” prompting TagBot to create a git tag. We will discuss TagBot in slightly more detail below.
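The selection step can be sketched like this (in Python). Both the function name `repos_to_notify` and the `has_tagbot_workflow` callback are hypothetical; a real implementation would query the hosting service to see whether the repository contains a TagBot workflow, and would then post the comment.

```python
from typing import Callable, Iterable, List


def repos_to_notify(
    merged_pr_repos: Iterable[str],
    has_tagbot_workflow: Callable[[str], bool],
) -> List[str]:
    """Return each repository with a TagBot workflow once, in first-seen order."""
    seen = set()
    notify = []
    for repo in merged_pr_repos:
        if repo in seen:
            continue  # several versions of one package may have merged recently
        seen.add(repo)
        if has_tagbot_workflow(repo):
            notify.append(repo)
    return notify


# Placeholder repositories; pretend only the first has a TagBot workflow.
merged = ["a/Foo.jl", "b/Bar.jl", "a/Foo.jl"]
print(repos_to_notify(merged, lambda repo: repo == "a/Foo.jl"))
```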

Use with other registries

RegistryCI can be used with any registry that has a CI service configured. Each of its four capabilities listed above needs to be configured separately, and one can use any subset of them.

There is some documentation here.

TagBot

TagBot provides a CI workflow to monitor the “tagbot trigger issue” for new comments, and upon receiving one, perform a sequence of steps:

  1. find all registered versions of the package in the registry
  2. find all versions registered before a particular lookback period (default: 3 days) by inspecting the git history of the registry
  3. take the difference to get the new versions, and for each create a git release, including creating a changelog

It also has some more complicated functionality for handling release branches, local usage, etc. Additionally, some of TagBot’s internals are complicated because the registry only contains tree SHAs, while the registration PR contains the commit SHA written in the PR body, and for git tags and auto-generated changelogs one needs the commit SHA [3].
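Setting those complications aside, the three numbered steps amount to a set difference. Here is a sketch in Python, assuming we already know when each version appeared in the registry; in practice that information comes from the registry's git history, and the `registered_at` mapping and `new_versions` function are hypothetical stand-ins.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def version_key(version: str) -> tuple:
    """Sort "1.10.0" after "1.2.0" by comparing numeric components."""
    return tuple(int(part) for part in version.split("."))


def new_versions(
    registered_at: Dict[str, datetime],
    now: datetime,
    lookback: timedelta = timedelta(days=3),
) -> List[str]:
    """Versions registered within the lookback period, oldest version number first."""
    all_versions = set(registered_at)                       # step 1
    cutoff = now - lookback
    old_versions = {v for v, t in registered_at.items() if t < cutoff}  # step 2
    return sorted(all_versions - old_versions, key=version_key)        # step 3


now = datetime(2024, 10, 28, tzinfo=timezone.utc)
history = {
    "1.0.0": now - timedelta(days=30),
    "1.1.0": now - timedelta(days=10),
    "1.2.0": now - timedelta(days=1),
    "1.10.0": now - timedelta(hours=1),
}
print(new_versions(history, now))
```

Each version in the result would then get a git release. Using a lookback window rather than "only the latest version" means a missed run can still be caught up on the next trigger.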

TagBot is written in Python and has had periods of more and less maintenance. There is occasional interest in rewriting it in Julia, and I’m curious to see if it could be made much simpler.

Use with other registries

TagBot can be used with other registries and has some documentation in its README to do so. Those registries need to have RegistryCI’s periodic TagBot workflow configured to create/update the tagbot trigger issues, as General does.

Downstream tag-based workflows (e.g. Documenter.jl)

Git tags can be used to trigger other workflows [4], and Documenter.jl uses tags to trigger “release” docs builds. This allows versioned documentation:

[Image: documentation version selector]

Other workflows could trigger on tags as well, but I don’t know any as popular as Documenter. This kind of functionality does not depend on TagBot, RegistryCI, and the rest, but auto-generating tags via this system automates the process.
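A tag-triggered workflow typically starts by deciding whether the git ref it was given is a release tag. A small sketch in Python, assuming tags of the form v1.2.3 and refs in GitHub's refs/tags/... form; this is not Documenter's actual logic, and `release_version` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Assumption: release tags look like "v<major>.<minor>.<patch>".
TAG_REF = re.compile(r"^refs/tags/v(\d+)\.(\d+)\.(\d+)$")


def release_version(git_ref: str) -> Optional[Tuple[int, ...]]:
    """Return (major, minor, patch) for a release tag ref, else None."""
    match = TAG_REF.match(git_ref)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


print(release_version("refs/tags/v1.2.3"))
print(release_version("refs/heads/main"))
```

A workflow could use the returned version to choose between a versioned "release" build and a development build.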

Why is the registration flow like this?

Why not create package releases by pushing tags to the package’s git repo, and drive the process that way? Users could push a tag, which then triggers a PR to be created on General, and so forth. The issue with that is that the code that gets tagged may not get registered. If there is some problem with the package, such as a missing compatibility bound, that will need to be fixed. Then the tag needs to be mutated, which can cause confusion and reproducibility issues (what if you used the code at the previous tag for something else?). Instead, the Julia package ecosystem has decided to make the registry the source of truth, and have things like git tags be downstream of the merge decision there.


  1. though in fact, it mostly works via a more-complicated stopwatch mechanism, to avoid issues with GitHub’s cron actions being unreliable, see #107794↩︎

  2. if I were writing RegistryCI from scratch, I would have a separate entrypoint and a separate workflow for cron vs per-PR functionality ↩︎

  3. this blog post has some more details. ↩︎

  4. note though when the tag comes from a github workflow, e.g. TagBot, it must be created with the right permissions to be empowered to trigger other workflows. TagBot recommends using an SSH deploy key for this reason↩︎