Introduction
Julia is a modern programming language with a fairly large package ecosystem (currently ~10k packages) that provides all kinds of useful functionality to build on. Packages are registered in a global registry called General, which is installed by default by the package manager, allowing users to easily add and use registered packages.
Packages are identified by a name (like Convex.jl, DataFrames.jl, or
Makie.jl) and a UUID (like
f65535da-76fb-5f13-bab9-19810c17039a). General has a rule
that every package has to have a unique name (although Julia and its
package manager Pkg.jl can tolerate name collisions in nested
dependencies), to avoid confusion. General also has many
guidelines (as opposed to strict rules) around package names,
such as avoiding overly similar names (e.g. the name Websocket.jl was
rejected as there was already a WebSockets.jl), avoiding acronyms,
names shorter than 5 letters, and so forth. When a new package is being
registered, automation checks the guidelines and refuses to auto-merge
packages which violate them. That is, such packages need to be manually
merged by a registry maintainer. If the package isn’t breaking a hard
rule (such as a name collision), it can still be merged at the
discretion of the maintainers, but that is not guaranteed. There is a
sense that the namespace is a shared resource and registering under the
name you want is not an automatic right because it affects everyone (by
consuming that shared resource in potentially more or less useful
ways).
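To make the two name guidelines mentioned in this post concrete (a minimum length of 5 letters, and a Damerau–Levenshtein distance of at least 2 between lowercased names), here is a minimal sketch. It is not the registry's actual automation: `dl_distance` and `name_guideline_issues` are hypothetical helpers, and the distance is the restricted "optimal string alignment" variant, which I am assuming is close enough for illustration.

```julia
# Hypothetical sketch of two name guidelines; not the registry's actual checks.

# Restricted Damerau–Levenshtein (optimal string alignment) distance:
# insertions, deletions, substitutions and adjacent transpositions cost 1 each.
function dl_distance(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    m, n = length(s), length(t)
    d = zeros(Int, m + 1, n + 1)   # d[i+1, j+1] = distance between prefixes of length i and j
    d[:, 1] .= 0:m
    d[1, :] .= 0:n
    for i in 1:m, j in 1:n
        cost = s[i] == t[j] ? 0 : 1
        d[i+1, j+1] = min(d[i, j+1] + 1,    # deletion
                          d[i+1, j] + 1,    # insertion
                          d[i, j] + cost)   # substitution or match
        if i > 1 && j > 1 && s[i] == t[j-1] && s[i-1] == t[j]
            d[i+1, j+1] = min(d[i+1, j+1], d[i-1, j-1] + 1)  # adjacent transposition
        end
    end
    return d[m+1, n+1]
end

# Returns a list of human-readable guideline problems (empty if none were found).
# Names are given without the ".jl" suffix.
function name_guideline_issues(name, existing; min_length = 5, min_distance = 2)
    issues = String[]
    length(name) < min_length &&
        push!(issues, "name is shorter than $min_length characters")
    for other in existing
        if dl_distance(lowercase(name), lowercase(other)) < min_distance
            push!(issues, "name is too similar to $other")
        end
    end
    return issues
end

name_guideline_issues("Websocket", ["WebSockets", "DataFrames"])
# → ["name is too similar to WebSockets"]
```

A non-empty result here would only mean "do not auto-merge"; whether to merge anyway remains a human decision.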
I have become one of these maintainers and so I find myself sometimes trying to answer questions like: should I merge this package registration? What factors should I consider? Should I just not “get involved” at all? (But isn’t that also de-facto making a decision?) Why do I have this power and what responsibility does it come with?
I’ll try to work backwards through these.
How did I become a maintainer & should I exercise that power?
I think the last question is in some ways the most straightforward but least satisfying: I became a registry maintainer because I got involved in improving the automation that powers the registry, and at some point years ago I was given commit access to General to facilitate that work. I believe at the time I asked if I should only use it vis-à-vis the automation and was told no, just use it responsibly, but I have a truly terrible memory and really can’t say for sure if that exchange happened or was just in my head. At least no one has told me not to merge stuff. (Feel free to contact me if you have issues with my use of commit privileges to General, and I will try my best to not be defensive.)
So then I have this possibly-self-proclaimed power as “registry maintainer”. Should I exercise it at all? Sometimes I avoid it; if I’m not sure about something or it seems like a complicated situation, it’s at least easier to not get involved. But it’s a specific form of community moderation, and I think community moderation is an important thing. Julia’s community forum software tells me I’ve spent 40 days reading posts there over the years, and I am convinced it would be a much less pleasant place to be if it weren’t for the excellent moderators like Matt Bauman. So I think if I can try to approach issues that arise in General (typically “I want my package to be named X, but the guidelines or another community member disagrees”) thoughtfully and considerately, then that’s a good thing. (And on the flip side, if I’m feeling frustrated, I think it’s probably better to stay out of it or approach it another time.)
What factors should I consider in registration decisions?
So, if I am going to try to get involved and make a registration decision (typically, merging a package registration), what factors should I consider in doing so?
First, I think it helps to discuss a bit more background. General is in an interesting position as a package registry, where it is more permissive than some registries like R’s CRAN or TeX’s CTAN (which involve human review for every registration), but more restrictive than more free-for-all registries like npm, PyPI, and Cargo, which automatically and ~instantly register eligible packages. General’s approach is to place a 3-day waiting period for community review, during which time anyone with a GitHub account, as well as our automated guideline checks, can block auto-registration, but it’s “default yes” in the sense that if there are no blocks, the registration is automatically merged after the waiting period.
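The “default yes” rule can be written down as a small predicate. This is only a sketch under my own assumptions: `RegistrationPR`, its fields, and `can_automerge` are hypothetical names for illustration and do not correspond to the registry's real automation.

```julia
using Dates

# Hypothetical record of a registration pull request; field names are illustrative.
struct RegistrationPR
    opened_at::DateTime
    guideline_failures::Vector{String}  # problems reported by the automated checks
    blocking_comments::Int              # outstanding blocks from community members
end

# The 3-day community review window described above.
const WAITING_PERIOD = Day(3)

# "Default yes": merge automatically once the waiting period has passed,
# unless an automated check or a person has blocked the registration.
function can_automerge(pr::RegistrationPR, current_time::DateTime)
    isempty(pr.guideline_failures) || return false
    pr.blocking_comments == 0 || return false
    return current_time >= pr.opened_at + WAITING_PERIOD
end

pr = RegistrationPR(DateTime(2025, 1, 1), String[], 0)
can_automerge(pr, DateTime(2025, 1, 2))  # false: still inside the waiting period
can_automerge(pr, DateTime(2025, 1, 4))  # true: no blocks and the period has passed
```

Anything for which this returns `false` after the waiting period is exactly the kind of registration that ends up needing a maintainer's judgement.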
I’ve written about this a bit more on a community discussion I started about disallowing vibe-coded packages in General here. That discussion surfaced a spectrum of opinions on permissiveness, from Christopher Rackauckas’s opinion that
Everyone still remembers Tony right? We learned all the way back then that having a code czar doesn’t scale. That’s when we changed to General being permissive by default, and it was required in order to make v1.0 Julia work given the growth we had. From time to time someone comes in trying to police it again, and every time that happens they grow tired pretty quickly for the same reason.
I think we just have to be permissive. We should test what’s easy, for example we should probably require docs and not auto-merge anything without docs or CI. But beyond what is easy to test, the human labor should flag issues when we can, rollback / ban when we need to, etc. and be willing to accept that there are some things in General which may not match what we call high quality and that’s okay as long as any truly bad issues (something gets abandoned, name-squatted, security issue, etc.) are what the human time is saved for rapidly handling.
to Michael Goerz’s reply
I think if anything, we’re moving towards tightening the supervision of the General registry. Certainly, there’s a lot more emphasis on good package naming. We don’t generally do detailed code / quality reviews, but it’s a lot more regulated the PyPI. I feel like the Julia community takes the “collective ownership” of the ecosystem pretty seriously, and that’s a good thing.
Overall, we don’t really currently have a full “system” for making these decisions. We have guidelines, we have past precedent, we have community values, and we have judgement calls.
Three-letter package names
I started writing this post because I wanted to discuss a particular
registration question that came up recently. (But I have an endless need
to provide context, which is why we’re 1,000 words in and just getting
to the point.) A package author wanted to register a package named
Ark.jl, but it does not meet the guideline for package names to be at
least 5 letters, so it is not eligible for auto-registration. This is a
pretty decent name, in that it is not overly similar to another name
(there’s no Arks.jl for example), it’s somewhat “unique” (it’s not
“Analysis.jl” or whatever), it’s not jargon-y, it’s not an acronym, and
it is a Julia port of an existing Go project
(https://github.com/mlange-42/ark) by the same author. But there can’t
be that many three-letter names: if we apply the guideline that the
Damerau–Levenshtein distance between lowercased names be at least 2, and
assume the first letter is capitalized, the latter two are not, and only
letters (not digits) are used, there are somewhere between 654 and 676
names available, as shown by the following mixed-integer linear program
solve. That’s not that many!
(I actually tried a few things to get the number here, and I think
there’s probably a clever way to do it. I ended up just running a
mixed-integer solver for 30 minutes and reporting the bounds from there,
since it’s good enough to make the point. If someone comes up with a way
to get the exact number, let me know! I will update the post.
Edit: I wrote a followup here with a full solution.)
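As a quick sanity check on those bounds, here is a sketch that does not need a solver. The upper bound of 676 follows from a counting argument: two names that share their last two letters differ by one substitution, so at most one name per (second, third) letter pair can be kept. A simple greedy pass gives some valid lower bound. The helper names (`candidates`, `neighbors`, `greedy_selection`) are my own, and I make no claim about how close the greedy count gets to the solver's 654.

```julia
# All three-letter names, lowercased for comparison (26^3 = 17576 of them).
candidates = [string(a, b, c) for a in 'a':'z' for b in 'a':'z' for c in 'a':'z']

# Upper bound: names agreeing in their last two letters are one substitution
# apart, so at most one name per (second, third) letter pair can be kept.
upper_bound = length(unique(n[2:3] for n in candidates))  # 26^2 = 676

# Names at Damerau–Levenshtein distance 1 from a three-letter name:
# single substitutions plus adjacent transpositions of unequal letters.
function neighbors(s)
    c = collect(s)
    out = Set{String}()
    for i in 1:3, a in 'a':'z'
        a == c[i] && continue
        t = copy(c)
        t[i] = a
        push!(out, String(t))
    end
    for i in 1:2
        c[i] == c[i+1] && continue
        t = copy(c)
        t[i], t[i+1] = t[i+1], t[i]
        push!(out, String(t))
    end
    return out
end

# Greedy lower bound: keep a name whenever none of its neighbors is already kept.
function greedy_selection(names)
    kept = Set{String}()
    for n in names
        any(in(kept), neighbors(n)) || push!(kept, n)
    end
    return kept
end

kept = greedy_selection(candidates)
println("greedy lower bound: ", length(kept), ", upper bound: ", upper_bound)
```

The 676 here matches the dual bound that the solver reports; closing the gap from below is the hard part.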
Code
# Code written by ChatGPT 5.1 thinking
using JuMP
using HiGHS
using MathOptInterface
const MOI = MathOptInterface

# -----------------------------
# 1. All [A-Z][a-z][a-z] names
# -----------------------------
function all_names()
    names = String[]
    for C in 'A':'Z', x in 'a':'z', y in 'a':'z'
        push!(names, string(C, x, y))
    end
    return names
end

# ----------------------------------------------------------
# 2. DL=1 neighbors for 3-letter lowercase strings
#
# Damerau–Levenshtein distance = 1 (for equal length) means:
#   - one substitution in any position, OR
#   - one adjacent transposition (1↔2, 2↔3)
# We enumerate those directly instead of calling the metric.
# ----------------------------------------------------------
function neighbors_DL1(s::String)
    @assert ncodeunits(s) == 3
    c1, c2, c3 = s[1], s[2], s[3]
    neigh = String[]
    # substitutions
    for a in 'a':'z'
        if a != c1
            push!(neigh, string(a, c2, c3))
        end
        if a != c2
            push!(neigh, string(c1, a, c3))
        end
        if a != c3
            push!(neigh, string(c1, c2, a))
        end
    end
    # adjacent transpositions
    if c1 != c2
        push!(neigh, string(c2, c1, c3)) # swap 1,2
    end
    if c2 != c3
        push!(neigh, string(c1, c3, c2)) # swap 2,3
    end
    return neigh
end

# ----------------------------------------------------------
# 3. Build conflict edge list for full DL≥2 constraint
#
# DL(lowercase(x), lowercase(y)) ≥ 2
#   ⇔ forbid DL = 0 or 1.
# DL = 0 = duplicate name; we just don't include duplicates.
# DL = 1 neighbors are exactly neighbors_DL1().
# ----------------------------------------------------------
function build_conflicts_full()
    names = all_names()
    lowers = lowercase.(names)
    N = length(names)
    # map lowercase string -> index 1..N
    name_to_idx = Dict{String,Int}(lowers .=> collect(1:N))
    conflicts = Tuple{Int,Int}[]
    for i in 1:N
        s = lowers[i]
        for t in neighbors_DL1(s)
            j = name_to_idx[t]
            if j > i
                push!(conflicts, (i, j))
            end
        end
    end
    return names, conflicts
end

# ----------------------------------------------------------
# 4. Build MIS MILP model with 30-minute timeout
#    Maximize number of chosen names
#    subject to: x[i] + x[j] ≤ 1 for every DL=1 pair (i,j).
# ----------------------------------------------------------
function build_max_DL_model()
    names, conflicts = build_conflicts_full()
    N = length(names)
    println("Total candidate names: $N")
    println("Conflict edges (DL=1 pairs): ", length(conflicts))
    model = Model(HiGHS.Optimizer)
    # 30 minute time limit
    set_attribute(model, MOI.TimeLimitSec(), 1800.0)
    @variable(model, x[1:N], Bin)
    @objective(model, Max, sum(x))
    for (i, j) in conflicts
        @constraint(model, x[i] + x[j] <= 1)
    end
    return model, names, x
end

# ----------------------------------------------------------
# 5. Solve and report bound
# ----------------------------------------------------------
function solve_with_timeout()
    model, names, x = build_max_DL_model()
    println("\nSolving with HiGHS (30 min time limit)...")
    optimize!(model)
    return model, names, x
end

solve_with_timeout()

which yielded
Total candidate names: 17576
Conflict edges (DL=1 pairs): 676000
Solving with HiGHS (30 min time limit)...
Running HiGHS 1.12.0 (git hash: 755a8e027): Copyright (c) 2025 HiGHS under MIT licence terms
MIP has 676000 rows; 17576 cols; 1352000 nonzeros; 17576 integer variables (17576 binary)
Coefficient ranges:
Matrix [1e+00, 1e+00]
Cost [1e+00, 1e+00]
Bound [1e+00, 1e+00]
RHS [1e+00, 1e+00]
Presolving model
676000 rows, 17576 cols, 1352000 nonzeros 1s
18929 rows, 17576 cols, 103431 nonzeros 3s
18929 rows, 17576 cols, 103431 nonzeros 4s
Presolve reductions: rows 676000(-0); columns 17576(-0); nonzeros 1352000(-0) - Not reduced
Objective function is integral with scale 1
Solving MIP model with:
18929 rows
17576 cols (17576 binary, 0 integer, 0 implied int., 0 continuous, 0 domain fixed)
103431 nonzeros
Src: B => Branching; C => Central rounding; F => Feasibility pump; H => Heuristic;
I => Shifting; J => Feasibility jump; L => Sub-MIP; P => Empty MIP; R => Randomized rounding;
S => Solve LP; T => Evaluate node; U => Unbounded; X => User solution; Y => HiGHS solution;
Z => ZI Round; l => Trivial lower; p => Trivial point; u => Trivial upper; z => Trivial zero
Nodes | B&B Tree | Objective Bounds | Dynamic Constraints | Work
Src Proc. InQueue | Leaves Expl. | BestBound BestSol Gap | Cuts InLp Confl. | LpIters Time
z 0 0 0 0.00% inf -0 Large 0 0 0 0 5.2s
J 0 0 0 0.00% inf 1 Large 0 0 0 0 5.3s
S 0 0 0 0.00% 1300 23 5552.17% 0 0 0 0 12.6s
R 0 0 0 0.00% 676 24 2716.67% 0 0 0 10393 12.7s
S 0 0 0 0.00% 676 26 2500.00% 86 3 0 10696 18.5s
0 0 0 0.00% 676 26 2500.00% 176 7 0 11288 24.6s
C 0 0 0 0.00% 676 27 2403.70% 253 8 0 11474 29.9s
0 0 0 0.00% 676 27 2403.70% 329 9 0 11650 35.0s
0 0 0 0.00% 676 27 2403.70% 447 11 0 12044 41.9s
0 0 0 0.00% 676 27 2403.70% 528 14 0 12387 49.3s
L 0 0 0 0.00% 676 628 7.64% 580 16 0 12675 82.5s
S 0 0 0 0.00% 676 638 5.96% 580 15 0 45715 474.1s
B 0 0 0 0.00% 676 645 4.81% 580 15 0 45715 474.2s
B 480 469 0 0.00% 676 648 4.32% 726 20 0 929147 860.2s
634 551 1 0.00% 676 648 4.32% 771 22 1 1005k 879.6s
T 654 547 12 0.00% 676 649 4.16% 771 22 1 1006k 880.7s
734 610 19 0.00% 676 649 4.16% 832 24 1 1035k 902.3s
848 675 37 0.00% 676 649 4.16% 837 6 1 1074k 922.6s
951 757 46 0.00% 676 649 4.16% 882 7 1 1109k 941.7s
T 953 750 47 0.00% 676 650 4.00% 882 7 4 1110k 942.1s
Nodes | B&B Tree | Objective Bounds | Dynamic Constraints | Work
Src Proc. InQueue | Leaves Expl. | BestBound BestSol Gap | Cuts InLp Confl. | LpIters Time
1043 832 53 0.00% 676 650 4.00% 944 9 58 1138k 957.5s
T 1050 810 57 0.00% 676 651 3.84% 944 9 58 1138k 957.9s
T 1065 800 60 0.00% 676 652 3.68% 944 9 58 1139k 958.6s
1148 871 62 0.00% 676 652 3.68% 1152 11 58 1170k 974.3s
1240 951 72 0.00% 676 652 3.68% 1152 12 58 1198k 989.4s
T 1264 937 84 0.00% 676 653 3.52% 1152 12 60 1200k 991.3s
1339 1004 86 0.00% 676 653 3.52% 1213 10 61 1228k 1007.5s
1430 1075 101 0.00% 676 653 3.52% 1188 12 61 1263k 1035.6s
1941 1611 119 0.00% 676 653 3.52% 1269 14 61 1650k 1320.3s
2002 1610 120 0.00% 676 653 3.52% 1304 16 61 2091k 1710.7s
2107 1674 138 0.00% 676 653 3.52% 1365 17 61 2128k 1730.5s
2205 1739 156 0.00% 676 653 3.52% 1333 17 61 2160k 1747.6s
T 2241 1694 172 0.00% 676 654 3.36% 1333 17 81 2163k 1750.6s
2310 1754 174 0.00% 676 654 3.36% 1358 12 81 2201k 1768.9s
2421 1824 191 0.00% 676 654 3.36% 1300 12 81 2248k 1790.5s
2467 1919 204 0.00% 676 654 3.36% 1382 13 81 2262k 1800.2s
2467 1919 204 0.00% 676 654 3.36% 1382 13 81 2262k 1800.2s
Solving report
Status Time limit reached
Primal bound 654
Dual bound 676
Gap 3.36% (tolerance: 0.01%)
P-D integral 11301.577112
Solution status feasible
654 (objective)
0 (bound viol.)
4.4408920985e-16 (int. viol.)
0 (row viol.)
Timing 1800.20
Max sub-MIP depth 6
Nodes 2467
Repair LPs 0
LP iterations 2262922
558768 (strong br.)
5277 (separation)
786633 (heuristics)

So, should the name be accepted? I tried to see what we have done before. To that end, I queried the GitHub API to pull down all the registration requests and the comments made on them. I filtered to only requests that were closed or were merged by not-a-robot, indicating human intervention (or lack thereof). Then, for this particular question, I selected only the registration attempts for packages with names of length 3, which left me with 210 registrations. These are tabulated here along with the following plots, analysis scripts, and so forth.
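The filtering step described above could look roughly like the following. This is not my actual analysis script: `PullRecord` and its fields are hypothetical stand-ins for already-downloaded data, the bot login `some-merge-bot` is a placeholder, and the “New package: Name vX.Y.Z” title format is an assumption about how registration pull requests are titled.

```julia
# Hypothetical, already-downloaded pull request records. The field names are
# illustrative and do not mirror the GitHub API's JSON schema.
struct PullRecord
    title::String
    merged_by::Union{String,Nothing}  # login of the merger, or `nothing` if closed unmerged
end

# Assumes titles of the form "New package: Name vX.Y.Z"; returns `nothing` otherwise.
function package_name(title)
    m = match(r"^New package: (\S+) v\d", title)
    return m === nothing ? nothing : String(m.captures[1])
end

# Closed without merging, or merged by an account that is not a known bot.
is_human_handled(pr, bots) = pr.merged_by === nothing || !(pr.merged_by in bots)

# Keep new-package requests with names of the given length that a human handled.
function short_name_registrations(prs; bots = Set(["some-merge-bot"]), len = 3)
    filter(prs) do pr
        name = package_name(pr.title)
        name !== nothing && length(name) == len && is_human_handled(pr, bots)
    end
end

prs = [
    PullRecord("New package: Ark v0.1.0", "some-human"),
    PullRecord("New package: Abc v1.0.0", "some-merge-bot"),
    PullRecord("New package: Xyz v0.2.0", nothing),
    PullRecord("New package: DataFrames v1.0.0", "some-human"),
    PullRecord("New version: Ark v0.1.1", "some-human"),
]
[package_name(pr.title) for pr in short_name_registrations(prs)]
# → ["Ark", "Xyz"]
```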
First, I found that we indeed seem to be getting stricter over time, with fewer accepted registrations than in the past:

Next, I categorized each accepted registration into one of six categories, using LLMs to analyze the comments (I spot-checked the results and they look quite good):

We see that the “discretionary” category has always been the largest one, but it too has been declining.
I also categorized the rejected registrations into six categories with the same methodology; here we can see the results:

We see that “duplicate/superseded PR” is increasingly common. This typically means the package author agreed to a new, longer, name and made a new registration for it.
So, what to make of this? My takeaway in this case was:
Seems like we sometimes merge stuff like this, sometimes we don’t. To me ArkECS.jl is a better name as it describes that its an ECS and I don’t think Ark is well known enough as a trigram to indicate it’s an ECS otherwise. But on the other hand, I don’t really see some other domain that should get to claim Ark either. So probably no one should get Ark or we should merge this.
In my opinion, the best reason for “no one should” is we could make no-3-letter-packages a hard rule so we don’t have to deliberate every time. But we don’t actually have that rule now, it’s still ultimately discretionary and we would have to get buy-in from everyone with commit access to General, otherwise some will still get merged depending on who happens to be looking at the PR.
So I guess I’m leaning towards merging in a day or two under the discretionary banner, unless someone has an objection, or [the author] is OK with ArkECS.jl.
The author kindly expressed their preference for Ark.jl, and I merged it. (This really does help. Some folks can be pretty upset when their registration is blocked, especially if they are coming from another ecosystem with a much more permissive registry. They can feel like registering with a preferred name is a right they have that is being infringed on. That’s not the case in the Julia world, but I can see how it can feel frustrating if your expectations are different.) I’m still not totally sure this was the right decision, but at least it made the package author happy and doesn’t seem to harm anyone.
Do we need a system for this?
I think the long-term trend will basically be what Chris said in the Discourse post I quoted above: over time, manual arbitration burns people out, and we will always end up defaulting to whatever gets auto-merged. So I think the best way to make a sustainable influence on the system is with thoughtful automation. Of course, I would say that, as that is what I’ve mostly worked on (or at least, have tried to do over the years with regard to General 🙂).
But the question remains as to what the role of discretionary power is here. I am certainly uncomfortable with using it; it feels very difficult to do fairly. Having some framework would likely help, at least to provide some scaffolding upon which to make any particular decision. But it also feels like there will always be various edge cases and situational things, and allowing some discretionary power is a good thing. (I guess I would say this too, as I have that power?)