Counting short package names

Introduction

In my post yesterday Thoughts on being a package registry maintainer, I discussed package registration moderation, exercise of power, and some historical moderation decisions. One piece of it was about Julia packages with 3-letter names, and in particular the question of how many such packages there can be. I wrotelightly edited here for clarity:

if we use the guideline for the Damerau–Levenshtein distance between lowercased package names to be at least 2, and assume the first letter is capitalized, the latter two are not, and using letters not digits, there’s somewhere between 654 and 676 names available, as shown by the following mixed-integer linear program solve […]

I also had a footnote:

I actually tried a few things to get the number here, and I think there’s probably a clever way to do it. I ended up just running a mixed-integer solver for 30 minutes and reporting the bounds from there, since it’s good enough to make the point. If someone comes up with way the exact number, let me know! I will update the post.

I ended up coming with a way to prove that 676 is indeed the number of names satisfying those constraints, and this post will discuss both my original method which got me those bounds (mixed-integer linear programming), and a SAT-based solution that confirmed the final number.

Since it will come up throughout this post, let’s also define the Damerau–Levenshtein (DL) distance:

the minimum number of operations (consisting of insertions, deletions or substitutions of a single character, or transposition of two adjacent characters) required to change one word into the other.

— Wikipedia

Mixed-integer linear programs

There, the approach was to encode the problem as a mixed-integer linear program (MILP). A mixed-integer linear program is an optimization problem of a particular form: it has a linear objective function, linear constraints, and integer-valued constraints. If we don’t have any integer-valued constraints, then it’s a linear program, which is convex and can be solved efficiently (in polynomial time). In that case, the feasible set, i.e. the set of points satisfying the constraints, form a convex polyhedron (e.g. imagine a hexagon), and we simply want to minimize a linear function over it. Derivatives always point us in the right direction.

However, with a MILP, integer-valued constraints mean the problem is not convex, and there is no known polynomial time algorithm for solving MILPs. Typically, MILP solvers involve branch-and-bound, where if you have some integer-valued variable x_i between say 0 and 5, you fix e.g. x_i=0, and solve (or at least bound) the resulting subproblem, then fix x_i=1, and so forth, then take the best solution of those. There are all kinds of techniques to try to reduce the amount of branching and make this tractable, and modern MILP solvers (like the open source HiGHS) are very powerful in practice.

To encode this problem as a MILP, we will maximize the length of a list of names such that all names in the list respect our constraints, i.e. 3-letters, of the form [A-Z][a-z][a-z], such that all pairs of names are “different enough” in Damerau–Levenshtein distance, meaning pairwise distance greater or equal to 2. Iwell, this was ChatGPT’s approach after a bit of iteration with it, but I agree with it at least… did this by:

Enumerate a list of all possible names of the form [A-Z][a-z][a-z]. There’s $26^3=17,576$ of these.
For each name, form all of its “neighbors” that are distance-1 in Damerau–Levenshtein distance away. That is, other names that can be formed from a name by transposing two letters in the name or changing exactly 1 letter for any other letter. (Since all our names are of length-3 we don’t have to worry about deletions or insertions).
Form a vector of boolean variables x of length $26^3$ . x[i] == 1 means the ith name will be included in our list, x[i] == 0 means the ith name will not be included in our list.
Generate all the constraints: for the ith name, allow at most 1 of x[i] and x[j] if the ith and jth names are distance-1 in Damerau–Levenshtein distance. @constraint(model, x[i] + x[j] <= 1)
Set the objective to maximize the sum of x, which is therefore the number of names included on our list.

Here is the code and solver logs (using the JuMP.jl modeling language and the HiGHs solver), copied from my previous post for your convenience.

Code & solver logs

# Code written by ChatGPT 5.1 thinking
using JuMP
using HiGHS
using MathOptInterface
const MOI = MathOptInterface

# -----------------------------
# 1. All [A-Z][a-z][a-z] names
# -----------------------------
function all_names()
    names = String[]
    for C in 'A':'Z', x in 'a':'z', y in 'a':'z'
        push!(names, string(C, x, y))
    end
    return names
end

# ----------------------------------------------------------
# 2. DL=1 neighbors for 3-letter lowercase strings
#
# Damerau–Levenshtein distance = 1 (for equal length) means:
#   - one substitution in any position, OR
#   - one adjacent transposition (1↔2, 2↔3)
# We enumerate those directly instead of calling the metric.
# ----------------------------------------------------------
function neighbors_DL1(s::String)
    @assert ncodeunits(s) == 3
    c1, c2, c3 = s[1], s[2], s[3]
    neigh = String[]

    # substitutions
    for a in 'a':'z'
        if a != c1
            push!(neigh, string(a, c2, c3))
        end
        if a != c2
            push!(neigh, string(c1, a, c3))
        end
        if a != c3
            push!(neigh, string(c1, c2, a))
        end
    end

    # adjacent transpositions
    if c1 != c2
        push!(neigh, string(c2, c1, c3))  # swap 1,2
    end
    if c2 != c3
        push!(neigh, string(c1, c3, c2))  # swap 2,3
    end

    return neigh
end

# ----------------------------------------------------------
# 3. Build conflict edge list for full DL≥2 constraint
#
# DL(lowercase(x), lowercase(y)) ≥ 2
# ⇔ forbid DL = 0 or 1.
# DL = 0 = duplicate name; we just don't include duplicates.
# DL = 1 neighbors are exactly neighbors_DL1().
# ----------------------------------------------------------
function build_conflicts_full()
    names  = all_names()
    lowers = lowercase.(names)
    N = length(names)

    # map lowercase string -> index 1..N
    name_to_idx = Dict{String,Int}(lowers .=> collect(1:N))

    conflicts = Tuple{Int,Int}[]
    for i in 1:N
        s = lowers[i]
        for t in neighbors_DL1(s)
            j = name_to_idx[t]
            if j > i
                push!(conflicts, (i, j))
            end
        end
    end

    return names, conflicts
end

# ----------------------------------------------------------
# 4. Build MIS MILP model with 30-minute timeout
#    Maximize number of chosen names
#    subject to: x[i] + x[j] ≤ 1 for every DL=1 pair (i,j).
# ----------------------------------------------------------
function build_max_DL_model()
    names, conflicts = build_conflicts_full()
    N = length(names)
    println("Total candidate names: $N")
    println("Conflict edges (DL=1 pairs): ", length(conflicts))

    model = Model(HiGHS.Optimizer)
    # 30 minute time limit
    set_attribute(model, MOI.TimeLimitSec(), 1800.0)

    @variable(model, x[1:N], Bin)
    @objective(model, Max, sum(x))

    for (i, j) in conflicts
        @constraint(model, x[i] + x[j] <= 1)
    end

    return model, names, x
end

# ----------------------------------------------------------
# 5. Solve and report bound
# ----------------------------------------------------------
function solve_with_timeout()
    model, names, x = build_max_DL_model()
    println("\nSolving with HiGHS (30 min time limit)...")
    optimize!(model)
    return model, names, x
end

solve_with_timeout()

which yielded

Total candidate names: 17576
Conflict edges (DL=1 pairs): 676000

Solving with HiGHS (30 min time limit)...
Running HiGHS 1.12.0 (git hash: 755a8e027): Copyright (c) 2025 HiGHS under MIT licence terms
MIP has 676000 rows; 17576 cols; 1352000 nonzeros; 17576 integer variables (17576 binary)
Coefficient ranges:
  Matrix  [1e+00, 1e+00]
  Cost    [1e+00, 1e+00]
  Bound   [1e+00, 1e+00]
  RHS     [1e+00, 1e+00]
Presolving model
676000 rows, 17576 cols, 1352000 nonzeros  1s
18929 rows, 17576 cols, 103431 nonzeros  3s
18929 rows, 17576 cols, 103431 nonzeros  4s
Presolve reductions: rows 676000(-0); columns 17576(-0); nonzeros 1352000(-0) - Not reduced
Objective function is integral with scale 1

Solving MIP model with:
   18929 rows
   17576 cols (17576 binary, 0 integer, 0 implied int., 0 continuous, 0 domain fixed)
   103431 nonzeros

Src: B => Branching; C => Central rounding; F => Feasibility pump; H => Heuristic;
     I => Shifting; J => Feasibility jump; L => Sub-MIP; P => Empty MIP; R => Randomized rounding;
     S => Solve LP; T => Evaluate node; U => Unbounded; X => User solution; Y => HiGHS solution;
     Z => ZI Round; l => Trivial lower; p => Trivial point; u => Trivial upper; z => Trivial zero

        Nodes      |    B&B Tree     |            Objective Bounds              |  Dynamic Constraints |       Work
Src  Proc. InQueue |  Leaves   Expl. | BestBound       BestSol              Gap |   Cuts   InLp Confl. | LpIters     Time

 z       0       0         0   0.00%   inf             -0                 Large        0      0      0         0     5.2s
 J       0       0         0   0.00%   inf             1                  Large        0      0      0         0     5.3s
 S       0       0         0   0.00%   1300            23              5552.17%        0      0      0         0    12.6s
 R       0       0         0   0.00%   676             24              2716.67%        0      0      0     10393    12.7s
 S       0       0         0   0.00%   676             26              2500.00%       86      3      0     10696    18.5s
         0       0         0   0.00%   676             26              2500.00%      176      7      0     11288    24.6s
 C       0       0         0   0.00%   676             27              2403.70%      253      8      0     11474    29.9s
         0       0         0   0.00%   676             27              2403.70%      329      9      0     11650    35.0s
         0       0         0   0.00%   676             27              2403.70%      447     11      0     12044    41.9s
         0       0         0   0.00%   676             27              2403.70%      528     14      0     12387    49.3s
 L       0       0         0   0.00%   676             628                7.64%      580     16      0     12675    82.5s
 S       0       0         0   0.00%   676             638                5.96%      580     15      0     45715   474.1s
 B       0       0         0   0.00%   676             645                4.81%      580     15      0     45715   474.2s
 B     480     469         0   0.00%   676             648                4.32%      726     20      0    929147   860.2s
       634     551         1   0.00%   676             648                4.32%      771     22      1     1005k   879.6s
 T     654     547        12   0.00%   676             649                4.16%      771     22      1     1006k   880.7s
       734     610        19   0.00%   676             649                4.16%      832     24      1     1035k   902.3s
       848     675        37   0.00%   676             649                4.16%      837      6      1     1074k   922.6s
       951     757        46   0.00%   676             649                4.16%      882      7      1     1109k   941.7s
 T     953     750        47   0.00%   676             650                4.00%      882      7      4     1110k   942.1s

        Nodes      |    B&B Tree     |            Objective Bounds              |  Dynamic Constraints |       Work
Src  Proc. InQueue |  Leaves   Expl. | BestBound       BestSol              Gap |   Cuts   InLp Confl. | LpIters     Time

      1043     832        53   0.00%   676             650                4.00%      944      9     58     1138k   957.5s
 T    1050     810        57   0.00%   676             651                3.84%      944      9     58     1138k   957.9s
 T    1065     800        60   0.00%   676             652                3.68%      944      9     58     1139k   958.6s
      1148     871        62   0.00%   676             652                3.68%     1152     11     58     1170k   974.3s
      1240     951        72   0.00%   676             652                3.68%     1152     12     58     1198k   989.4s
 T    1264     937        84   0.00%   676             653                3.52%     1152     12     60     1200k   991.3s
      1339    1004        86   0.00%   676             653                3.52%     1213     10     61     1228k  1007.5s
      1430    1075       101   0.00%   676             653                3.52%     1188     12     61     1263k  1035.6s
      1941    1611       119   0.00%   676             653                3.52%     1269     14     61     1650k  1320.3s
      2002    1610       120   0.00%   676             653                3.52%     1304     16     61     2091k  1710.7s
      2107    1674       138   0.00%   676             653                3.52%     1365     17     61     2128k  1730.5s
      2205    1739       156   0.00%   676             653                3.52%     1333     17     61     2160k  1747.6s
 T    2241    1694       172   0.00%   676             654                3.36%     1333     17     81     2163k  1750.6s
      2310    1754       174   0.00%   676             654                3.36%     1358     12     81     2201k  1768.9s
      2421    1824       191   0.00%   676             654                3.36%     1300     12     81     2248k  1790.5s
      2467    1919       204   0.00%   676             654                3.36%     1382     13     81     2262k  1800.2s
      2467    1919       204   0.00%   676             654                3.36%     1382     13     81     2262k  1800.2s

Solving report
  Status            Time limit reached
  Primal bound      654
  Dual bound        676
  Gap               3.36% (tolerance: 0.01%)
  P-D integral      11301.577112
  Solution status   feasible
                    654 (objective)
                    0 (bound viol.)
                    4.4408920985e-16 (int. viol.)
                    0 (row viol.)
  Timing            1800.20
  Max sub-MIP depth 6
  Nodes             2467
  Repair LPs        0
  LP iterations     2262922
                    558768 (strong br.)
                    5277 (separation)
                    786633 (heuristics)

Using a mixed-integer program is nice because once you’re familiar with the tricks (e.g. boolean variables to represent inclusion/exclusion), there’s a pretty large set of problems you can represent and solve uniformly. Additionally, you get “exact” solutions: the solvers results are not heuristic, they are rigorous bounds, and if they say the answer is, say, 73, then the answer typically really is 73This may sound like table-stakes for problem solving, but it really is something special! With many non-convex problems we can’t even know how many global minima there are, let alone if we’ve found one..

However, while the modern MILP solver is a wonder of theory and engineering, they can of course be slow for large problems. Our problem here for the 3-letter names has 17,576 boolean variables subject to 1,352,000 constraints, which is not a priori intractable but is on the large side for a casual solve. And after 30 minutes running on my laptop, HiGHs was able to bound the solution to between 654 and 676. Not bad! I didn’t really know if there were hundreds or thousands of namesAlthough I should have been able to work out there’s at most 26² names, as discussed in the next footnote., so that’s actually a pretty informative range. But it’s a bit unsatisfying to not have a concrete number. While we could run the solver for much longer, the fact that $26^2=676$ is the upper bound seems like a bit of a hint, and maybe we can do something different.

An analytical solution to a simpler problem

First, let’s start looking at an easier case. If we remove transpositions, then the rules are a bit simpler: we just require each name must have 2+ letters different from any other name. This is the Hamming distance instead of the DL distance. Can we come up with $26^2$ names in this caseWe can’t have more than 26² names, because if we fix the first two letters, there’s only 26² options. By the pigeonhole principle, if had more than 26² names, at least one of them would have to repeat those first two letters, and then we wouldn’t have Hamming distance at least 2. Somehow was easier for me to see in this case with the Hamming distance, but the same upper bound applies to the DL distance case too: adding more constraints only reduces the number of allowed names.?

Let’s call our name $abc$ , where $a$ is the first letter, $b$ the second, and $c$ the third letter. What we need is for two names $a_1b_1c_1$ and $a_2b_2c_3$ to have at most one overlap. One might write that constraint as $(a_1=a_2) + (b_1=b_2) + (c_1=c_2) \leq 1$ , treating booleans as 0 or 1.

First, to get $26^2$ names, we can choose all pairs of $a$ and $b$ . Two such pairs always differ by at least 1 coordinate (otherwise it’s the same pair!). So it remains to choose $c$ such that:

if both coordinates are different ( $a_1 \neq a_2$ and $b_1 \neq b_2$ ), then $c$ can be the same ( $c_1 = c_2$ ), since we already have distance 2 from the first two coordinates
if one coordinate overlaps ( $a_1 = a_2$ or $b_1 = b_2$ ), then $c$ must be different ( $c_1 \neq c_2$ )

It turns out, the trick is to choose $c = (a + b) \mod 26$ , where we interpret our letters as numbers between 0 and 25. Then if one coordinate is the same, say $a_1=a_2$ , we know $b_1 \neq b_2$ , and

$c_1 - c_2 = (a_1 + b_1) - (a_2 + b_2) = (b_1 - b_2) \neq 0 \mod 26.$

This yields a list of 676 names:

Aaa Abb Acc Add Aee Aff Agg Ahh Aii Ajj Akk All Amm Ann Aoo App Aqq Arr Ass Att Auu Avv Aww Axx Ayy Azz
Bab Bbc Bcd Bde Bef Bfg Bgh Bhi Bij Bjk Bkl Blm Bmn Bno Bop Bpq Bqr Brs Bst Btu Buv Bvw Bwx Bxy Byz Bza
Cac Cbd Cce Cdf Ceg Cfh Cgi Chj Cik Cjl Ckm Cln Cmo Cnp Coq Cpr Cqs Crt Csu Ctv Cuw Cvx Cwy Cxz Cya Czb
Dad Dbe Dcf Ddg Deh Dfi Dgj Dhk Dil Djm Dkn Dlo Dmp Dnq Dor Dps Dqt Dru Dsv Dtw Dux Dvy Dwz Dxa Dyb Dzc
Eae Ebf Ecg Edh Eei Efj Egk Ehl Eim Ejn Eko Elp Emq Enr Eos Ept Equ Erv Esw Etx Euy Evz Ewa Exb Eyc Ezd
Faf Fbg Fch Fdi Fej Ffk Fgl Fhm Fin Fjo Fkp Flq Fmr Fns Fot Fpu Fqv Frw Fsx Fty Fuz Fva Fwb Fxc Fyd Fze
Gag Gbh Gci Gdj Gek Gfl Ggm Ghn Gio Gjp Gkq Glr Gms Gnt Gou Gpv Gqw Grx Gsy Gtz Gua Gvb Gwc Gxd Gye Gzf
Hah Hbi Hcj Hdk Hel Hfm Hgn Hho Hip Hjq Hkr Hls Hmt Hnu Hov Hpw Hqx Hry Hsz Hta Hub Hvc Hwd Hxe Hyf Hzg
Iai Ibj Ick Idl Iem Ifn Igo Ihp Iiq Ijr Iks Ilt Imu Inv Iow Ipx Iqy Irz Isa Itb Iuc Ivd Iwe Ixf Iyg Izh
Jaj Jbk Jcl Jdm Jen Jfo Jgp Jhq Jir Jjs Jkt Jlu Jmv Jnw Jox Jpy Jqz Jra Jsb Jtc Jud Jve Jwf Jxg Jyh Jzi
Kak Kbl Kcm Kdn Keo Kfp Kgq Khr Kis Kjt Kku Klv Kmw Knx Koy Kpz Kqa Krb Ksc Ktd Kue Kvf Kwg Kxh Kyi Kzj
Lal Lbm Lcn Ldo Lep Lfq Lgr Lhs Lit Lju Lkv Llw Lmx Lny Loz Lpa Lqb Lrc Lsd Lte Luf Lvg Lwh Lxi Lyj Lzk
Mam Mbn Mco Mdp Meq Mfr Mgs Mht Miu Mjv Mkw Mlx Mmy Mnz Moa Mpb Mqc Mrd Mse Mtf Mug Mvh Mwi Mxj Myk Mzl
Nan Nbo Ncp Ndq Ner Nfs Ngt Nhu Niv Njw Nkx Nly Nmz Nna Nob Npc Nqd Nre Nsf Ntg Nuh Nvi Nwj Nxk Nyl Nzm
Oao Obp Ocq Odr Oes Oft Ogu Ohv Oiw Ojx Oky Olz Oma Onb Ooc Opd Oqe Orf Osg Oth Oui Ovj Owk Oxl Oym Ozn
Pap Pbq Pcr Pds Pet Pfu Pgv Phw Pix Pjy Pkz Pla Pmb Pnc Pod Ppe Pqf Prg Psh Pti Puj Pvk Pwl Pxm Pyn Pzo
Qaq Qbr Qcs Qdt Qeu Qfv Qgw Qhx Qiy Qjz Qka Qlb Qmc Qnd Qoe Qpf Qqg Qrh Qsi Qtj Quk Qvl Qwm Qxn Qyo Qzp
Rar Rbs Rct Rdu Rev Rfw Rgx Rhy Riz Rja Rkb Rlc Rmd Rne Rof Rpg Rqh Rri Rsj Rtk Rul Rvm Rwn Rxo Ryp Rzq
Sas Sbt Scu Sdv Sew Sfx Sgy Shz Sia Sjb Skc Sld Sme Snf Sog Sph Sqi Srj Ssk Stl Sum Svn Swo Sxp Syq Szr
Tat Tbu Tcv Tdw Tex Tfy Tgz Tha Tib Tjc Tkd Tle Tmf Tng Toh Tpi Tqj Trk Tsl Ttm Tun Tvo Twp Txq Tyr Tzs
Uau Ubv Ucw Udx Uey Ufz Uga Uhb Uic Ujd Uke Ulf Umg Unh Uoi Upj Uqk Url Usm Utn Uuo Uvp Uwq Uxr Uys Uzt
Vav Vbw Vcx Vdy Vez Vfa Vgb Vhc Vid Vje Vkf Vlg Vmh Vni Voj Vpk Vql Vrm Vsn Vto Vup Vvq Vwr Vxs Vyt Vzu
Waw Wbx Wcy Wdz Wea Wfb Wgc Whd Wie Wjf Wkg Wlh Wmi Wnj Wok Wpl Wqm Wrn Wso Wtp Wuq Wvr Wws Wxt Wyu Wzv
Xax Xby Xcz Xda Xeb Xfc Xgd Xhe Xif Xjg Xkh Xli Xmj Xnk Xol Xpm Xqn Xro Xsp Xtq Xur Xvs Xwt Xxu Xyv Xzw
Yay Ybz Yca Ydb Yec Yfd Yge Yhf Yig Yjh Yki Ylj Ymk Ynl Yom Ypn Yqo Yrp Ysq Ytr Yus Yvt Ywu Yxv Yyw Yzx
Zaz Zba Zcb Zdc Zed Zfe Zgf Zhg Zih Zji Zkj Zlk Zml Znm Zon Zpo Zqp Zrq Zsr Zts Zut Zvu Zwv Zxw Zyx Zzy

These don’t respect the full pairwise DL distance constraint, because we ignored the bit about transpositionsTwo names are close in DL distance if you can get from one to another by a transposition of two adjacent letters, e.g. Acb is distance 1 away Abc.. If we greedily filter by including names which are > 1 away in DL distance from any existing names, we find only 350 names satisfying our full set of constraints.

But still, $26^2 = 676$ feels like it could be the right number. Let’s assume there is indeed some set of 676. We can attempt to find a feasible set of 676 another way; rather than solving the full MILP trying to maximize the number, we can assume it’s 676, impose the constraints, then try to find a set of names satisfying those constraints. We will do this with the SAT solver z3.

Finding 676 3-letter package names satisfying our pairwise DL distance constraint

We will take inspiration from our Hamming distance solution, where we fix two letters as all pairs of $a \in 'A':'Z'$ and $b \in 'a':'z'$ , and then attempt to find a viable third letter $c$ . We can consider a $26 \times 26$ matrix $M$ with integer-valued entries $c$ in $1 \leq c \leq 26$ . Then $M[3,4]$ for example corresponds to the name “Cd$x” where $x$ is the $M[3,4]$ ’th entry in the alphabet.

So a row of $M$ corresponds to all package names starting with a particular letter given by the row index, and a column of $M$ corresponds to all packages names whose middle letter is given by a particular letter associated to the column index. Within each row or column, the third letter must never repeat, since then we’d have repeated first & third letter, or repeated second & third letter. These are our Hamming distance constraints. So, so far, the constraints on $M$ are:

values are integers in 1 to 26
each row is a permutation of 1:26 (i.e. no repeats)
each column is a permutation of 1:26

This is a Latin square. However, we need to add our transposition constraints still. To do those, we need to require that given $(a, b, c)$ , all transpositions of adjacent letters are excluded:

if $(a,b,c)$ is in our list, then $(b, a, c)$ and $(a, c, b)$ must not be

We will also fix the first row to be just 1, 2, …, 26 to break the symmetry. That is, we’re requiring our list include the 26 names “Aaa, Abb, Acc, …, Azz”. The reason we can do this is our constraints are invariant under a global relabeling of the alphabet. If the first row was something else, say the alphabet reversed, “z, y, x, …, c, b, a”, then we can just swap $z \mapsto a$ , $y \mapsto b$ , $x \mapsto c$ , etc, everywhere in our solution, and get another valid solution for which the first row is “a, b, c, …, x, y, z”. The reason we do this symmetry breaking is that it constrains the search-space so the solver doesn’t have to investigate other symmetric solutions.

We can encode all of this in a SAT problem. Here I use Julia’s Satisfiability interface to z3:

using Satisfiability

N = 26
vals = 1:N

# M[b,c] is the symbol (1..26) at row b, column c
@satvariable(M[1:N, 1:N], Int)

# 1. Domain constraints: 1 <= M[a,b] <= 26
values_constraint = [
    and(1 <= M[a,b], M[a,b] <= N)
    for a in vals, b in vals
]

# 2. Every row and column has distinct entries
rows_constraint = [distinct(M[a, :]) for a in vals]
columns_constraint = [distinct(M[:, b]) for b in vals]

# 3. DL transposition forbiddances on distinct codewords

# Swap 1 <-> 2:
# (a,b,c) vs (b,a,c) => forbid M[a,b]=c & M[b,a]=c when a != b
forbid_swap12 = [
    not(M[a,b] == M[b, a])
    for a in vals, b in vals
    if a != b
]

# Swap 2 <-> 3:
# (a,b,c) vs (a,c,b) => forbid M[a,b]=c & M[a,c]=b when b != c
forbid_swap23 = [
    not(and(M[a,b] == c, M[a,c] == b))
    for a in vals, b in vals, c in vals
    if b != c
]

# 4. Symmetry breaking:
#    global letter permutations preserve the DL constraints.
#    WLOG fix the first row to 1..26 in order.
symmetry_breaking = [M[1, b] == b for b in vals]

constraints = mapreduce(vec, vcat, 
    (values_constraint,
    rows_constraint,
    columns_constraint,
    forbid_swap12,
    forbid_swap23,
    symmetry_breaking),
)

println("Total constraints: ", length(constraints)) # 18304

M_val = open(Z3()) do solver
    for c in constraints
        assert!(solver, c)
    end

    status, model = @time sat!(solver)
    println("Status: ", status)
    M_val = zeros(Int, N, N)
    for a in vals, b in vals
        key = "M_$(a)_$(b)"
        M_val[a,b] = model[key]
    end
    return M_val
end

This yields:

60.655369 seconds (846.18 k allocations: 42.584 MiB, 1.12% compilation time)
Status: SAT
26×26 Matrix{Int64}:
  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26
  5   6  26  11  20   3  19  15  18   1   2   9   4  13  10  14  25  21  12   7  22  24   8  17  23  16
 11  22   1  12   9  20  17   6   4   3   8  25  19  23   5  26  16   7  15  24  13  21  10  14   2  18
  2  16   5  10  11  14   4  12  24  17   1  26  23  19   3  20  22   8   7   9  25  15  21  13  18   6
 25  18  17  26  22  11  13   4  19   8  15   5  21   9  16   7  10   6  23   1  14   2  12   3  24  20
 21  19   8  18  26   2  16  10   7  13   9   6   3  24  23  15  14  25  22  11  20  12   1   4   5  17
 19  23   4  21   2  25  24  14  11  16   3  20  18  12   6  22   8  10   9  26   7   5  13  15  17   1
 23   8  19  14   7  17   2  13   1  20   5  16  10  15   4   9  18  11  26  12   3   6  24  25  21  22
 24  11  13   5  14   1   9  17  10  21  25  22  20   7   8  12   3  23   4   6  18  26  16   2  15  19
 16  12  24   6   3  23  15  19   2  11  13  10   8  21  22  18   4  26  25  14   5   1  17  20   9   7
 26   1  22   9  19   8  23  25  21   6  20  15   7  18  24  11  13   5  16  17   2  10   3  12   4  14
  3  25  15  16   1  13  22  20  12  14  10   7  26   6  18  17  21  19  24  23   9   8   2   5  11   4
  7  13  10  17  24  26   8   2  15   4   6  18   1  20  14  25  12  16   5  21  23   3  11  22  19   9
  8  21  20   2  12  16   6   9  22  18  24  17   5  11  19   3   7  15  10   4   1  14  26  23  13  25
  6  15  12  24  18  19  26  16  25   9  23   1  17  22  13  21   2   4  14   3  11   7   5   8  20  10
 13  17  21  25  10  24   3   7  14  23   4   8  22  16  20   2   1   9  11  19  15  18   6  26  12   5
  9   4  14  20  21   7  25   5   6  26  22  11  15  10  17  24  23   2   8  18  16  13  19   1   3  12
 10   3   6   1   4  21   5  18  20  25   7  23  11  26  12  13   9  14   2   8  17  16  15  19  22  24
 20  10  16  22  15  18   1  23  17  24  14   4   2   8  21  19   5  12   3  13  26  25   7   9   6  11
 14  20  11   7  17  15  21  24  23  22  26  13  25   5   9   6  19   3   1  10  12   4  18  16   8   2
 18  14   2   3   6  10  12   1   8   7  19  21   9  17  25  23  26  22  13   5  24  20   4  11  16  15
 15  26  23  13  16   9  10  22   5  12  18   3   6  25  11   4  24  20  17   2   8  19  14   7   1  21
 12   7   9  15   8   4  20   3  13   5  16  19  14   2   1  10  11  24  18  25   6  17  22  21  26  23
 17   9  25   8  13  12  14  11  26   2  21  24  16   4   7   5  15   1   6  22  19  23  20  18  10   3
  4   5  18  19  23  22  11  26   3  15  12  14  24   1   2   8  20  17  21  16  10   9  25   6   7  13
 22  24   7  23  25   5  18  21  16  19  17   2  12   3  26   1   6  13  20  15   4  11   9  10  14   8

We can see it only took 60s and found a solution! That means indeed there is a list of 676 names that respects the DL constraints. Let’s see what they are:

function latin_square_to_names(M)
    letters = 'a':'z'
    names = String[]
    for a in 1:length(letters)
        for b in 1:length(letters)
            c = M[a, b]
            A = uppercase(letters[a])
            B = letters[b]
            C = letters[c]
            push!(names, string(A, B, C))
        end
    end
    return names
end

names = latin_square_to_names(M_val)

This gives us the following list of package names:

Aaa Abb Acc Add Aee Aff Agg Ahh Aii Ajj Akk All Amm Ann Aoo App Aqq Arr Ass Att Auu Avv Aww Axx Ayy Azz
Bae Bbf Bcz Bdk Bet Bfc Bgs Bho Bir Bja Bkb Bli Bmd Bnm Boj Bpn Bqy Bru Bsl Btg Buv Bvx Bwh Bxq Byw Bzp
Cak Cbv Cca Cdl Cei Cft Cgq Chf Cid Cjc Ckh Cly Cms Cnw Coe Cpz Cqp Crg Cso Ctx Cum Cvu Cwj Cxn Cyb Czr
Dab Dbp Dce Ddj Dek Dfn Dgd Dhl Dix Djq Dka Dlz Dmw Dns Doc Dpt Dqv Drh Dsg Dti Duy Dvo Dwu Dxm Dyr Dzf
Eay Ebr Ecq Edz Eev Efk Egm Ehd Eis Ejh Eko Ele Emu Eni Eop Epg Eqj Erf Esw Eta Eun Evb Ewl Exc Eyx Ezt
Fau Fbs Fch Fdr Fez Ffb Fgp Fhj Fig Fjm Fki Flf Fmc Fnx Fow Fpo Fqn Fry Fsv Ftk Fut Fvl Fwa Fxd Fye Fzq
Gas Gbw Gcd Gdu Geb Gfy Ggx Ghn Gik Gjp Gkc Glt Gmr Gnl Gof Gpv Gqh Grj Gsi Gtz Gug Gve Gwm Gxo Gyq Gza
Haw Hbh Hcs Hdn Heg Hfq Hgb Hhm Hia Hjt Hke Hlp Hmj Hno Hod Hpi Hqr Hrk Hsz Htl Huc Hvf Hwx Hxy Hyu Hzv
Iax Ibk Icm Ide Ien Ifa Igi Ihq Iij Iju Iky Ilv Imt Ing Ioh Ipl Iqc Irw Isd Itf Iur Ivz Iwp Ixb Iyo Izs
Jap Jbl Jcx Jdf Jec Jfw Jgo Jhs Jib Jjk Jkm Jlj Jmh Jnu Jov Jpr Jqd Jrz Jsy Jtn Jue Jva Jwq Jxt Jyi Jzg
Kaz Kba Kcv Kdi Kes Kfh Kgw Khy Kiu Kjf Kkt Klo Kmg Knr Kox Kpk Kqm Kre Ksp Ktq Kub Kvj Kwc Kxl Kyd Kzn
Lac Lby Lco Ldp Lea Lfm Lgv Lht Lil Ljn Lkj Llg Lmz Lnf Lor Lpq Lqu Lrs Lsx Ltw Lui Lvh Lwb Lxe Lyk Lzd
Mag Mbm Mcj Mdq Mex Mfz Mgh Mhb Mio Mjd Mkf Mlr Mma Mnt Mon Mpy Mql Mrp Mse Mtu Muw Mvc Mwk Mxv Mys Mzi
Nah Nbu Nct Ndb Nel Nfp Ngf Nhi Niv Njr Nkx Nlq Nme Nnk Nos Npc Nqg Nro Nsj Ntd Nua Nvn Nwz Nxw Nym Nzy
Oaf Obo Ocl Odx Oer Ofs Ogz Ohp Oiy Oji Okw Ola Omq Onv Oom Opu Oqb Ord Osn Otc Ouk Ovg Owe Oxh Oyt Ozj
Pam Pbq Pcu Pdy Pej Pfx Pgc Phg Pin Pjw Pkd Plh Pmv Pnp Pot Ppb Pqa Pri Psk Pts Puo Pvr Pwf Pxz Pyl Pze
Qai Qbd Qcn Qdt Qeu Qfg Qgy Qhe Qif Qjz Qkv Qlk Qmo Qnj Qoq Qpx Qqw Qrb Qsh Qtr Qup Qvm Qws Qxa Qyc Qzl
Raj Rbc Rcf Rda Red Rfu Rge Rhr Rit Rjy Rkg Rlw Rmk Rnz Rol Rpm Rqi Rrn Rsb Rth Ruq Rvp Rwo Rxs Ryv Rzx
Sat Sbj Scp Sdv Seo Sfr Sga Shw Siq Sjx Skn Sld Smb Snh Sou Sps Sqe Srl Ssc Stm Suz Svy Swg Sxi Syf Szk
Tan Tbt Tck Tdg Teq Tfo Tgu Thx Tiw Tjv Tkz Tlm Tmy Tne Toi Tpf Tqs Trc Tsa Ttj Tul Tvd Twr Txp Tyh Tzb
Uar Ubn Ucb Udc Uef Ufj Ugl Uha Uih Ujg Uks Ulu Umi Unq Uoy Upw Uqz Urv Usm Ute Uux Uvt Uwd Uxk Uyp Uzo
Vao Vbz Vcw Vdm Vep Vfi Vgj Vhv Vie Vjl Vkr Vlc Vmf Vny Vok Vpd Vqx Vrt Vsq Vtb Vuh Vvs Vwn Vxg Vya Vzu
Wal Wbg Wci Wdo Weh Wfd Wgt Whc Wim Wje Wkp Wls Wmn Wnb Woa Wpj Wqk Wrx Wsr Wty Wuf Wvq Wwv Wxu Wyz Wzw
Xaq Xbi Xcy Xdh Xem Xfl Xgn Xhk Xiz Xjb Xku Xlx Xmp Xnd Xog Xpe Xqo Xra Xsf Xtv Xus Xvw Xwt Xxr Xyj Xzc
Yad Ybe Ycr Yds Yew Yfv Ygk Yhz Yic Yjo Ykl Yln Ymx Yna Yob Yph Yqt Yrq Ysu Ytp Yuj Yvi Ywy Yxf Yyg Yzm
Zav Zbx Zcg Zdw Zey Zfe Zgr Zhu Zip Zjs Zkq Zlb Zml Znc Zoz Zpa Zqf Zrm Zst Zto Zud Zvk Zwi Zxj Zyn Zzh

We can double check we got our constraints right by performing the full pairwise check using the StringDistances.jl package:

using StringDistances

function verify_DL_constraint(names::Vector{String}; min_dist::Int = 2)
    dist = DamerauLevenshtein()
    lowers = lowercase.(names)
    N = length(lowers)
    for i in 1:N, j in 1:N
        i == j && continue
        if dist(lowers[i], lowers[j]) < min_dist
            return false
        end
    end
    return true
end

And indeed, our list checks out:

julia> verify_DL_constraint(names)
true

Notes

Here I’ve assumed names had to only include letters, but Julia package names can include numbers. Additionally, the DL distance constraint is a guideline aimed at preventing typosquatting, not a hard rule. So the 676 number here is more illustrative than definitive.
Both the initial draft of the code in this post and the idea to use a SAT solver were from ChatGPT 5.1 thinkingIt got the SAT constraints wrong 3 times though! Kind of annoying, but it was still a good idea that I didn’t think of.. I edited the code to fix some things and clean it up a little. None of the non-code text was LLM generated or edited.