# Three nice techniques

In the past months, I’ve found myself really appreciating some nice techniques for constructing algorithms. These are probably quite familiar to those with a computer science background, but are new to me. There are three in particular that I have in mind; I’ll just highlight the first two here, and then discuss the third in detail.

The first such technique I learned about was solving the traveling salesman
problem via mixed-integer programming, using lazy constraints. In essence, you
encode a solution to the traveling salesman problem (a tour) as a permutation
(of the cities to visit), which you represent with a binary matrix. Then you
minimize the cost of the tour over all permutation matrices. The difficulty is
that this can yield “tours” that don’t aren’t full cycles: a permutation of
\((1,2,3,4)\) could be \(1\) goes to \(2\), \(2\) goes to \(1\), \(3\) goes to \(4\), and \(4\)
goes to \(3\). That’s actually just two transpositions instead of a single cycle
of all the cities. So instead, one needs to rule out all the permutations that
aren’t cycles. However, there doesn’t seem to be a good way to do that; one
could impose exponentially many linear constraints, one for each non-full-cycle,
for example, but that would be very inefficient (if you can even fit them all
into memory). But what you *can* do is solve the problem without such
constraints, and then just add constraints corresponding to any non-full-cycles
that you obtain in the solution. Then you re-solve the problem. That way, you
only need to impose the constraints that actually matter (which usually is far
fewer than the total number of constraints). I thought this was really cool! I
learned about it from the blog post *MIP - Travelling
Salesman* by Ole Kröger. To explore it for
myself, I wrote a Julia package
TravelingSalesmanExact.jl
to use this technique for solving the problem (using
JuMP.jl). In fact, shortly after coming
across this approach to the traveling salesman problem, I realized that I could
use the same technique to find good upper bounds on an optimization problem that
came up in my research (that project is still in progress, so I won’t go into
details here).

The second algorithmic technique is that of using outer approximations
iteratively to solve mixed-integer non-linear programming problems. Essentially,
one alternates between mixed-integer *linear* solves, and continuous non-linear
(though hopefully convex) solves. Each solution provides some information for
the next problem; the mixed-integer solve gives a new set of integer values to
use in the continuous problem, and the solution to the continuous problem
generates another linear constraint to use in the mixed-integer problem (kind of
like the lazy constraints in the TSP algorithm above). I learned about this from
the paper *Outer Approximation With Conic Certificates For Mixed-Integer Convex
Problems* which describes using the technique
to solve mixed-integer SDPs, and corresponds to the Julia package
Pajarito.jl. It’s also described well
in the review paper *Mixed-integer nonlinear
optimization*.

The third technique is that of dynamic programming. There are lots of explanations of dynamic programming out there, but the one that helped me understand was the following. Consider the task of computing the \(n\)th Fibonacci number, defined by \(f(n) = f(n-1) + f(n-2)\) for \(n >2\), and \(f(1) = f(2) = 1\).

With a naive recursive implementation, we have the following tree of function calls:

We can see the exponential complexity of this algorithm: to compute \(f(n+1)\)
instead of \(f(n)\), we add a whole subtree of calculations. However, instead we
can *memoize* the results, that is, cache them so we don’t need to recompute them.
For example, after we compute \(f(4)\) the first time, we don’t need to compute it
again. This results in the following call tree:

We can see that the algorithm has been linearized. To compute \(f(n+1)\), we don’t need to compute an extra whole subtree, and instead just do one more step. This kind of self-similarity of the problem allows us to reuse the work.

# Using dynamic programming to solve a covering problem

In the following, I’ll use Julia’s notation for ranges, by which I mean an
interval of integers. If `a`

and `b`

are integers, then `a:b`

means the set
\(\{a, a+1, a+2, \ldots, b\}\).

Our task is to partially covering a range of integers from a collection of
subranges. More specifically, given a range `r`

and a collection of ranges `C`

, we aim to find the smallest subcollection of `C`

that covers at least `K`

elements of `r`

.

This problem was posed by Jakob Nissen in the random channel of the Julia slack,
and in CoverRanges.jl I’ve
implemented a dynamic programming algorithm made by Mathieu Tanneau, which is
partly based on the solution to the case when `K = length(r)`

given here:
https://stackoverflow.com/a/294419.

In the following, I’ll show a couple examples of the problem and the solution
using `CoverRanges.jl`

, and then go through the actual code of the algorithm.

## A few examples

Let’s consider `julia r = 1:6`

, `C = [1:1, 2:5, 6:6, 1:3, 4:6]`

and `K = 6`

. This is
just the case of needing an 100% cover of the range. A naive greedy algorithm
might take the longest range `2:5`

. But then to cover `1`

and `6`

, two more
ranges would be needed. Instead, the optimum is to take the two ranges of length
`3`

, namely `1:3`

and `4:6`

.

```
julia> using CoverRanges
julia> r = 1:6
1:6
julia> C = [1:1, 2:5, 6:6, 1:3, 4:6];
julia> C[solve_min_covering(r, 6, C)]
2-element Array{UnitRange{Int64},1}:
1:3
4:6
```

We can see that’s exactly what we get. Let’s make it more interesting. We’ll
tile two copies of `C`

next to each other:

```
julia> append!(C, [r .+ 6 for r in C])
10-element Array{UnitRange{Int64},1}:
1:1
2:5
6:6
1:3
4:6
7:7
8:11
12:12
7:9
10:12
```

and let’s take `r=1:12`

. Now, if `K=8`

, one can just take the two largest ranges:

```
julia> r = 1:12
1:12
julia> C[solve_min_covering(r, 8, C)]
2-element Array{UnitRange{Int64},1}:
2:5
8:11
```

But for `K = 10`

, we’ll take the two length-3 ranges to cover the first six,
then the length 4 range to complete the partial cover:

```
julia> C[solve_min_covering(r, 10, C)]
3-element Array{UnitRange{Int64},1}:
1:3
4:6
8:11
```

whereas for `K=11`

, the best we can do is cover all 12 by four sets of length-3 ranges:

```
julia> C[solve_min_covering(r, 11, C)]
4-element Array{UnitRange{Int64},1}:
1:3
4:6
7:9
10:12
```

`CoverRanges.jl`

also provides an `AccessDict`

object, which is a dictionary that
keeps track of when its elements are retrieved or set. Note that only enough
methods are implemented for it to be used as a memoization cache in the
package; for other uses, more methods would likely be needed (e.g. `getindex`

and `setindex!`

). This can help provide more detailed information about how the
algorithm works.

```
julia> using CoverRanges: LeftToCover, PartialCover
julia> cache = AccessDict{LeftToCover{Int}, PartialCover{Int}}()
AccessDict{LeftToCover{Int64},PartialCover{Int64},Int64} with 0 entries
julia> solve_min_covering(r, 11, C; cache = cache)
4-element Array{Int64,1}:
4
5
9
10
julia> total_gets(cache)
7
julia> total_sets(cache)
22
```

## The algorithm in code

The rest of this section was actually generated from the source code itself, using Literate.jl to transform it into the version below.

The idea of the algorithm is that we can simply start from the left-most element
of the range `r`

that we wish to cover, and decide whether or not to include it
in our partial cover. If we do want to include it, the optimal range to use to
cover it is the one which extends the furthest to the right (out of all the
ranges which include the element). Then it remains to solve a smaller instance
of the problem. Likewise, if we don’t include the element, we again are left
with a smaller instance of the problem. In other words, we aim to solve the
dynamic programming equation

`min_number_to_cover(a:b, K) = min(1 + min_number_to_cover(f:b, K - (f-a+1)), min_number_to_cover((a+1):b, K))`

where
`min_number_to_cover(r, K)`

is the minimum number of ranges needed to cover at
least `K`

elements of `r`

and `f`

is the endpoint furthest to the right of any
range in `C`

that includes `a`

.

Why is this the right recursive equation for this problem? Let us say we reach
node `a`

. We can either

- include node
`a`

in our set of nodes covered, in which case we choose the range that includes`a`

and stretches as far as possible to the right; in that case, we have to solve the problem on the remaining set of nodes (this corresponds to`min_number_to_cover(f:b, K - (f-a+1))`

). - Otherwise, we skip node
`a`

and need to solve the problem with the remaining nodes`a+1:b`

, still needing to cover`K`

nodes (this corresponds to`min_number_to_cover( (a+1):b, K))`

).

Let’s begin the implementation.
First, we need an object to represent an impossible task, which is never chosen in a
minimum and which is invariant under adding `+1`

. Julia’s `Inf`

does this for
us, but `Inf`

is a floating point number, and we’d prefer to stick to integers
(with the exception of this infinity).

```
struct ComparisonInf end
const CompInf = ComparisonInf()
Base.isless(::ComparisonInf, ::ComparisonInf) = false
for T in (Int8, Int16, Int32, Int64, Int128)
@eval Base.isless(::ComparisonInf, ::$T) = false
@eval Base.isless(::$T, ::ComparisonInf) = true
@eval Base.:+(::$T, c::ComparisonInf) = c
@eval Base.:+(c::ComparisonInf, ::$T) = c
end
```

In the above list, we could add other types if needed, such as SaferIntegers.jl numeric types to ensure our algorithm does not overflow.

Next, we define an struct to hold an instance or subinstance of our problem.

```
struct LeftToCover{T}
"Range to partially cover"
r::UnitRange{T}
"Number of nodes we are required to cover"
K::T
end
```

And likewise, one to hold a solution (or part of one).

```
struct PartialCover{T}
"Minimum number needed to partially cover the range"
m::Union{T, ComparisonInf}
"Index of latest range added"
i::Int
"Key to point towards next range in the partial cover"
key::LeftToCover{T}
end
```

Next, to find the range with the furthest right endpoint including a given node, we just write a simple loop.

```
"""
find_furthest_range(a, C) -> index, right_endpoint
Given a node `a`, finds the range `r` in `C` such that `r` contains `a` and has
the largest right endpoint.
"""
function find_furthest_range(a, C)
# This could be optimized more by sorting `C` first and terminating early
best_f_so_far = typemin(a) # right endpoint
best_i_so_far = nothing # index
for (i, r) in enumerate(C)
a ∈ r || continue
if last(r) > best_f_so_far
best_f_so_far = last(r)
best_i_so_far = i
end
end
return best_i_so_far, best_f_so_far
end
```

Now comes the heart of the algorithm. We compute

`min_number_to_cover(a:b, K) = min(1 + min_number_to_cover(f:b, K - (f-a+1)), min_number_to_cover((a+1):b, K))`

recursively with the help of a memoization cache, which maps from (sub)instances of the problem to solutions.

```
function min_covering(key::LeftToCover{T}, furthest_ranges, cache::AbstractDict{LeftToCover{T}, PartialCover{T}}) where {T}
get!(cache, key) do
r = key.r; K = key.K
K <= 0 && return PartialCover{T}(zero(T), 0, key) # Nothing left to cover
K > length(r) && return PartialCover{T}(CompInf, 0, key) # Impossible task
a, b = first(r), last(r)
i_a, f_a = furthest_ranges[a] # index and right endpoint of optimal range including `a`
# Solve the subproblem in which we skip `a`
skip_cover = min_covering(LeftToCover{T}(a+1:b, K), furthest_ranges, cache)
i_a === nothing && return skip_cover # cannot cover `a` means we must skip it
# Solve the subproblem in which we include `a` in our partial cover
add_key = LeftToCover{T}( (f_a+1) : b, K - (f_a - a + 1))
add_cover = min_covering(add_key, furthest_ranges, cache)
# Decide whether or not to include `a` and return the better result
if add_cover.m + 1 <= skip_cover.m
return PartialCover{T}(add_cover.m + T(1), i_a, add_key)
else
return skip_cover
end
end
end
```

Note that in the case that we choose to skip `a`

, we return `skip_cover`

, which
is the solution for the subproblem that skips `a`

. But when we choose to include
`a`

in the partial cover, we make a new `PartialCover`

object instead of
returning `add_cover`

. That’s because if we include `a`

by using the range that
ends at `f_a`

, we really are adding a range to the partial cover represented by
`add_cover`

, so we make a new `PartialCover`

object, one whose value of `m`

is one
larger (corresponding to the extra range we added), which holds the index `i_a`

of the range that we added, and which holds `add_key`

, which we can use to
recover `add_cover`

from `cache`

later. In all, this lets us hold the sequence
of ranges used in the final optimal cover efficiently (as a linked list).

Lastly, we wrap `min_covering`

in a helper function that constructs `furthest_ranges`

,
the memoization cache, and unwinds the linked list that holds the solution in order
to recover the indices of the ranges included in the minimal partial cover we have found.

```
"""
solve_min_covering(r::UnitRange{T}, K, C; [cache]) -> Union{Nothing, Vector{Int}}
Returns either `nothing` or a vector of indices of `C` corresponding to a
minimal set of ranges which covers at least `K` elements of `r`. Returns
`nothing` if and only if the task is impossible.
* `r`: range to partially cover
* `K`: number of elements of `r` that must be
covered
* `C`: vector of ranges to use in constructing the partial cover
* `cache`: optionally provide an `AbstractDict{LeftToCover{T}, PartialCover{T}}`
which is used as a memoization cache
!!! warning
Overflow is not checked.
"""
function solve_min_covering(r::UnitRange{T}, K, C; cache = Dict{LeftToCover{T},
PartialCover{T}}()) where {T}
# `furthest_ranges` is all we need from `C`
furthest_ranges = Dict(a => find_furthest_range(a, C) for a in r)
# Recursive function to compute the covering
result = min_covering(LeftToCover{T}(r, K), furthest_ranges, cache)
k = result.m # minimum number of ranges needed
k == CompInf && return nothing
# We store the list of ranges essentially as a linked list.
# To recover the optimal list of ranges, we traverse through
# the cache, at each step getting the `key` for the next element in the cache.
range_inds = Int[]
sizehint!(range_inds, k)
while k > 1
push!(range_inds, result.i)
k = result.m
result = cache[result.key]
end
return range_inds
end
```