Computing Nonsimple Polygons of Minimum Perimeter

We provide exact and approximation methods for solving a geometric relaxation of the Traveling Salesman Problem (TSP) that occurs in curve reconstruction: for a given set of vertices in the plane, the problem Minimum Perimeter Polygon (MPP) asks for a (not necessarily simply connected) polygon with shortest possible boundary length. Even though the closely related problem of finding a minimum cycle cover is polynomially solvable by matching techniques, we prove how the topological structure of a polygon leads to NP-hardness of the MPP. On the positive side, we show how to achieve a constant-factor approximation. When trying to solve MPP instances to provable optimality by means of integer programming, an additional difficulty compared to the TSP is the fact that only a subset of subtour constraints is valid, depending not on combinatorics, but on geometry. We overcome this difficulty by establishing and exploiting additional geometric properties. This allows us to reliably solve a wide range of benchmark instances with up to 600 vertices within reasonable time on a standard machine. We also show that using a natural geometry-based sparsification yields results that are on average within 0.5% of the optimum.


Introduction
Two of the most fundamental structures in Computational Geometry are planar point sets and polygons. In this paper we study a natural algorithmic connection between them. For Journal of Computational Geometry jocg.org a given set V of points in the plane, consider the family of all polygons with holes that have vertex set V . Such a polygon P consists of an exterior boundary that surrounds a collection of interior holes, which are simple disjoint polygonal boundaries with disjoint interior; note that each boundary must contain at least three vertices in order to be non-degenerate.
The Minimum Perimeter Polygon Problem (MP3) asks for a polygon P with holes on vertex set V , such that the total boundary length is smallest possible. As can be seen from Figure 1, an optimal solution for the MP3 need not be simply connected, but may consist of an outer boundary that surrounds a number of holes, i.e., interior boundaries. If holes are disallowed, the problem turns into the well-known Traveling Salesman Problem (TSP): find a shortest polygonal chain through a given set of vertices in the plane. As a consequence of the triangle inequality, any optimal solution of the TSP is always a simple polygon of minimum perimeter.
The TSP is one of the classic problems of Combinatorial Optimization. NP-hard even in special cases of geometric instances (such as grid graphs), it has served as one of the prototypical testgrounds for developing outstanding algorithmic approaches. These include constant-factor approximation methods (such as Christofides' 3/2-approximation [7] for metric instances, or Arora's [4] and Mitchell's [23] polynomial-time approximation schemes for geometric instances), as well as exact methods (such as Grötschel's optimal solution to a 120-city instance [16] or the award-winning work by Applegate, Bixby, Chvátal and Cook [2] for solving a 13509-city instance within 10 years of CPU time.) The well-established benchmark library TSPLIB [26] of TSP instances has become so widely accepted that it is used as a benchmark for a large variety of other optimization problems. See the books [17,21] for an overview of various aspects of the TSP and the books [3,8] for more details on exact optimization.
Because of the fundamental role of polygons in geometry, the study of TSP solutions has attracted attention for a wide range of geometric applications. One such context is Journal of Computational Geometry jocg.org geometric shape reconstruction, where the objective is to re-compute the original curve from a given set of sample points V ; see Giesen [15], Althaus and Mehlhorn [1] or Dey, Mehlhorn and Ramos [10] for specific examples. However, this only makes sense when the original shape is known to be simply connected, i.e., bounded by a single closed curve. More generally, a shape may be multiply connected, with interior holes. Thus, computing a simple polygon may not yield the desired answer. Instead, the solution may be a Minimum Perimeter Polygon (MPP) on vertex set V . See Figure 1 for an optimal solution of an instance with 960 points; this also shows the possibly intricate structure of an MPP.
While the MP3 asks for a cycle cover of the given set of vertices (as opposed to a single cycle required by the TSP), it is important to note that even the more general geometry of a polygon with holes imposes some topological constraints on the structure of boundary cycles; as a consequence, an optimal 2-factor (a minimum-weight cycle cover of the vertices, which can be computed in polynomial time) may not yield a feasible solution. Fekete et al. [12] gave a generic integer program for the MP3 (and other related problems) that yields optimal solutions for instances up to 50 vertices. However, the main challenges were left unresolved. What is the complexity of computing an MP3? Is it possible to develop constant-factor approximation algorithms? And how can we compute provably optimal solutions for instances of relevant size?

Our Results
In this paper, we resolve the main open problems related to the MP3.
• We prove that the MP3 is NP-hard. This shows that despite of the relationship to the polynomially solvable problem of finding a minimum 2-factor, dealing with the topological structure of the involved cycles is computationally difficult.
• We give a 3-approximation algorithm for the MP3.
• We provide a general IP formulation with O(n 2 ) variables to ensure a valid solution for the MP3.
• We describe families of cutting planes that significantly reduce the number of iterations needed to eliminate outer components and holes in holes, leading to a practically useful formulation.
• We present experimental results for the MP3, solving instances with up to 1000 points in the plane to provable optimality within 30 minutes of CPU time.
• We also consider a fast heuristic that is based on geometric structure, restricting the edge set to the Delaunay triangulation. Experiments on structured random point sets show that solutions are on average only about 0.5% worse than the optimum, with vastly superior runtimes. Proof. The proof is based on a reduction from the Minimum Vertex Cover problem for planar graphs, which was proven to be NP-complete by Garey and Johnson [14]: for an undirected planar graph G = (V, E) and a parameter k ∈ N, decide whether there exists a subset V ⊂ V of at most k vertices such that for every edge (u, v) ∈ E, at least one of u or v is in V . Given an instance I MVC of the Minimum Vertex Cover problem we construct an instance I MP3 of the MP3 such that I MP3 has a solution if and only if I MVC has a solution. Given a planar graph G, we replace its vertices with vertex gadgets, connect them with edge gadgets, and add three points at the vertices of a large triangle enclosing the construction. The triangle delimits the outer boundary of the polygon in the instance of the MP3, and the vertex and edge gadgets enforce a choice of cycles covering the points that form the holes of the polygon.
Vertex gadget. The vertex gadget consists of four points (refer to Figure 2). The top three points are always connected by a cycle. If the fourth point p is in the same cycle, that represents putting the corresponding vertex in subset V . The cycle's length is Figure 2: Vertex gadget. Left: p ∈ V , total length is 2b + 2ε; right: p / ∈ V , total length is 3ε.
Edge gadget. The edge gadget consists of a repeating pattern of four points forming a rhombus (refer to Figure 3). Let some edge gadget consist of r rhombi. There are three ways of covering all the points except for, possibly, the two outermost points, with cycles of total length at most 2ra + rε (see Figure 3 (a-c)). This will leave either the leftmost point, either the rightmost point, or both, the leftmost and the rightmost points, uncovered by the cycles. If we require both outermost points to be covered by the cycles, their total length is at least 2(r + 1)a + (r − 1)ε (see Figure 3 (d)). The points of the edge gadget could potentially be covered by a path of length 2ra + rε (see Figure 3 (e)) that closes into a cycle through other gadgets. To prevent this situation we add triplets of points that form small holes in the middle of each face of G. If a cycle would pass through an edge gadget, then the cycle would enclose at least one face of the graph G and thereby also enclose another hole, which is forbidden. Figure 3: Edge gadget. (a)-(c) the gadget is covered by cycles of total length ≈ 10a + 5ε; (d) total length 12a + 4ε; (e) the gadget is covered by a path of total length 10a + 5ε.
Split gadget. The split gadget (refer to Figure 4) multiplies the connection to a vertex gadget, thus allowing us to connect one vertex gadget to multiple edge gadgets. If point p is covered by the vertex gadget, all the points, including points p 1 and p 2 , of the split gadget can be covered by cycles of total length 16a + 11ε. If point p is not covered by the vertex gadget, p and all the points of the split gadget, except for p 1 and p 2 , can be covered by cycles of total length 16a + 11ε. Notice, that the cycles can only consist of the edges that are shown in the figure (with solid or dashed lines). There is always the same number of edges used in any collection of cycles that cover the same number of points. Therefore, if some cycle contains an edge that is longer than a, the other edges in the cycles have to be shorter to compensate for the extra length. By a simple case distinction one can show that there is no collection of cycles of length at most 16a + 11ε that covers the same points of the split gadget and that uses any edge that is not shown in Figure 4.
If we require the split gadget to cover points p 1 and p 2 when point p is not covered by the vertex gadget, the total length of the cycles is at least 18a + 10ε (see Figure 5). To summarize, given an embedding of planar graph G = (V, E) with n vertices and m edges, we construct an instance of the MP3 by replacing the vertices of the graph with the vertex gadgets, attaching deg(v)−1 split gadgets (where deg(v) denotes the degree of vertex v) to the corresponding vertex gadget of every vertex v, and connecting the vertex gadgets by edge gadgets (see Figure 6). We enclose the construction in a triangle of a very large size, that will form the outer boundary of the polygon. Let the perimeter of the triangle T be than the diameter of G. The cycles covering the points of the gadgets are the holes in the polygon. Moreover, to every face of G we add triplets of points forming cycles of a very small length ε. This eliminates any possibility of passing through edge gadgets with a single cycle.
The number of vertex gadgets used in the construction is n, and the number of split gadgets is v∈V deg(v)−n = 2m−n. Let the number of rhombi used in all the edge gadgets be r, and let the total length of the extra holes in the middle of the faces of G be ε. Then Let d be the length of the shortest edge. Choose a, b, and ε, such that ε b a d. Using standard graph embedding techniques, it is straightforward to see that all coordinates of this embedding are polynomial in the size of the original graph.
Then there is a polygon with perimeter at most L for I MP3 if and only if there is a vertex cover of size at most k for I MVC .
Let V be a vertex cover of size k of G = (V, E). Then, by selecting the corresponding vertex gadgets to cover points p, and propagating the construction of cycles along the split and edge gadgets, we get a polygon of perimeter L.
Let there exist a polygon P with perimeter at most T + 2(16m − 8n + r)a + 2kb + (22m−8n+r −k +1)ε. By construction, the outer boundary of P is the triangle of perimeter T . Suppose there are more than k vertex gadgets that are covering the corresponding points p. Then the perimeter of P has to be greater than T + 2(16m − 8n + r)a + 2kb + (22m − 8n + r − k + 1)ε, as the third term (of variable b) of the perimeter expression dominates the fourth term (of variable ε). Thus, there have to be no more than k variable gadgets that cover the corresponding points p. Every edge gadget has to have one of the end-points covered by the vertex gadgets (through split gadgets). Otherwise, the second term of the expression of the polygon perimeter would be greater. Therefore, the polygon corresponds to a vertex cover of size at most k for I MVC .

Approximation
In this section we show that the MP3 can be approximated within a factor of 3.

Theorem 2.
There exists a polynomial-time 3-approximation algorithm for the MP3.
Proof. Let OP T be the length of an optimal solution of the MP3 and AP X the length of the approximation that our algorithm will compute for the given set, V , of n points in the plane. We compute the convex hull, CH(V ), of the input set; this takes time O(n log h), where h is the number of vertices of the convex hull. Note that the perimeter, |CH(V )|, of the convex hull is a lower bound on the length of an optimal solution (OP T ≥ |CH(V )|), since the outer boundary of any feasible solution polygon must enclose all points of V , and the convex hull is the minimum-perimeter enclosure of V .
Let U ⊆ V be the input points interior to CH(V ). If U = ∅, then the optimal solution is given by the convex hull. If |U | ≤ 2, we claim that an optimal solution is a simple, not necessarily convex, polygon, with no holes, on the set V , given by the TSP tour on V ; since |U | = 2 is a constant, it is easy to compute the optimal solution in polynomial time, by trying all O(h 2 ) possible ways of inserting the points of U into the cycle of the points of V that lie on the boundary of the convex hull, CH(V ).

Journal of Computational Geometry
jocg.org Figure 7: A 2-factor (left) and its corresponding nesting forest (right).
Thus, assume now that |U | ≥ 3. We compute a minimum-weight 2-factor (i.e., a a minimum-weight cycle cover of the vertices), denoted by γ(U ), on U , which is done in polynomial-time by standard methods [9]. (The time required is that of solving a minimumweight matching in a bipartite graph having O(|U |) nodes and O(|U | 2 ) edges; this can be done in time O(|U | 3 ).) Now, γ(U ) consists of a set of disjoint simple polygonal curves having vertex set U ; the curves can be nested, with possibly many levels of nesting. We let F denote the directed nesting forest whose nodes are the cycles, i.e., the connected components of γ(U ), and whose directed edges indicate nesting (i.e., containment) of one cycle within another; refer to Figure 7. Since an optimal solution consists of a 2-factor (an outer cycle, together with a set of cycles, one per hole of the optimal polygon), we know that OP T ≥ |γ(U )|. In an optimal solution, the nesting forest corresponding to the set of cycles covering all of V , not just the points U interior to CH(V ), is simply a single tree that is a star: a root node corresponding to the outer cycle, and a set of children adjacent to the root node, corresponding to the boundaries of the holes of the optimal polygon. If the nesting forest F for our optimal 2-factor is a set of isolated nodes (i.e., there is no nesting among the cycles of the optimal 2-factor on U ), then our algorithm outputs a polygon with holes whose outer boundary is the boundary of the convex hull, CH(V ), and whose holes are the disjoint polygons given by the cycles of γ(U ). In this case, the total weight of our solution is equal to Assume now that F has at least one nontrivial tree. We describe a two-phase process that transforms the set of cycles corresponding to F into a set of pairwise-disjoint cycles, each defining a simple polygon interior to CH(V ), with no nesting. The resulting simple polygons are disjoint, each having at least 3 vertices from U ⊂ V . Phase 1 of the process transforms the cycles γ(U ) into a set of polygonal cycles that define weakly simple polygons whose interiors are pairwise disjoint, where a polygonal cycle β defines a weakly simple polygon P β if P β is a closed, simply connected set in the plane with a boundary, ∂P β consisting of a finite union of line segments, whose traversal (e.g., while keeping the region P β to one's left) is the counterclockwise cycle β, which can have line segments that are traversed twice, once in each direction. (The notion of a "weakly simple" polygon can have various meanings, which may be slightly different from that used Journal of Computational Geometry jocg.org here; we refer the reader to [5], which includes algorithmic results as well.) The total length of the cycles at the end of phase 1 is at most 2 times the length of the original cycles, γ(U ). Then, phase 2 of the process transforms these weakly simple cycles into (strongly) simple cycles that define disjoint simple polygons interior to CH(V ). Phase 2 only does shortening operations on the weakly simple cycles; thus, the length of the resulting simple cycles at the end of phase 2 is at most 2 times the total length of γ(U ). At the end of phase 2, we have a set of disjoint simple polygons within CH(V ), which serve as the holes of the output polygon, whose total perimeter length is at most We now describe phase 1. Let T be a nontrivial tree of F . Associated with T are a set of cycles, one per node. A node u of T that has no outgoing edge of T (i.e., U has no children) is a sink node; it corresponds to a cycle that has no cycle contained within it. Let v be a node of T that has at least one child, but no grandchildren; clearly, such a node must exist in a nontrivial tree T . Then, v corresponds to a cycle (simple polygon) P v , within which there is one or more disjoint simple polygonal cycles, P u 1 , P u 2 , . . . , P u k , one for each of the k ≥ 1 children of v. We describe an operation that replaces P v with a new weakly simple polygon, Q v , whose interior is disjoint from those of P u 1 , P u 2 , . . . , P u k . Let e = pq (p, q ∈ V ) be any edge of P v ; assume that pq is a counterclockwise edge, so that the interior of P v lies to the left of the oriented segment pq. Let Γ be a shortest path within P v , from p to q, that has all of the polygons P u 1 , P u 2 , . . . , P u k to its right; thus, Γ is a "taut string" path within P v , homotopically equivalent to ∂P v , from p to q. Such a geodesic path is related to the "relative convex hull" of the polygons P u 1 , P u 2 , . . . , P u k within P v , which is the shortest cycle within P v that encloses all of the polygons; the difference is that Γ is "anchored" at the endpoints p and q. Note that Γ is a polygonal path whose vertices are either (convex) vertices of the polygons P u j or (reflex) vertices of P v . The path Γ can be computed in linear (O(|V |)) time [18], after triangulating the domain. Consider the closed polygonal walk that starts at p, follows the path Γ to q, then continues counterclockwise around the boundary, ∂P v , of P v until it returns to p. This closed polygonal walk is the counterclockwise traversal of a weakly simple polygon, Q v , whose interior is disjoint from the interiors of the polygons P u 1 , P u 2 , . . . , P u k . Refer to Figure 8. The length of this closed walk (the counterclockwise traversal of the boundary of Q v ) is at most twice the perimeter of P v , since the path Γ has length at most that of the counterclockwise boundary ∂P v , from q to p, because Γ is a homotopically equivalent shortening of this boundary. We consider the boundary of P v to be replaced with the cycle around the boundary of Q v , and this process has reduced the degree of nesting in T : node v that used to have k children (leaves of T ) is now replaced by a node v corresponding to Q v , and v and the k children of v are now all siblings in the modified tree, T . If v had a parent, w, in T , then v and the k children of v are now children of W ; if v had no parent in T (i.e., it was the root of T ), then T has been transformed into a set of k + 1 cycles, none of which are nested within another cycle of γ(U ); each is within the convex hull CH(V ), but there is no other surrounding cycle of γ(U ). We continue this process of transforming a surrounding parent cycle (node v) into a sibling cycle (node v ), until each tree T of F becomes a set of isolated nodes, and finally F has no edges, i.e., there is no nesting.
Phase 2 is a process of local shortening of the cycles/polygons, Q 1 , Q 2 , . . . , Q m , that resulted from phase 1, in order to remove repeated vertices in the weakly simple cycles, so that cycles become strongly simple. There are two types of repeated vertices to resolve: those that are repeated within the same cycle, i.e., repeated vertices p of a cycle Q i where ∂Q i "pinches" upon itself, and those that are repeated across different cycles, i.e., vertices p where one cycle is in contact with another, both having vertex p.
Consider a weakly simple polygon Q, and let p be a vertex of Q that is repeated in the cycle specifying the boundary ∂Q. This implies that there are four edges of the (counterclockwise) cycle, p 0 p, pp 1 , p 2 p, and pp 3 , incident on p, all of which lie within a halfplane through p (by local optimality). There are then two subcases: (i) p 0 , p, p 1 is a left turn (Figure 9, left); and (ii) p 0 pp 1 is a right turn (Figure 9, right). In subcase (i), p 0 p, pp 1 define a left turn at p (making p locally convex for Q), and p 2 p, pp 3 define a right turn at p (making p locally reflex for Q). In this case, we replace the pair of edges p 0 p, pp 1 with a shorter polygonal chain, namely the "taut" version of this path (homotopically equivalent to it), from p 0 to p 1 , along a shortest path, β 0,1 , among the polygons Q i , including Q, treating them as obstacles. The taut path β 0,1 is computed in linear time and consists of left turns only, at (locally convex) vertices of polygons Q i (Q i = Q) or (locally reflex) vertices of Q, where new pinch points of Q are created. Refer to Figure 9, left. Case (ii) is treated similarly; see Figure 9, right. Thus, resolving one repeated vertex, p, of Q can result in the creation of other repeated vertices of Q, or repeated vertices where two cycles come together (discussed below). The process is finite, though, since the total length of all cycles strictly decreases with each operation; in fact, there can be only a polynomial number (O(n 3 )) of such adjustments, since each triple (p 0 , p, p 1 ), is resolved at most once. Now consider a vertex p that appears once as a reflex vertex in Q 1 (with incident ccw edges p 0 p and pp 1 ) and once as a convex vertex in Q 2 (with incident ccw edges p 2 p and pp 3 ). This is because cycles resulting after phase 1 are locally shortest, p must be reflex in one cycle and convex in the other. Our local operation in this case results in a merging of the two cycles Q 1 and Q 2 into a single cycle, replacing edges p 0 p (of Q 1 ) and pp 3 (of Q 2 ) with the taut shortest path, β 0,3 . As in the process described above, this replacement can result in new repeated vertices, as the merged cycle may come into contact with other cycles, or with itself.

Journal of Computational Geometry
jocg.org

Journal of Computational Geometry jocg.org
Finally, the result of phase 2, is a set of disjoint cycles, with no repeated vertices, defining disjoint simple polygons within CH(V ); these cycles define the holes of the output polygon, whose total perimeter length is at most that of CH(V ), plus twice the lengths of the cycles γ(U ) in an optimal 2-factor of the interior points U . Thus, we obtain a valid solution with objective function at most 3 times optimal. The total running time is polynomial; a straightforward implementation takes time O(n 4 ), but this time bound can likely be improved substantially. 4 IP Formulation

Cutting-Plane Approach
In the following we develop suitable Integer Programs (IPs) for solving the MP3 to provable optimality. The basic idea is to use a binary variable x e ∈ {0, 1} for any possible edge e ∈ E, with x e = 1 corresponding to e being part of a solution P if and only if x e = 1. The objective is then to min e∈E x e c e , where c e is the length of e. In addition, we impose a suitable set of linear constraints on these binary variables, such that they characterize precisely the set of polygons with vertex set V . The challenge is to pick a set of constraints that achieve this in a (relatively) efficient manner.
As it turns out (and is discussed in more detail in Section 5), there is a significant set of constraints that correspond to eliminating cycles within proper subsets S ⊂ V . Moreover, there is an exponential number of relevant subsets S, making it prohibitive to impose all of these constraints at once. The fundamental idea of a cutting-plane approach is that much fewer constraints are necessary for characterizing an optimal solution. To this end, only a relatively small subfamily of constraints is initially considered, leading to a relaxation. As long as solving the current relaxation yields a solution that is infeasible for the original problem, violated constraints are added in a piecemeal fashion, i.e., in iterations.
In the following, these constraints (which are initially omitted, violated by an optimal solution of the relaxation, then added to eliminate such infeasible solutions) are called cutting planes or simply cuts, as they remove solutions of a relaxation that are infeasible for the MP3.

Basic IP
We start with a basic IP that is enhanced with specific cuts, described in Sections 5.2-5.4. We denote by E the set of all edges between two points of V , we denote by C a set of invalid cycles, and we denote by δ(v) the set of all edges in E that are incident to v ∈ V . Then we optimize over the following objective function: Journal of Computational Geometry jocg.org This is subject to the following constraints: x e ∈ {0, 1} .
For the TSP, C is simply the set of all subtours, making identification and separation straightforward. This is much harder for the MP3, where a subtour may end up being feasible by forming the boundary of a hole, but may also be required to connect with other cycles. Therefore, identifying valid inequalities requires more geometric analysis, such as the following. If we denote by CH the set of all convex hull points, then a cycle C is invalid if C contains: 1. at least one and at most |CH| − 1 convex hull points. (See Figure 11 For an invalid cycle with property 1, we use the equivalent cut constraint We use constraint (3) if |C| ≤ 2n+1 3 and constraint (5) otherwise, where δ(C) denotes the "cut" edges connecting a vertex v ∈ C with a vertex v ∈ C. As argued by Pferschy and Stanek [25], this technique of dynamic subtour constraints (DSC) is useful, as it reduces the number of non-zero coefficients in the constraint matrix.

Journal of Computational Geometry
jocg.org

Initial Edge Set
In order to quickly achieve an initial solution, we sparsify the Θ(n 2 ) input edges to the O(n) edges of the Delaunay Triangulation, which naturally captures geometric nearest-neighbor properties. If a solution exists, this yields an upper bound. This technique has already been applied for the TSP by Jünger et al. [19]. In theory, this may not yield a feasible solution: a specifically designed example by Dillencourt shows that the Delaunay triangulation may be non-Hamiltonian [11]; this same example has no feasible solution for the MP3 when restricted to Delaunay edges. We did not observe this behavior in practice.
CPLEX uses this initial solution as an upper bound, allowing it to quickly discard large solutions in a branch-and-bound manner. As described in Section 6, the resulting bounds are quite good for the MP3. 5 Separation Techniques

Pitfalls
When separating infeasible cycles, the Basic IP may get stuck in an exponential number of iterations, due to the following issues. (See Figures 12-14 for illustrating examples.) Problem 1: Multiple outer components containing convex hull points occur that (despite the powerful subtour constraints) do not get connected, because it is cheaper to, e.g., integrate subsets of the interior points. Such an instance can be seen in Figure 12, where we have two equal components with holes. Since the two components are separated by a distance greater than the distance between their outer components and their interior points, the outer components start to include point subsets of the holes. This results in a potentially exponential number of iterations.
Problem 2: Outer components that do not contain convex hull points do not get integrated, because we are only allowed to apply a cycle cut on the outer component containing the convex hull points. An outer component that does not contain a convex hull point cannot be prohibited, as it may become a hole in later iterations. See Figure 13 for an example in which an exponential number of iterations is needed until the outer components get connected.
Problem 3: If holes contain further holes, we are only allowed to apply a cycle cut on the outer hole. This outer hole can often cheaply be modified to fulfill the cycle cut but not resolve the holes in the hole. An example instance can be seen in Figure 14, in which an exponential number of iterations is needed.
The second problem is the most important, as this problem frequently becomes critical on instances of size 100 and above. Holes in holes rarely occur on small instances but are problematic on instances of size > 200. The first problem occurs only in a few instances.

Journal of Computational Geometry
jocg.org Figure 12: (a) -(f) show consecutive iterations when trying to solve an instance using only constraint (5). Figure 13: (a) -(g) show consecutive iterations when trying to solve an instance using only constraint (3). Figure 14: (a) -(g) show consecutive iterations when trying to solve an instance using only constraint (3).
In the following we describe three cuts that each solve one of the problems: The glue cut for the first problem in Section 5.2, the tail cut for the second problem in Section 5.3, and the HiH-Cut for the third problem in Section 5.4.

Journal of Computational Geometry
jocg.org (a) (b) Figure 15: Solving instance from Figure 12 with a glue cut (red). (a) The red curve needs to be crossed at least twice; it is found using the Delaunay Triangulation (grey). (b) The first iteration after using the glue cut.

Glue Cuts
To separate invalid cycles of property 1 we use glue cuts (GC), based on a curve R D from one unused convex hull edge to another (see Figure 15). With X (R D ) denoting the set of edges crossing R D , we can add the following constraint: Such curves can be found by considering a constrained Delaunay triangulation [6] of the current solution, performing a breadth-first-search starting from all unused convex hull edges of the triangulation. Two edges are adjacent if they share a triangle. Used edges are excluded, so our curve will not cross any used edge. As soon as two different search trees meet, we obtain a valid curve by using the middle points of the edges (see the red curve in Figure 15).
For an example, see Figure 15; as illustrated in Figure 12, this instance is problematic in the Basic IP. This can we now be solved in one iteration.

Tail Cuts
An outer cycle C that does not contain any convex hull points cannot simply be excluded, as it may become a legal hole later. Such a cycle either has to be merged with others, or become a hole. For a hole, each curve from the hole to a point outside of the convex hull must be crossed at least once.
With this knowledge we can provide the following constraint, making use of a special curve, which we call a tail (see the red path in Figure 16).
Let R T be a valid tail and X (R T ) the edges crossing it. We can express the constraint in the following form: The tail is obtained in a similar fashion as the curves of the Glue Cuts by building a constrained Delaunay triangulation and doing a breadth-first search starting at the edges Journal of Computational Geometry jocg.org For an example, see Figure 16; as illustrated in Figure 13, this instance is problematic in the Basic IP. This can we now be solved in one iteration. Note that even though it is possible to cross the tail without making the cycle a hole, this is more expensive than simply merging it with other cycles.

Hole-in-Hole Cuts
The difficulty of eliminating holes in holes (Problem 3) is that they may end up as perfectly legal simple holes, if the outer cycle gets merged with the outer boundary. In that case, every curve from the hole to the convex hull cannot cross the used edges exactly two times (edges of the hole are ignored). One of the crossed edges has to be of the exterior cycle, while the other one cannot: otherwise would again leave the polygon. It also cannot be of an interior cycle, as it would have leave to leave that cycle again to reach the hole.
Therefore the inner cycle of a hole in hole either has to be merged, or all curves from it to the convex hull do not have exactly two used edge crossings. As it is impractical to argue over all curves, we only pick one curve P that currently crosses exactly two used edges (see the red curve in Figure 17 with crossed edges in green).
Because we cannot express the inequality that P is not allowed to be crossed exactly two times as an linear programming constraint, we use the following weaker observation. If the cycle of the hole in hole becomes a simple hole, the crossing of P has to change. Let e 1 and e 2 be the two used edges that currently cross P and X (P ) the set of all edges crossing P (including unused but no edges of H). We can express a change on P by Together we obtain the following LP constraint for either H being merged or the crossing of P changing.
Again we use a breadth-first search on the constrained Delaunay triangulation starting from the edges of the hole in hole. Unlike the other two cuts we need to cross used edges. Thus, we get a shortest path search such that the optimal path primarily has a minimal number of used edges crossed and secondarily has a minimal number of all edges crossed.
For an example, see Figure 17; as illustrated in Figure 12, this instance is problematic in the Basic IP. This can now be solved in one iteration. The corresponding path is displayed in red and the two crossed edges are highlighted in green. Changing the crossing of the path is more expensive than simply connecting the hole in hole to the outer hole and thus the hole in hole gets merged. 6 Experiments

Implementation
Our implementation uses CPLEX to solve the relevant IPs. Important is also the geometric side of computation, for which we used the CGAL Arrangements package [27]. CGAL represents a planar subdivision using a doubly connected edge list (DCEL), which is ideal for detecting invalid boundary cycles.

Test Instances
While the TSPLIB is well recognized and offers a good mix of instances with different structure (ranging from grid-like instances over relatively uniform random distribution to Journal of Computational Geometry jocg.org highly clustered instances), it is rather sparse. Observing that the larger TSPLIB instances are all geographic in nature, we designed a generic approach that yields arbitrarily large and numerous clustered instances. This is based on illumination maps: A satellite image of a geographic region at night time displays uneven light distribution. The corresponding brightness values can be used as a random density function that can be used for sampling (see Figure 20). To reduce noise, we cut off brightness values below a certain threshold, i.e., we set the probability of choosing the respective pixels to zero.
Journal of Computational Geometry jocg.org   Average gap Figure 21: The relative gap of the value on the edges of the Delaunay triangulation to the optimal value. The red area marks the range between the minimal and maximal gap.
We observe that even without using glue cuts and jumpstart, we are able to solve more than 50% of the instances up to about 550 input points. Without the tail cuts, we hit a wall at 100 points, without the HiH-cut instances, at about 370 input points; see Figure 18, which also shows the average runtime of all 30 instances for all variants. Instances exceeding the 30 minutes time limit are marked with a 30-minutes timestamp. The figure shows that using jumpstart shortens the runtime significantly; using the glue cut is almost as fast as the variant without the glue cut. Figure 19 shows that medium-sized instances (up to about 450 points) can be solved in under 5 minutes. We also show that restricting the edge set to the Delaunay triangulation edges yields solutions that are about 0.5% worse on average than the optimal solution. Generally the solution of the jumpstart gets very close to the optimal solution until about 530 points. After that, for some larger instances, we get solutions on the edge set of the Delaunay triangulation that are up to 50% worse than the optimal solution.

Conclusions
As discussed in the introduction, considering general instead of simple polygons corresponds to searching for a shortest cycle cover with a specific topological constraint: one outside cycle surrounds a set of disjoint and unnested inner cycles. Clearly, this is only one example of considering specific topological constraints. Our techniques and results should be applicable, after suitable adjustments, to other constraints on the topology of cycles. We gave a 3-approximation for the MP3; it may be that the approximation can be improved, e.g" based on extending known PTAS techniques for TSP [4,23] to account for the topological constraints.
Journal of Computational Geometry jocg.org There are also various practical aspects that can be explored further. It will be interesting to evaluate the practical performance of the theoretical approximation algorithm, not only from a practical perspective, but also to gain some insight on whether the approximation factor of 3 can be tightened. Pushing the limits of solvability can also be attempted, e.g., by using more advanced techniques from the TSP context. We can also consider sparsification techniques other than the Delaunay edges; e.g., the union between the best known tour and the k-nearest-neighbor edge set (k ∈ {2, 5, 10, 20}) has been applied for TSP by Land [20], or (see Padberg and Rinaldi [24]) by taking the union of k tours acquired by Lin's and Kernighan's heuristic algorithm [22].