Time-Space Trade-Offs for Computing Euclidean Minimum Spanning Trees

In the limited-workspace model, we assume that the input of size $n$ lies in a random access read-only memory. The output has to be reported sequentially, and it cannot be accessed or modified. In addition, there is a read-write workspace of $O(s)$ words, where $s \in \{1, \dots, n\}$ is a given parameter. In a time-space trade-off, we are interested in how the running time of an algorithm improves as $s$ varies from $1$ to $n$. We present a time-space trade-off for computing the Euclidean minimum spanning tree (EMST) of a set $V$ of $n$ sites in the plane. Our algorithm computes EMST$(V)$ using $O(n^3\log s /s^2)$ time and $O(s)$ words of workspace. It uses the fact that EMST$(V)$ is a subgraph of the bounded-degree relative neighborhood graph of $V$, and applies Kruskal's MST algorithm on it. To achieve this with limited workspace, we introduce a compact representation of planar graphs, called an $s$-net, which allows us to manipulate its component structure during the execution of the algorithm.


Introduction
Given $n$ sites in the plane, their Euclidean minimum spanning tree (EMST) is the minimum spanning tree with the sites as vertices, where the weight of the edge between two sites is their Euclidean distance. This problem is at the core of computational geometry and is a classical problem taught in almost every first-year lecture on the subject. Several classical algorithms are known that compute EMST$(V)$ in $O(n \log n)$ time using $O(n)$ words of workspace [11].
In this work, we revisit this problem and design algorithms to compute the EMST in a memory-constrained model, where only a few extra variables may be used during the execution of the algorithm. Algorithms of this kind not only provide an interesting trade-off between running time and memory, but are also useful on portable devices with tight hardware constraints.
A significant amount of research has focused on the design of algorithms that use few variables, much of it dating from the 1970s, when memory was an expensive commodity. While the cost of memory has since dropped substantially, the amount of data has increased, and the size of some devices has been dramatically reduced. Sensors and small devices where larger memories are neither possible nor desirable have proliferated in recent years. In addition, when working on inputs that do not fit in the local memory of our computer, it is often the case that data is simultaneously accessed by several devices. Moreover, even if a device is equipped with a large memory, it might still be preferable to limit the number of write operations. Writing to flash memory is slow and costly, and may also reduce the lifetime of the memory. Additionally, if the input is stored on removable devices, write-access may not be allowed due to technical or security reasons. Therefore, while many memory-constrained models exist, the general scheme is the following: The input resides in a read-only memory where data cannot be modified by the algorithm. The algorithm is allowed to store a few variables that reside in a local memory (usually called the workspace) and can be modified as needed to solve the problem. Since the output may also not fit in our local memory, the model provides us with a write-only memory where the desired output is sequentially reported by the algorithm.
In general, one might consider algorithms that are allowed to use a workspace of $O(s)$ words for some parameter $s$, where a word is a collection of bits and is large enough to contain either an input item (such as a point coordinate) or a pointer into the input structure (of logarithmic size in the length of the input). The goal is then to design algorithms whose running time decreases as $s$ increases, and that provide a nice trade-off between workspace size and running time.
Our results. For the case of EMST, Asano et al. [6] proposed an algorithm to compute the EMST of a set of $n$ given sites in $O(n^3)$ time using a workspace of $O(1)$ words. In this paper, we revisit this problem and provide a time-space trade-off. Our algorithm computes the EMST in $O(n^3 \log s/s^2)$ time using $O(s)$ additional words of workspace. This algorithm provides a smooth transition between the $O(n^3)$ time algorithm [6] with a constant number of words of workspace and the $O(n \log n)$ time algorithm [11] using a workspace of $O(n)$ words.
As the main tool to achieve this running time, we introduce a compact representation of planar graphs, called an $s$-net. The main idea is to carefully choose a "dense" set of $s$ edges of the graph for which we remember their face incidences. That is, we store whether or not any of these edges are incident to the same face of the graph. Moreover, the density property of this $s$-net guarantees that no path can walk along a face of the graph for long without reaching an edge of the $s$-net. This allows us to "quickly" find the face of the graph that any given edge lies on. More specifically, we use this structure to speed up the implementation of Kruskal's MST algorithm on planar graphs using limited workspace. Recall that in this algorithm, edges are added in increasing order of weight to an auxiliary graph. Moreover, for each of them we need to find out whether or not its endpoints lie on the same component of this auxiliary graph when the edge is inserted. If the original graph is planar, then this amounts to testing whether or not these endpoints are incident to the same face of the graph, a task for which the compact representation of the $s$-net allows us to obtain time-space trade-offs to compute the EMST of planar graphs. While the $s$-net is designed to speed up Kruskal's algorithm, this structure is of independent interest as it provides a compact way to represent planar graphs that can be exploited by other algorithms.
Related work. The study of constant-workspace algorithms started with the introduction of the complexity class LOGSPACE [3]. After that, many classic problems were studied in this setting. Selection and sorting were among the first such problems [13, 20-22]. In graph theory, Reingold [23] solved a long-standing problem, and showed that connectivity in an undirected graph can be tested using constant workspace. The model was made popular in computational geometry by Asano et al. [6], who presented several algorithms to compute classic geometric data structures in the constant-workspace model. Algorithms with time-space trade-offs for many of these problems were presented in subsequent years [1, 2, 4, 5, 7-10, 15, 16, 18], with the notable exception of the problem of computing the EMST, which is finally addressed in this paper.

Preliminaries and Definitions
Let $V$ be a set of $n$ points (sites) in the plane. The Euclidean minimum spanning tree of $V$, EMST$(V)$, is the minimum spanning tree of the complete graph $G$ on $V$, where the edges are weighted by the Euclidean distance between their endpoints. We assume that $V$ is in general position, i.e., the edge lengths in $G$ are pairwise distinct, thus EMST$(V)$ is unique. Given $V$, we can compute EMST$(V)$ in $O(n \log n)$ time using $O(n)$ words of workspace [11].
The relative neighborhood graph of $V$, RNG$(V)$, is the undirected graph with vertex set $V$ obtained by connecting two sites $u, v \in V$ with an edge if and only if there is no site $w \in V \setminus \{u, v\}$ such that both $|uw|$ and $|vw|$ are less than $|uv|$, where $|uv|$ denotes the Euclidean distance between $u$ and $v$ [24]. This is also known as the empty lens property, where the lens between $u$ and $v$ is the intersection of the disks of radius $|uv|$ centered at $u$ and at $v$; see Figure 1. One can show that a plane embedding of RNG$(V)$ is obtained by drawing the edges as straight line segments between the corresponding sites in $V$. Furthermore, each vertex in RNG$(V)$ has at most six neighbors, so that RNG$(V)$ has $O(n)$ edges. We denote the number of those edges by $m$. It is well known that EMST$(V)$ is a subgraph of RNG$(V)$. In particular, this implies that RNG$(V)$ is connected. Given $V$, we can compute RNG$(V)$ in $O(n \log n)$ time using $O(n)$ words of workspace [17, 19, 24].

Figure 1: The RNG for a set of sites $V$. The disks $D_u$ and $D_v$ have radius $|uv|$ and are centered at $u$ and $v$, respectively. The edge $uv$ is in RNG$(V)$, since there is no site in $V$ that lies in the lens $D_u \cap D_v$.
Recall the classic algorithm by Kruskal to find EMST$(V)$ [14]: we start with an empty forest $T$, and we consider the edges of RNG$(V)$ one by one, by increasing weight. In each step, we insert the current edge $e = vw$ into $T$ if and only if there is no path between $v$ and $w$ in $T$. In the end, $T$ will be EMST$(V)$. Since EMST$(V)$ is a subgraph of RNG$(V)$, it suffices to consider only the edges of RNG$(V)$. Thus, Kruskal's algorithm needs to consider $m = O(n)$ edges and runs in $O(n \log n)$ time, using $O(n)$ words of workspace.
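As an illustration of the two facts just recalled, the following minimal Python sketch builds RNG$(V)$ by a brute-force empty-lens test and then runs Kruskal's algorithm on it. It ignores the workspace restriction entirely, and the cubic-time `rng_edges`, the union-find inside `kruskal`, and the random test sites are our own illustrative choices rather than part of the algorithm of this paper.

```python
from itertools import combinations
from math import dist
import random


def rng_edges(points):
    """Brute-force RNG: keep uv iff no other site w has |uw| < |uv| and |vw| < |uv|."""
    n = len(points)
    edges = []
    for u, v in combinations(range(n), 2):
        d = dist(points[u], points[v])
        lens_is_empty = all(
            max(dist(points[u], points[w]), dist(points[v], points[w])) >= d
            for w in range(n)
            if w != u and w != v
        )
        if lens_is_empty:
            edges.append((d, u, v))
    return sorted(edges)  # increasing length


def kruskal(n, sorted_edges):
    """Kruskal's algorithm; connectivity is tracked with a simple union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, u, v in sorted_edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # no path between u and v yet
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


# Hypothetical input: random sites, which are in general position with probability 1.
rand = random.Random(0)
points = [(rand.random(), rand.random()) for _ in range(40)]
all_pairs = sorted(
    (dist(points[u], points[v]), u, v)
    for u, v in combinations(range(len(points)), 2)
)
emst_via_rng = kruskal(len(points), rng_edges(points))
emst_via_complete_graph = kruskal(len(points), all_pairs)
```

Running Kruskal on the RNG edges and on all pairs should give the same tree, which is the subgraph property used throughout the paper.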
Let $s \in \{1, \dots, n\}$ be a parameter, and assume that we are given a set $V$ of $n$ sites in general position (as defined above) in a read-only array. The goal is to find EMST$(V)$, with $O(s)$ words of workspace. We use RNG$(V)$ in order to compute EMST$(V)$. By general position, the edge lengths in RNG$(V)$ are pairwise distinct. Thus, we define $E_R = e_1, \dots, e_m$ to be the sorted sequence of the edges in RNG$(V)$, in increasing order of length. For $i \in \{1, \dots, m\}$, we define RNG$_i$ to be the subgraph of RNG$(V)$ with vertex set $V$ and edge set $\{e_1, \dots, e_{i-1}\}$.
In the limited-workspace model, we cannot store RNG$_i$ explicitly. Instead, we resort to the computing instead of storing paradigm [6]. That is, we completely compute the next batch of edges in $E_R$ whenever we need new edges of RNG$(V)$ in Kruskal's algorithm. To check whether a new edge $e_i \in E_R$ belongs to EMST$(V)$, we need to check if $e_i$ connects two distinct components of RNG$_i$. To do this with $O(s)$ words of workspace, we will use a succinct representation of its component structure; see below. In our algorithm, we represent each edge $e_i \in E_R$ by two directed half-edges. The two half-edges are oriented in opposite directions such that the face incident to a half-edge lies to the left of it. We call the endpoints of a half-edge the head and the tail, such that the half-edge is directed from the tail to the head. Obviously, each half-edge in RNG$_i$ has an opposing partner. However, in our succinct representation, we will rely on individual half-edges. Throughout the paper, directed half-edges will be denoted as $\vec{e}$, and undirected edges as $e$. That is, for a half-edge $\vec{e} = \overrightarrow{uv}$ with $u, v \in V$, the site $v$ is the head of $\vec{e}$, and $u$ is the tail of $\vec{e}$.

The Algorithm
Before we discuss our algorithm, we explain how to compute batches of edges in RNG$(V)$ using $O(s)$ words of workspace. A similar technique has been used previously in the context of Voronoi diagrams [8].
Lemma 3.1. Let $V$ be a set of $n$ sites in the plane, in general position. Let $s \in \{1, \dots, n\}$ be a parameter. Given a set $Q \subseteq V$ of $s$ sites, we can compute for each $u \in Q$ the at most six neighbors of $u$ in RNG$(V)$ in total time $O(n \log s)$, using $O(s)$ words of workspace.
Proof. The algorithm uses $n/s$ steps. In each step, we process a batch of $s$ sites of $V$ and produce at most six candidates for each site of $Q$ to be a neighbor in RNG$(V)$. In the first step, we take the first batch $V_1 \subseteq V$ of $s$ sites, and we compute RNG$(Q \cup V_1)$. Because both $Q$ and $V_1$ have at most $s$ sites, we can do this in $O(s \log s)$ time using $O(s)$ words of workspace using standard algorithms. For each $u \in Q$, we remember the at most six neighbors of $u$ in RNG$(Q \cup V_1)$. Note that for a pair $u \in Q$, $v \in V_1$, if $v$ is not a neighbor of $u$ in RNG$(Q \cup V_1)$, then the lens of $u$ and $v$ is non-empty. That is, there is a witness among the points of $Q \cup V_1$ that certifies that $uv$ is not an edge of RNG$(V)$. Let $N_1$ be the set containing all neighbors in RNG$(Q \cup V_1)$ of all sites in $Q$. Storing $N_1$, the set of candidate neighbors, requires $O(s)$ words of workspace. Then, in each step $j = 2, \dots, O(n/s)$, we take the next batch $V_j \subseteq V$ of $s$ sites, and compute RNG$(Q \cup V_j \cup N_{j-1})$ in $O(s \log s)$ time using $O(s)$ words of workspace. For each $u \in Q$, we store the set of at most six neighbors in this computed graph. Additionally, we let $N_j$ be the set containing all neighbors in RNG$(Q \cup V_j \cup N_{j-1})$ of all sites in $Q$. Note that $N_j$, the set of candidate neighbors, consists of $O(s)$ sites, as each site in $Q$ has degree at most six in the computed graph.
Therefore, after $n/s$ steps, we are left with at most six candidate neighbors for each site in $Q$. As mentioned above, for a pair $u \in Q$, $v \in V$, if $v$ is not among the candidate neighbors of $u$, then at some point in the construction there was a site witnessing that the lens of $u$ and $v$ is non-empty. Therefore, only the sites which are in the set of candidate neighbors can define edges of RNG$(V)$. However, not all candidate neighbors are necessarily neighbors in RNG$(V)$ of sites in $Q$.
To obtain the edges of RNG$(V)$ incident to the sites of $Q$, we take each site in $Q$ and its corresponding neighbors in $N_{n/s}$. Then, we go again through the entire set $V = V_1 \cup \dots \cup V_{n/s}$ in batches of size $s$: for each $u \in Q$, we test the at most six candidate neighbors in $N_{n/s}$ against all elements of the current batch to test the empty-lens property. After going through all sites, the candidates that maintained the empty-lens property throughout define the edges of RNG$(V)$ incident to the sites of $Q$. Since we use $O(s \log s)$ time per step, and since there are $n/s$ steps, the total running time is $O(n \log s)$ using $O(s)$ words of workspace.
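The two passes of this proof can be sketched in Python as follows. This is a sketch under simplifying assumptions: the local graphs RNG$(Q \cup V_j \cup N_{j-1})$ are computed by a brute-force empty-lens test rather than by an $O(s \log s)$ algorithm, sites are referred to by their index in the read-only array, and the names `lens_is_empty` and `batched_rng_neighbors` are hypothetical.

```python
from math import dist
import random


def lens_is_empty(points, u, v, others):
    """True iff no site w in `others` has both |uw| < |uv| and |vw| < |uv|."""
    d = dist(points[u], points[v])
    return all(
        max(dist(points[u], points[w]), dist(points[v], points[w])) >= d
        for w in others
        if w != u and w != v
    )


def batched_rng_neighbors(points, Q, s):
    """Neighbors in RNG(V) of each site in Q, reading V in batches of s sites."""
    n = len(points)
    batches = [range(start, min(start + s, n)) for start in range(0, n, s)]

    # Pass 1: filter candidate neighbors batch by batch (the sets N_j).
    candidates = {u: [] for u in Q}
    for batch in batches:
        carried = {v for u in Q for v in candidates[u]}  # N_{j-1}
        local = set(Q) | set(batch) | carried            # Q, V_j and N_{j-1}
        # Brute-force stand-in for computing the RNG of the local site set.
        candidates = {
            u: [v for v in local if v != u and lens_is_empty(points, u, v, local)]
            for u in Q
        }

    # Pass 2: verify every surviving candidate against all of V, again in batches.
    alive = {(u, v) for u in Q for v in candidates[u]}
    for batch in batches:
        alive = {(u, v) for (u, v) in alive if lens_is_empty(points, u, v, batch)}

    return {u: sorted(v for (x, v) in alive if x == u) for u in Q}


# Hypothetical input: 30 random sites, a query set Q of s = 5 sites.
rand = random.Random(1)
points = [(rand.random(), rand.random()) for _ in range(30)]
Q = [3, 7, 11, 19, 23]
result = batched_rng_neighbors(points, Q, s=5)
reference = {
    u: sorted(
        v for v in range(len(points))
        if v != u and lens_is_empty(points, u, v, range(len(points)))
    )
    for u in Q
}
```

The sketch never holds more than the sites of $Q$, one batch, and the carried candidates at a time, which mirrors the $O(s)$ bound in the lemma.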
Through repeated application of Lemma 3.1, we can enumerate the edges of RNG$(V)$ in order of increasing length.
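Such an enumeration has to get by with a buffer of $O(s)$ edges. A minimal Python sketch of one way to extract the $s$ shortest edges longer than a given threshold from a stream of edges is shown here. We assume that each edge arrives once as a hypothetical tuple `(length, u, v)` with pairwise distinct lengths, and `heapq.nsmallest` is used as a simple stand-in for a linear-time selection routine.

```python
import heapq


def next_batch(edge_stream, threshold, s):
    """Return the s shortest edges longer than `threshold`, in sorted order.

    `threshold` is the last edge of the previous batch, or None for the first
    batch. The buffer never holds more than 2s + 1 edges.
    """
    buffer = []
    for edge in edge_stream:
        if threshold is None or edge > threshold:
            buffer.append(edge)
            if len(buffer) > 2 * s:
                # Discard everything of rank larger than s
                # (stand-in for linear-time selection).
                buffer = heapq.nsmallest(s, buffer)
    return sorted(buffer)[:s]


# Hypothetical stream of ten edges given as (length, u, v).
lengths = [5.0, 1.5, 9.0, 3.2, 7.7, 2.1, 8.4, 4.9, 6.3, 0.8]
edges = [(length, i, i + 1) for i, length in enumerate(lengths)]
first = next_batch(edges, None, 3)
second = next_batch(edges, first[-1], 3)
```

Calling `next_batch` again with the last edge of the previous batch as the threshold yields the following batch, at the price of re-reading the whole stream.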
Lemma 3.2. Let $V$ be a set of $n$ sites in the plane, in general position. Let $s \in \{1, \dots, n\}$ be a parameter. Let $E_R = e_1, e_2, \dots, e_m$ be the sequence of edges in RNG$(V)$, by increasing length. Let $i \geq 1$. Given $e_{i-1}$ (or $\bot$, if $i = 1$), we can find the edges $e_i, \dots, e_{i+s-1}$ in $O(n^2 \log s/s)$ time using $O(s)$ words of workspace.

Proof. By applying Lemma 3.1 $O(n/s)$ times, we can generate all the edges of RNG$(V)$. Because we obtain the edges in batches of size $O(s)$, each taking $O(n \log s)$ time, the total time to compute all the edges amounts to $O(n^2 \log s/s)$. During this process, we find the edges $e_i, \dots, e_{i+s-1}$ of $E_R$. This can be done with a trick by Chan and Chen [12], similar to the procedure in the second algorithm in [7]. More precisely, whenever we produce new edges of RNG$(V)$, we store the edges that are longer than $e_{i-1}$ in an array $A$ of size $O(s)$. Whenever $A$ contains more than $2s$ elements, we use a linear-time selection procedure to remove all edges of rank larger than $s$ [14]. This needs $O(s)$ operations per step. We repeat this procedure for $O(n/s)$ steps, giving total time $O(n)$ for selecting the edges. In the end, we have $e_i, \dots, e_{i+s-1}$ in $A$, albeit not in sorted order. Thus, we sort the final $A$ in $O(s \log s)$ time. The running time is dominated by the time needed to compute the edges of RNG$(V)$, so the claim follows.

Lemma 3.2, together with the techniques from the original constant-workspace EMST algorithm by Asano et al. [6], already leads to a simple time-space trade-off for computing EMST$(V)$. Recall that we represent the edges of RNG$(V)$ as pairs of opposing half-edges, such that the face incident to a half-edge lies to its left. For $i \in \{1, \dots, m\}$, a face-cycle in RNG$_i$ is the circular sequence of half-edges that bounds a face in RNG$_i$. All half-edges in a face-cycle are oriented in the same direction, and RNG$_i$ can be represented as a collection of face-cycles; see Figure 2. Asano et al. [6] observe that to run Kruskal's algorithm on RNG$(V)$, it suffices to know the structure of the face-cycles; this is made precise in Observation 3.3.

Proof of Observation 3.3. Let $u$ and $v$ be the endpoints of $e_i$. If there is a face-cycle $C$ in RNG$_i$ that contains both $u$ and $v$, then $e_i$ clearly does not belong to EMST$(V)$. Conversely, suppose there is no face-cycle in RNG$_i$ containing both $u$ and $v$. Thus, any two face-cycles $C_u$ and $C_v$ such that $u$ lies on $C_u$ and $v$ lies on $C_v$ must be distinct. Since RNG$(V)$ is plane, $C_u$ and $C_v$ must belong to two different connected components of RNG$_i$, and $e_i$ is an edge of EMST$(V)$.

Observation 3.3 tells us that we can identify the edges of EMST$(V)$ if we can determine, for each $i \in \{1, \dots, m\}$, the face-cycles of RNG$_i$ that contain the endpoints of $e_i$. To accomplish this task, we use Lemma 3.4 to traverse the face-cycles.

Proof of Lemma 3.4. Let $w$ be the head of $\vec{f}$. By comparing the edges incident to $w$ with $e_i$, we identify the incident half-edges of $w$ in RNG$_i$, in $O(1)$ time. Then, among them we pick the half-edge $\vec{f}'$ which has the smallest clockwise angle with $\vec{f}$ around $w$ and has $w$ as its tail. This takes $O(1)$ time using $O(1)$ words of workspace.
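The traversal step of Lemma 3.4 can be sketched in Python as follows. The sketch assumes that the plane graph is given explicitly as a hypothetical adjacency dictionary `adj` together with site coordinates `points`, which the limited-workspace algorithm of course cannot afford; it only illustrates the angular rule. The helper `face_cycle` repeats the step until it returns to the starting half-edge.

```python
from math import atan2, tau


def next_half_edge(points, adj, half_edge):
    """Half-edge that follows (u, w) on its face-cycle (the face lies to the left)."""
    u, w = half_edge
    wx, wy = points[w]
    back = atan2(points[u][1] - wy, points[u][0] - wx)  # direction from w back to u

    def clockwise_turn(x):
        angle = atan2(points[x][1] - wy, points[x][0] - wx)
        turn = (back - angle) % tau
        # Turning back along (w, u) is allowed only as a last resort (full turn).
        return turn if turn > 0 else tau

    return (w, min(adj[w], key=clockwise_turn))


def face_cycle(points, adj, start):
    """All half-edges on the face-cycle that contains `start`."""
    cycle = [start]
    h = next_half_edge(points, adj, start)
    while h != start:
        cycle.append(h)
        h = next_half_edge(points, adj, h)
    return cycle


# Hypothetical plane graph: a square 0-1-2-3 with a pendant edge 1-4 outside it.
points = {0: (0, 0), 1: (2, 0), 2: (2, 2), 3: (0, 2), 4: (4, -1)}
adj = {0: [1, 3], 1: [0, 2, 4], 2: [1, 3], 3: [2, 0], 4: [1]}
inner = face_cycle(points, adj, (0, 1))
outer = face_cycle(points, adj, (1, 0))
```

In this example, the ten half-edges split into an inner face-cycle of four half-edges and an outer face-cycle of six half-edges, the latter running along both sides of the pendant edge.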
For $j \geq i \geq 1$, we define the predecessor and the successor of $e_j$ in RNG$_i$ with respect to each endpoint $w$ of $e_j$ as follows: the predecessor $\vec{p}_w$ of $e_j$ is the half-edge in RNG$_i$ which has $w$ as its head and is the first half-edge encountered in a counterclockwise sweep from $e_j$ around $w$. The successor $\vec{s}_w$ of $e_j$ is the half-edge in RNG$_i$ which has $w$ as its tail and is the first half-edge encountered in a clockwise sweep from $e_j$ around $w$; see Figure 3. If there is no edge incident to $w$ in RNG$_i$, we set $\vec{p}_w = \vec{s}_w = \bot$.
From our observations so far, we can already derive a simple time-space trade-off for computing EMST$(V)$.
Theorem 3.5. Let $V$ be a set of $n$ sites in the plane, in general position. Let $s \in \{1, \dots, n\}$ be a parameter. We can output all the edges of EMST$(V)$, in sorted order, in $O(n^3 \log s/s)$ time using $O(s)$ words of workspace.
Proof. We simulate Kruskal's algorithm on RNG$(V)$. For this, we take batches of $s$ edges, sorted by increasing length, and we report the edges of EMST$(V)$ in each batch. Let $E_R = e_1, \dots, e_m$ be the edges of RNG$(V)$, sorted by length. To determine whether an edge $e_i \in E_R$ is in EMST$(V)$, we apply Observation 3.3, i.e., we determine whether the endpoints of $e_i$ are on two distinct face-cycles of the corresponding RNG$_i$. To do this, we process $E_R$ in batches of $s$ edges, and for each edge, we perform a walk along a face-cycle. More precisely, we proceed as follows: first, we use Lemma 3.2 to find the next batch $e_i, \dots, e_{i+s-1}$ of $s$ edges in $E_R$, in $O(n^2 \log s/s)$ time. For each such edge $e_j$, we pick an endpoint $u_j \in V$. Using Lemma 3.1, we find for each $u_j$ first the incident edges in RNG$(V)$, and then the incident edges in RNG$_j$ (by comparing the edges from RNG$(V)$ with $e_j$). Then, we identify the successor of each $e_j$ in RNG$_j$ (if it exists), and we perform $s$ parallel walks, where walk $j$ takes place in RNG$_j$. In each step, we have $s$ current half-edges, and we use Lemma 3.1 and Lemma 3.4 to advance each half-edge along its face-cycle. Each such step takes $O(n \log s)$ operations. A walk $j$ continues until we either encounter the other endpoint of $e_j$ or until we arrive at the predecessor of $e_j$ in RNG$_j$. In the latter case, $e_j$ is in EMST$(V)$, and we report it. In the former case, $e_j$ is not in EMST$(V)$. Since there are $O(n)$ half-edges in RNG$(V)$, it takes $O(n)$ steps to conclude all the walks. It follows that we can process a single batch of edges in $O(n^2 \log s)$ time. We have $O(n/s)$ many batches, so the total running time of the algorithm is $O(n^3 \log s/s)$, using $O(s)$ words of workspace.

Theorem 3.5 is clearly not optimal: for the case of linear space $s = n$, we get a running time of $O(n^2 \log n)$, although we know that it should take $O(n \log n)$ time to find EMST$(V)$. Can we do better? The bottleneck in Theorem 3.5 is the time needed to perform the walks in the partial relative neighborhood graphs RNG$_j$. In particular, such a walk might take up to $\Omega(n)$ steps, leading to a running time of $\Omega(n^2 \log s)$ for processing a single batch. To avoid this, we will maintain a compressed representation of the partial relative neighborhood graphs that allows us to reduce the number of steps in each walk to $O(n/s)$.
Let $i \in \{1, \dots, m\}$. An $s$-net $N$ for RNG$_i$ is a collection of half-edges, called net-edges, in RNG$_i$ that has the following two properties: (i) each face-cycle in RNG$_i$ with at least $n/s + 1$ half-edges contains at least one net-edge; and (ii) for any net-edge $\vec{e} \in N$, let $C$ be the face-cycle of RNG$_i$ with $\vec{e}$. Then, between the head of $\vec{e}$ and the tail of the next net-edge on $C$, there are at least $n/s$ and at most $2n/s$ other half-edges on $C$. Note that the next net-edge on $C$ after $\vec{e}$ could possibly be $\vec{e}$ itself. In particular, this implies that face-cycles with fewer than $n/s$ edges contain no net-edge. Observation 3.6 records two important properties of $s$-nets.

Proof of Observation 3.6. Property (ii) implies that only face-cycles of RNG$_i$ with at least $n/s + 1$ half-edges contain net-edges. Furthermore, on these face-cycles, we can uniquely charge $\Theta(n/s)$ half-edges to each net-edge, again by (ii). Thus, since there are $O(n)$ half-edges in total, we have the first statement $|N| = O(s)$.
For the second statement, we first note that if $C$ contains fewer than $2n/s$ half-edges, the claim holds trivially. Otherwise, $C$ contains at least one net-edge, by property (i). Now, property (ii) shows that we reach a net-edge in at most $2n/s$ steps from $\vec{f}$.

By Observation 3.6, we can store an $s$-net in $O(s)$ words of workspace. This makes the concept of an $s$-net useful in our time-space trade-off. Now, we can use the $s$-net in order to speed up the processing of a single batch. The next lemma shows how this is done.

Lemma 3.7. Let $i \in \{1, \dots, m\}$, and let $E_{i,s} = e_i, \dots, e_{i+s-1}$ be a batch of $s$ edges from $E_R$. Suppose we have an $s$-net $N$ for RNG$_i$ in our workspace. Then, we can determine which edges from $E_{i,s}$ belong to EMST$(V)$, using $O(n^2 \log s/s)$ time and $O(s)$ words of workspace.
Proof. Let $F$ be the set of half-edges that contains all net-edges from $N$, as well as, for each batch-edge $e_j \in E_{i,s}$, the two successors of $e_j$ in RNG$_i$, one for each endpoint of $e_j$. By definition, we have $|F| = O(s)$, and it takes $O(n \log s)$ time to compute $F$, using Lemma 3.1. Now, we perform parallel walks through the face-cycles of RNG$_i$, using Lemma 3.1 and Lemma 3.4. We have one walk for each half-edge in $F$, and each walk proceeds until it encounters the tail of a half-edge from $F$ (including the starting half-edge itself). By Lemma 3.1 and Lemma 3.4, in each step of these parallel walks we need $O(n \log s)$ time to find the next edge on each face-cycle, and then we need $O(s \log s)$ time to check whether these new edges are in $F$. Because $F$ contains the net-edges of $N$, by property (N2), each walk finishes after $O(n/s)$ steps, and thus the total time for this procedure is $O(n^2 \log s/s)$.
Next, we build an auxiliary undirected graph $H$, as follows: the vertices of $H$ are the endpoints of the half-edges in $F$. Furthermore, $H$ contains undirected edges for all the half-edges in $F$ and additional compressed edges that represent the outcomes of the walks: if a walk started from the head $u$ of a half-edge in $F$ and ended at the tail $v$ of a half-edge in $F$, we add an edge from $u$ to $v$ in $H$, and we label it with the number of steps that were needed for the walk. Thus, $H$ contains $F$-edges and compressed edges; see Figure 4. Clearly, after all the walks have been performed, we can construct $H$ in $O(s)$ time, using $O(s)$ words of workspace.
Next, we use Kruskal's algorithm to insert the batch-edges of $E_{i,s}$ into $H$. This is done as follows: we determine the connected components of $H$, in $O(s)$ time using depth-first search. Then, we insert the batch-edges into $H$, one after another, in sorted order. As we do this, we keep track of how the connected components of $H$ change, using a union-find data structure [14]. Whenever a new batch-edge connects two different connected components, we output it as an edge of EMST$(V)$. Otherwise, we do nothing. Note that even though $H$ may have a lot more components than RNG$_i$, the algorithm is still correct, by Observation 3.3. This execution of Kruskal's algorithm, together with updating the structure of connected components of $H$, takes $O(s \log s)$ time, which is dominated by the running time of $O(n^2 \log s/s)$ from the first phase of the algorithm.
Finally, we need to explain how to maintain the $s$-net during the algorithm. The following lemma shows how, for each $i \in \{1, \dots, m\}$, we can compute an $s$-net for RNG$_{i+s}$, provided that we have an $s$-net for RNG$_i$ and the graph $H$ described in the proof of Lemma 3.7.

Lemma 3.8. Let $i \in \{1, \dots, m\}$, and suppose we have the graph $H$ derived from RNG$_i$ as above, such that all batch-edges have been inserted into $H$. Then, we can compute an $s$-net $N$ for RNG$_{i+s}$ in time $O(n^2 \log s/s)$, using $O(s)$ words of workspace.
Proof. By construction, all big face-cycles of RNG$_{i+s}$, which are the face-cycles with at least $n/s + 1$ half-edges, appear as faces in $H$. Thus, by walking along all faces in $H$, and taking into account the labels of the compressed edges, we can determine these big face-cycles in $O(s)$ time. The big face-cycles are represented through sequences of $F$-edges, compressed edges, and batch-edges. For each such sequence, we determine the positions of the half-edges for the new $s$-net $N$, by spreading the half-edges equally at distance $n/s$ along the sequence, again taking the labels of the compressed edges into account. Since the compressed edges have length $O(n/s)$, we create at most $O(1)$ new net-edges for each of them. Now that we have determined the positions of the new net-edges on the face-cycles of RNG$_{i+s}$, we perform $O(s)$ parallel walks in RNG$_{i+s}$ to actually find them. Using Lemma 3.4, this takes $O(n^2 \log s/s)$ time.
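The spreading step can be illustrated on a single face-cycle by the following Python sketch. It only deals with the arithmetic of choosing positions and ignores compressed edges and the subsequent walks. The function names, the integer parameter `gap` (standing for $n/s$), and the rule of placing $\lfloor L/(\mathit{gap}+1) \rfloor$ net-edges on a cycle of $L$ half-edges are our own assumptions for this illustration.

```python
def place_net_edges(cycle_length, gap):
    """Indices along a face-cycle at which to put net-edges.

    Cycles with fewer than gap + 1 half-edges get no net-edge. Otherwise the
    net-edges are spread as evenly as possible, so that the number of other
    half-edges between consecutive net-edges lies between gap and 2 * gap.
    """
    if cycle_length < gap + 1:
        return []
    k = cycle_length // (gap + 1)  # number of net-edges on this cycle
    return [(j * cycle_length) // k for j in range(k)]


def gaps_between(positions, cycle_length):
    """Number of other half-edges between each net-edge and the next one."""
    k = len(positions)
    return [
        (positions[(j + 1) % k] - positions[j] - 1) % cycle_length
        for j in range(k)
    ]


# Hypothetical example: a face-cycle with 23 half-edges and gap n/s = 4.
positions = place_net_edges(23, 4)
gaps = gaps_between(positions, 23)
```

With a single net-edge on a cycle, the "next" net-edge is that net-edge itself, which `gaps_between` handles through the modular index.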
We now have all the ingredients for our main result, which provides a smooth trade-off between the cubic-time algorithm in constant workspace and the classical $O(n \log n)$ time algorithm with $O(n)$ words of workspace.
Theorem 3.9. Let $V$ be a set of $n$ sites in the plane, in general position. Let $s \in \{1, \dots, n\}$ be a parameter. We can output all the edges of EMST$(V)$, in sorted order, in $O(n^3 \log s/s^2)$ time using $O(s)$ words of workspace.
Proof. This follows immediately from Lemma 3.2, Lemma 3.7, and Lemma 3.8, because we need to process $O(n/s)$ batches of edges from $E_R$.
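Spelled out, the bound is the product of the number of batches and the cost per batch stated in the lemmas above:

$$
\underbrace{O\!\left(\frac{n}{s}\right)}_{\text{batches}} \cdot \underbrace{O\!\left(\frac{n^2 \log s}{s}\right)}_{\text{time per batch}} = O\!\left(\frac{n^3 \log s}{s^2}\right).
$$

For $s = n$, the right-hand side becomes $O(n \log n)$, and for constant $s$ it becomes $O(n^3)$, which matches the two endpoints of the trade-off mentioned before Theorem 3.9.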
For our algorithm, it suffices to update the $s$-net every time that a new batch is considered. It is, however, possible to maintain the $s$-net and the auxiliary graph $H$ through insertions of single edges. This allows us to handle graphs constructed incrementally and to maintain their compact representation using $O(s)$ words of workspace. We believe this is of independent interest and can be used by other algorithms for planar graphs in the limited-workspace model.

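Observation 3.3. Let $i \in \{1, \dots, m\}$. The edge $e_i \in E_R$ belongs to EMST$(V)$ if and only if there is no face-cycle $C$ in RNG$_i$ such that both endpoints of $e_i$ lie on $C$.

Figure 2: A schematic drawing of RNG$_i$ is shown in black. The face-cycles of this graph are shown in gray. All the half-edges of a face-cycle are directed according to the arrows.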

Lemma 3.4. Let $i \in \{1, \dots, m\}$. Suppose we are given $e_i \in E_R$ and a half-edge $\vec{f} \in$ RNG$_i$, as well as the at most six edges incident to the head of $\vec{f}$ in RNG$(V)$. Let $C$ be the face-cycle of RNG$_i$ that $\vec{f}$ lies on. We can find the half-edge $\vec{f}'$ that comes after $\vec{f}$ on $C$, in $O(1)$ time using $O(1)$ words of workspace.

Figure 3: A schematic drawing of RNG$_i$ is shown in black. Each endpoint $w \in \{u, v\}$ of $e_j$ identifies the half-edges $\vec{p}_w$ and $\vec{s}_w$ as the predecessor and the successor of $e_j$. They are shown in green and blue, respectively.

Observation 3.6. Let $i \in \{1, \dots, m\}$, and let $N$ be an $s$-net for RNG$_i$. Then, (N1) $N$ has $O(s)$ half-edges; and (N2) let $\vec{f}$ be a half-edge of RNG$_i$, and let $C$ be the face-cycle that contains it. Then, it takes at most $2n/s$ steps along $C$ from the head of $\vec{f}$ until we either reach a net-edge or the tail of $\vec{f}$.

Figure 4: (a) A schematic drawing of RNG$_i$ is shown in gray. The half-edges of $N$ are in black and the edges of the next batch $E_{i,s}$ are dashed red segments. (b) The auxiliary graph $H$ including the batch-edges (in red). The graph $H$ contains the net-edges (in black), and the successors of batch-edges and the compressed edges (which are combined in green paths in this picture).