# 4-Local Search gives a 4-approximation algorithm to the Geometric Hitting Set Problem

Imagine you are the head of the education department of a metropole like São Paulo, the biggest city in Brazil. The mayor commended a survey to evaluate how well the schools were distributed around the town. Not that well, sadly. There exist some empty terrains and the mayor wants to use this space to build new schools, but he is not sure of how many to build neither where the best locations are. Your task is to assign the construction sites of the new schools.

The schools should be positioned following two criteria:

1. Every student must spend at most 30 minutes from home to the school.
2. Every school should be close enough from security safe facilities, in the case of an emergency.

Based on these, your team created a map covering the city with disks of different radii. Each disk is centered at a security safe facility, and a person can move from any two points inside each disk in less than 30 minutes. Your task is now reduced to find the minimum number of construction sites such that each disk has at least one school on it. This problem is also known as the minimum hit-set problem. The mayor was impressed with your solution and started to say your name for the press.

It has been a while since your idea became public, but no solution has appeared yet. In fact, you discovered the problem is impossible to be solved in practical time (it is NP-complete). Even worst, you also discovered that it is hard to approximate (it is W[1]-hard). The clock is ticking and you should come up with a solution that runs in a reasonable amount of time (running-time analysis) and that assures you are not far away from the optimal solution (quality analysis).

In this post, we are going to focus on the quality analysis of the local search algorithm for solving the minimum hitting-set for disks in the plane.

## Local search

General Minimization Problem: From a set $U$, find the subset $O \subseteq U$ that satisfies a property $P$, i.e., $P(O)=\text{True}$, and also minimizes some function $f:2^{U} \rightarrow \mathcal{R}$. In other words:

$\displaystyle O = arg\min_{P(X)=\text{True}} f(X).$

Local Search: It is a heuristic that can be used to approximate the solution of difficult optimization problems. Below is a general description of a $k$-local search algorithm for the general minimization problem with $f(X) = |X|$.

Input: $U$.
Output: $S \in U$.

1. Let $S=U$.
2. For each $k$-subset $S^{-}$ of  $S$.
1. For each $(k-1)$-subset $S^{+}$ of $U$.
1. Set $S^\prime = S\setminus S^{-} \cup S^{+}$.
2. If $P(S^\prime)==\text{True}$.
1. Set $S = S^\prime$.
2. Repeat.

This naive $k$-local search algorithm has complexity:

$\displaystyle \binom{n}{k} \cdot \binom{n}{k+1} \cdot n \in O(n^{2k}).$

Quite bad, and we don’t even know how close from the optimal solution it would be the solution returned by the local search. Nonetheless, it is generally the case that one can exploit characteristics of the problem in order to improve its running time. In particular, there exists a 3-local search algorithm for the hitting-set problem with running-time complexity of $\tilde{O}(n^{7/3})$ [1]. No further comment on running-time will follow, and we are going to focus on the quality analysis from now on.

Besides running-time analysis, approximation algorithms must also be analyzed for its quality, which means to state how far is the solution returned by the algorithm from the optimal solution. For example, a 2-approximation algorithm for a minimization problem is guaranteed to return a solution that is at most twice the value of the optimal solution:

$\displaystyle F(S) \leq 2 \cdot F(O).$

## Hitting-Set for disks in the plane

The input of the hitting-set problem are the sets $U=\mathcal{R}^2$ and a collection of subsets $\mathcal{D} \ni D \subseteq U$. The function $f(S)$ states if the chosen subset $S \subseteq U$ hits every set of the collection $\mathcal{D}$, i.e., every set $D \in \mathcal{D}$ has at least one element of $S$ on it.

We call the pair $\{U,\mathcal{D}\}$ a range space. Take any two disjoint subsets $R,B \subset U$. Consider a graph $G$ on vertices $R$ and $B$ such that for every $D \in \mathcal{D}$ in which $D \cap R \neq \emptyset$ and $D \cap B \neq \emptyset$ there exists at least one edge $\{r,b\}$ with $r \in R$ and $b \in B$. If graph $G(B,R,E)$ is bipartite and planar, the range space $\{U,\mathcal{D}\}$ is said to satisfy the locality condition.

There are many range spaces that satisfy the locality condition. The range spaces formed by disks in the plane is one of them.

Claim 1: For a collection of disks $\mathcal{D}$ in the plane and $R,B \subset \mathcal{R}^2$, it is possible to construct a bipartite planar graph $G(B,R,E)$ in which the locality condition is true.

Proof: We are going to need some quick facts about Delaunay Triangulation. It will be useful to take a look at some material on Delaunay Triangulations and Voronoi Diagrams if you are not familiar with them before proceeding the reading.

Let $V = B \cup R$ and consider the Delaunay Triangulation $DT$ of $V$. Recall that the circumcircle of every triangle of DT has no other point of $V$ (empty circle property). I am going to use this fact to show that if two points $r, b \in V$ forms an edge of DT then there exists a circle passing through $r$ and $b$ with no other point of $V$ in its interior.

Consider any two points $r,b \in V$ and trace the median line of $r$ and $b$ as shown in the figure.

From all the circles passing through $r$ and $b$, the one centered at $C$ is the smallest one. In particular, this circle is the one in which the diameter has the same length of the segment $rb$.

Suppose the edge $\{r,b\} \in DT$. Moreover, assume that for every circle with $r$ and $b$ in its boundary there exists at least one other points of $V$ in its interior. Notice this is only possible if there exist points $A,B \in V$ in the interior of the circle centered at $C$, implying the following partial triangulation:

But this triangulation does not respect the empty circle property of the Delaunay Triangulation, and the edge $\{r,b\}$ cannot be part of DT. That is a contradiction.

It is easy to see that any collection of disks $\mathcal{D}$ in the plane induces a connected graph on a Delaunay Triangulation. Assume there exists a collection of disks in which the induced graph is not connected. Take the smallest disk $D \in \mathcal{D}$ such that the induced graph is not connected and shrink the circle until there are two vertices $r,b$ in the boundary of the circle. If there is no other point of $U$ in the interior of the circle, the edge $\{r,b\}$ belongs to the Delaunay Triangulation and the graph is indeed connected. Otherwise, there exists a point $w \in V$ in the interior of the circle. There are two possibilities: Or vertices $r,w$ are in different components or vertices $b,w$ are in different components. In either way, it is possible to shrink even more the circle in order to get another one with two points on its boundary and that also induces a disconnected graph. But this contradicts the first circle being the smallest one with the required property.

By the last result, any collection of disks induces a connected graph in a Delaunay Triangulation, implying that for every disk $D$ in which $D \cap R \neq \emptyset$ and $D \cap B \neq \emptyset$ there exists at least one edge $\{r,b\}$ such that $r \in R$ and $b \in B$. To get a bipartite graph, we simply have to remove all the edges between pairs of vertices from $R$ and between pairs of points from $B \quad \square$.

The key idea in the quality analysis is to use the optimal set $O$ and the set $S$ returned by the $k$-local search in order to construct the bipartite planar graph $G(S,O,E)$. The reader might be wondering why we assume sets $O$ and $S$ are disjoint. Indeed, they are not. But we are going to argue that this won’t matter for the analysis.

Assume, for one moment, that sets $O$ and $S$ are disjoint. Let $I = S \cap O$ and define the following sets:

$S^\prime = S \setminus I, \qquad O^\prime = O \setminus I.$

The sets $S^\prime,O^\prime$ are disjoint. Further, assume there exists a constant $C$ such that

$|S^\prime| \leq C \cdot|O^\prime|.\qquad \qquad (1)$

Notice that $C \geq 1$, as $O$ is the optimal solution. But this implies

$\begin{array}{rl} |S^\prime| + |I| &\leq C \cdot |O^\prime| + C\cdot |I|\\[0.2in] |S| &\leq |O|. \end{array}$

Therefore, if we find the constant $C$ in which inequality (1) holds, we are able to guarantee that the $k$-local search solution is not further than $C$ times the optimal solution.

## Quality Analysis

When the $k$-local search finishes with solution $S$, there exists no $k$-subset $S^- \subset S$  and $(k-1)$-subset $S^+ \subset U$ such that $(S \setminus S^-\cup S^+)$ satisfies property $P$. There exists a bipartite planar graph $G(S,O,E)$ by claim 1 (here, $O$ is the optimal set). Further, as a consequence of $k$-local search, this graph has the $k$-locality property:

For every subset $S^\prime \subset S$ such that $|S^\prime| \leq k$, $|NG(S^\prime)| \geq |S^\prime|,$

where $NG(S^\prime) \subset O$ are the set of neighbors of the set $S^\prime$.

Theorem 1:  Let $G(B,R,E)$ be a bipartite planar graph such that for every subset $B^\prime \subset B$ of size $|B^\prime| \leq 4$ we have $|NG(B^\prime)| \geq |B^\prime|$. Then, $|B| \leq 4|R|$.

The proof is a little bit long, and just a sketch will be presented here. The full proof can be checked in the references.

From Euller’s formula for planar graphs:

$\begin{array}{rl} |E| &\leq 2|V|\\[0.2in] \displaystyle 2\cdot |B_{=2}| + 3 \cdot |B_{=3}| + 4 \cdot |B_{\geq4}| \leq \sum_{i}{i|B_{=i}|} &\leq \displaystyle 2 \cdot \left( |R| + |B| \right), \end{array}$

where $B_{=i}$ denotes the set of vertices of $B$ with degree $i$ and $B_{\geq i}$ the set of vertices of $B$ with degree greater then $i$. Then,

$\displaystyle |B_{\geq 4}| \leq |R| - \frac{|B_{=3}|}{2}.$

Therefore, in order to prove $|B| \leq 4|R|$, it is enough to show that

$\begin{array}{rl} |B| &\leq 4|R|\\[0.2in] \displaystyle |B_{=2}| + |B_{=3}| + |B_{\geq4}| \leq |B_{=2}| + \frac{|B_{=3}|}{2} + |R| &\leq 4|R|\\[0.2in] \displaystyle |B_{=2}| + \frac{|B_{=3}|}{2} & \displaystyle \leq 3|R|. \end{array}$

In order to show the latter inequality, construct the auxiliar graph $H(R,\hat{E})$:

$\displaystyle \hat{E} = \left\{ \: \{r_1,r_2\} \: \vert \: \exists b \in B \text{ such that } \{b,r_1\}, \{b,r_2\} \in E \right\}$

Assume the graph $H$ is $2$-connected. The proof is also valid for the case in which is not, but it is a technicality that makes the proof harder to follow (more details in the references). Consider any embedding of the graph $H$ in the plane. Vertices of $B_{=2}$ are identified as edges in $H$ and vertices of $B_{=3}$ are drawn in the interior of faces of $H$. Moreover, edges in $H$ can be single or double. An edge $\{r_1,r_2\} \in \hat{E}$ is a double edge if there exist vertices $b_1,b_2 \in B$ such that $\{b_1,r_1\},\{b_1,r_2\},\{b_2,r_1\},\{b_2,r_2\} \in E$. Notice that it is not possible to have a triple edge, since that would mean graph $G$ would not have the $4$-locality-property.

Some notation will be helpful. Let $F_i$ to denote a face of $H$ with $i$ edges in its boundary; a face with maximum number of double edges in its boundary is called a full face; and naturally define $B_{=3}^f (E_1^f,E_2^f)$ as the set of vertices of $B$ of degree 3 (single and double edges) in the interior (boundary) of face $f$.

It is easy to see that the following claims are true.

Claim 2: For $i\geq 4$, let $f \in F_i$. Then $|E_2^f| \leq \frac{i}{2}$. Further, a triangular face has no double edge.

Claim 3: For $i \geq 4$, let $f \in F_i$. Then $|B_{=3}^f|\leq (i-2)$.

Claim 4: For $i \geq 4$, let $f \in F_i$ be a full face. If $i$ is even, then $|B_{=3}^f | \leq (i-4)$. If $i$ is odd then $|B_{=3}^f| \leq (i-3)$.

The key idea of the proof is to define a weight function $w$ for each face $f$ in the graph $H$.

$\displaystyle w(f) = \frac{|E_{1}^f|}{2} +|E_{2}^f| + \frac{|B_{=3}^f|}{2}$.

Extending this definition to the graph $H$

$\begin{array}{rl} \displaystyle w(H) = \sum_{f \in F} w(f) &= \displaystyle |E_1| + 2|E_2| + \frac{|B_{=3}^f|}{2}\\[0.2in] &= \displaystyle |B_{=2}| + \frac{|B_{=3}^f|}{2}. \end{array}$

By using claims 2-4, it is possible to show that (details in [3])

$w(H) \leq 3|R|.$

## Concluding Remarks

In fact, the quality analysis just shown can be used for every problem in which it is possible to construct the locality graph $G(B,R,E)$ from any two disjoint subsets $B,R \subset U$. In particular, the 4-local search heuristic also gives a 4-approximation algorithm for the following problems:

• Maximum independent set in the intersection graph of disks in the plane.
• Terrain guarding problem.
• Minimum dominating set in disk intersection graphs.
• Minimum dominating set in disk intersection graphs.
• Minimum set-cover for disks in the plane.