# Conjugated Directions Methods – Part I

Let’s say we have the following unrestricted quadratic problem to solve:

$\displaystyle \min_{x}{x^TQx - b^Tx} \quad \quad (P)$

We further assume that $Q \in \mathbb{R}^{n+1}$ is a symmetric positive definite matrix.

Let’s say that we have an iterative algorithm to find the solution $x^\star$ of problem $(P)$. The algorithm works by selecting a “good” direction $d_k$ at iteration $k$ and then walk some amount $\alpha _k$ in that direction.

$x_{k+1} = x_k + \alpha _k d_k.$

Moreover, let’s say that we know how to compute $\alpha _k$ in order to minimize the problem:

$\displaystyle f(x_{k+1}) = \min_{\alpha _k}{f(x_k + \alpha _k d_k)}.$

A necessary condition for the latter minimization problem is to have:

$f^\prime(x_k + \alpha_k d_k)d_k = 0 \Leftrightarrow f^\prime (x_{k+1})d_k = 0.$

The question that it poses now is: How to choose the next direction $d_{k+1}$? Let’s be audacious and assume that somehow we can take the best direction possible, i.e., the direction that leads us to the optimal solution:

$d_{k+1} = x^\star - x_{k+1}.$

Obviously we don’t know the optimal solution and we need to find some condition on $d_{k+1}$ that doesn’t depend on $x^\star$. Fortunately such condition exists.

$f^\prime(x^\star) = 0 \Rightarrow f^\prime(x^\star)d_k = 0.$

Hence,

$\begin{array}{ll} & f^\prime (x_{k+1})d_k - f^\prime(x^\star)d_k = 0 \\[0.15in] \Leftrightarrow & \langle Qx_{k+1} - b, d_k \rangle - \langle Qx^\star - b, d_k \rangle = 0 \\[0.15in] \Leftrightarrow & \langle Q(x^\star - x_{k+1}),d_k \rangle = 0 \\[0.15in] \Leftrightarrow & \langle Qd_{k+1},d_k \rangle = 0. \end{array}$

Hum… So we pick the next direction such that $d_{k+1}^TQd_k = 0.$ Note that doesn’t mean $d_{k+1} =x^\star - d_k$, but it seems a good criteria to use in order to chose where to go next, as the best direction respects the same condition. One can show the following proposition:

Proposition: Let $\{d_k\}$ to be a sequence of directions such that  $d_{k+1}^TQd_k = 0$. Then $d_i^TQd_j = 0 \: \forall i \neq j$.

A set $\{d_k\}$ in which $d_i^TQd_j = 0 \: \forall i \neq j$ is said to be $Q$conjugated (It is a generalization of the notion of orthogonality. One can recover usual orthogonality between vectors by setting $Q = I$). At this point, we conclude that our mock-algorithm works with $Q$-conjugated directions. Let’s check on another property of $Q$-conjugated directions.

Proposition: Let $Q \in \mathbb{R}^{n+1}$ to be symmetric definite positive and $\{d_k\}$ to be a set of at most $n+1 Q$-conjugated directions. Then $\{d_k\}$ are linear independent.

Proof. Set $|\{d_k\}| = m <= n+1$ and suppose $\{d_k\}$ are not LI. Then there exists $\alpha_1, \alpha_2 \cdots \alpha_m$, not all zero, such that

$\alpha_1 d_1 + \cdots + \alpha_k d_k = 0$.

Choose $i<=m$ and multiply the last expression by $d_i^TQ$. Then,

$\alpha_i d_i^TQd_i = 0$.

But we know $d_i^TQd_i > 0$ as $Q$ is definite positive. The last expression is valid for all $i$, and then we conclude that $\alpha_i = 0 \; \forall i$, which is a contradiction $\square$

We can apply the trick of  multiplying by $d_i^TQ$ one more time. Since there are at most $n+1$ $Q$-conjugated directions, one can write the optimal solution $x^\star$ as:

$x^\star = \alpha_0 d_0 + \alpha_1 d_1 + \cdots + \alpha_n d_n.$

Multiplying by $d_i^TQ$ in both sides for some $i$:

$d_i^TQx^\star = \alpha_i d_i^TQd_i$.

Thus, it is easy to see that:

$\displaystyle \alpha_i = \frac{d_i^Tb}{d_i^TQd_i}.$

Hence, the optimal solution can be computed using our mock-algorithm if one has the $Q$-conjugated directions. The question now is how to get $Q$-conjugated directions? The answer comes in a future post.

Further Reference

1. David G. Luenberger and Yinyu Ye. 2015. Linear and Nonlinear Programming. Springer Publishing Company, Incorporated [Chapter 9].
2. Alexey Izmailov and Mikhail Solodov. 2012. Otimização Volume II – Métodos Computacionais. IMPA [Section 3.4].