11 Secant
Our next method, the secant method, is a kind of compromise between Newton's method and the bisection method. There are two perspectives on it.
11.1 Two Perspectives
11.1.1 Newton
The derivative is a fairly delicate quantity, and it might be hard to calculate; in practice, we would probably just estimate it with a secant line over a small interval. This is where the secant method gets its name. Instead of using a tangent line through the previous estimate to find the next, we draw a secant line through the previous two estimates and find its root.
So, from two initial guesses \(x_{-1}\) and \(x_0\), we inductively define a sequence \((x_n)_{n\in\mathbb N}\) as follows. Form the secant line \[L_{n+1}(x) = f(x_n) + \frac{f(x_n) - f(x_{n-1})}{x_n-x_{n-1}} (x-x_n),\] then let \(x_{n+1}\) be its root. After some algebra, we obtain \[x_{n+1} = x_n - \frac{x_n-x_{n-1}}{f(x_n) - f(x_{n-1})}f(x_n).\] An alternative, more symmetric form is \[x_{n+1} = \frac{x_{n-1}f(x_n)-x_nf(x_{n-1})}{f(x_n) - f(x_{n-1})}.\]
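As a concrete reference, here is a minimal sketch of the iteration in Python; the function name, tolerance, and iteration cap are illustrative choices rather than part of the text.

```python
def secant(f, x_prev, x_curr, tol=1e-12, max_iter=100):
    """Approximate a root of f from two initial guesses via the secant iteration."""
    for _ in range(max_iter):
        f_prev, f_curr = f(x_prev), f(x_curr)
        if f_curr == f_prev:
            raise ZeroDivisionError("secant line is horizontal; cannot continue")
        # x_{n+1} = x_n - (x_n - x_{n-1}) / (f(x_n) - f(x_{n-1})) * f(x_n)
        x_next = x_curr - (x_curr - x_prev) / (f_curr - f_prev) * f_curr
        if abs(x_next - x_curr) < tol:
            return x_next
        x_prev, x_curr = x_curr, x_next
    return x_curr

# Example: the positive root of x^2 - 2, from guesses x_{-1} = 1 and x_0 = 2.
print(secant(lambda x: x * x - 2, 1.0, 2.0))  # about 1.4142135623730951
```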
Since the secant line is generally a less accurate approximation of \(f\) than the actual best linear approximation (the BLA), we expect greater error from the secant method; it will probably converge more slowly than Newton's method.
11.1.2 Bisection
What if we wanted to improve the bisection method? At each step, the bisection method produces a midpoint. However, if the value of \(f\) at one endpoint is very small, it stands to reason that a root might be closer to that side. Rather than a plain average of \(a\) and \(b\), a weighted average that lands closer to the endpoint where \(|f|\) is smaller could produce a better guess.
If \(f(a) < 0 < f(b)\), that would be the following: \[c = SM(a,b) = \frac{f(b)}{f(b) - f(a)}\,a - \frac{f(a)}{f(b) - f(a)}\,b.\] (Equivalently, in terms of absolute values, \(c = \frac{|f(b)|\,a + |f(a)|\,b}{|f(a)| + |f(b)|}\).)
With this as the midpoint function in the bisection method, we might get faster convergence. However, the size of the interval no longer decreases in a simple way, so we might lose the guarantee of convergence that the bisection method had.
To be precise, the modified secant/bisection method starts from endpoints \(a_0,b_0\) such that \(f(a_0)\) and \(f(b_0)\) have different signs. Then we construct sequences \(a_k,b_k,c_k\) where \(c_k = SM(a_k,b_k)\). If \(c_k\) is a root, we can stop; otherwise \(f(c_k)\) differs in sign from exactly one of \(f(a_k)\) and \(f(b_k)\). If the former, \(a_{k+1} = a_k\) and \(b_{k+1} = c_k\), and the reverse for the latter. By construction, the sequences are monotonic and bounded, hence converge. However, we lose the bisection guarantee that they converge to the same value (see the exercises below).
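Here is a sketch of this procedure in Python, using the weighted midpoint \(SM\) from above (the scheme is classically known as regula falsi, or false position); the names and stopping tolerance are illustrative choices.

```python
def weighted_midpoint(f, a, b):
    """SM(a, b): the root of the secant line through (a, f(a)) and (b, f(b))."""
    return (a * f(b) - b * f(a)) / (f(b) - f(a))

def modified_bisection(f, a, b, tol=1e-12, max_iter=100):
    """Bisection with SM(a, b) in place of the plain midpoint (a + b) / 2.

    Requires f(a) and f(b) to have different signs.
    """
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have different signs")
    for _ in range(max_iter):
        c = weighted_midpoint(f, a, b)
        if abs(f(c)) < tol:   # covers the exact-root case f(c) == 0
            return c
        if f(a) * f(c) < 0:   # f(c) differs in sign from f(a): keep [a, c]
            b = c
        else:                 # f(c) differs in sign from f(b): keep [c, b]
            a = c
    return weighted_midpoint(f, a, b)

print(modified_bisection(lambda x: x * x - 2, 0.0, 2.0))  # about 1.41421356...
```

Note the stopping test: since the bracket width \(b - a\) need not shrink to zero, the sketch stops when \(|f(c)|\) is small rather than when the interval is.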
Notice that the weighted average for the alternate bisection method is precisely the root of the line through \((a,f(a))\) and \((b,f(b))\), which is just a secant line. Alternatively, compare with the symmetric iteration from the Newton's-method point of view; the two formulas differ only by a renaming of variables.
11.2 Convergence and Error
In spite of their similarities, the two methods lead to different sequences, because the modified bisection method maintains endpoint sequences \(a_k,b_k\) for which \(f(a_k)\) and \(f(b_k)\) always have different signs. The two methods produce the same iterates only if the secant sequence happens to alternate sides of the root, which is unlikely.
Our analysis will focus on the (simpler) Newton version.
Proposition 11.1 Suppose \(f\) is continuously differentiable. If a secant method sequence converges to some \(x\) such that \(f'(x)\neq 0\), then \(x\) is a root of \(f\).
Proof. Note that \(\frac{x_n-x_{n-1}}{f(x_n) - f(x_{n-1})}\) is the reciprocal of the difference quotient appearing in the definition of the derivative; by the mean value theorem it equals \(\frac 1{f'(\xi_n)}\) for some \(\xi_n\) between \(x_{n-1}\) and \(x_n\), and \(f'(\xi_n) \to f'(x) \neq 0\) by continuity. So taking the limit of both sides of the iteration furnishes \[x = x - \frac 1{f'(x)} f(x),\] hence \(f(x) = 0\).
The error is almost exactly what one would predict based on Newton’s method.
Theorem 11.1 Let \(f\) be twice continuously differentiable and \(r\) a simple root of \(f\). Then there is a neighborhood of \(r\) such that, for any two initial guesses in it, the secant method sequence \((x_n)_{n\in \mathbb N}\) converges to \(r\), and moreover there is a constant \(C\) such that the errors \(e_n = |r-x_n|\) satisfy \[e_{n+1} \leq C e_n e_{n-1}.\]
Proof. Note that consecutive terms of the sequence are distinct unless one of them is a root. So we can safely assume \(x_n\neq x_{n-1}\).
Now, using the signed errors \(r - x_n\), write \[x_{n+1} - r = \frac{f(x_{n-1})(r - x_n) - f(x_n)(r - x_{n-1})}{f(x_n) - f(x_{n-1})},\] then, using \(f(r) = 0\), factor out \(r-x_n\) and \(r-x_{n-1}\), and multiply and divide by the nonzero factor \(x_n - x_{n-1}\) to get \[e_{n+1} = \left|\frac{ \frac{f(r) - f(x_{n-1})}{r - x_{n-1}} - \frac{f(r) - f(x_n)}{r - x_n}}{x_n - x_{n-1}}\right|\left|\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}\right|e_ne_{n-1}.\]
We can bound each fraction separately. Suppose \(x_n,x_{n-1}\) are within \(\delta\) of \(r\).
The first fraction is actually a familiar quantity: it's the divided difference \(f[r,x_{n-1},x_n]\), so the higher-order mean value theorem lets us replace it with \(\frac 12 f''(y_n)\) for some \(y_n\) within \(\delta\) of \(r\). The regular MVT handles the second fraction, which must be \(\frac 1{f'(z_n)}\) for some \(z_n\) also within \(\delta\) of \(r\). In total (taking absolute values),
\[e_{n+1} = \frac 12 \left|\frac{f''(y_n)}{f'(z_n)}\right|e_ne_{n-1}.\]
Write \(C_n = \frac 12 \left|\frac{f''(y_n)}{f'(z_n)}\right|\) for the coefficient above. As \(\delta \to 0\), the \(C_n\) approach \(\frac 12 \left|\frac{f''(r)}{f'(r)}\right|\), so continuity lets us bound \(C_n\) by some constant \(C\) when \(x_n,x_{n-1}\) are close enough to \(r\). If we start from \(e_{n-1},e_n < \delta / C\) for some \(0 < \delta < 1\), then the above gives us \[e_{n+1} \leq C\,\frac \delta C\,\frac \delta C = \delta\,\frac{\delta}C < \frac \delta C,\] so \(e_{n+1}\) is even closer. Inductively, each subsequent error shrinks by at least a factor of \(\delta\), so \(e_{n+k} \leq \delta^k\,\frac \delta C \to 0\) and the sequence converges to \(r\).
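The constant in this bound can be probed numerically: along a secant run, the ratios \(e_{n+1}/(e_ne_{n-1})\) should settle near \(\frac 12 |f''(r)/f'(r)|\). Here is a throwaway check in Python; the test function and starting guesses are arbitrary choices.

```python
import math

f = lambda x: x * x - 2           # has the simple root r = sqrt(2)
r = math.sqrt(2)

xs = [1.0, 2.0]                   # initial guesses x_{-1}, x_0
for _ in range(7):
    x0, x1 = xs[-2], xs[-1]
    xs.append(x1 - (x1 - x0) / (f(x1) - f(x0)) * f(x1))

errs = [abs(r - x) for x in xs]
for e_prev, e_curr, e_next in zip(errs, errs[1:], errs[2:]):
    if min(e_prev, e_curr, e_next) > 1e-15:      # skip steps lost to roundoff
        # should settle near |f''(r)| / (2 |f'(r)|) = 1 / (2 sqrt 2), about 0.354
        print(e_next / (e_curr * e_prev))
```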
We know that the errors in the bisection and Newton's methods improve linearly and quadratically, respectively, at each step, which refers to the exponents in the bounds \[e_{n+1} \leq \frac 12 e_n\ \ \ \ \textrm{respectively}\ \ \ \ e_{n+1} \leq C e_n^2.\]
Our expectation is that the secant method lies between the two, so we would like to say \[e_{n+1} \leq C e_n^\alpha\] for some \(1 < \alpha < 2\).
Proposition 11.2 Suppose that \((\alpha_n)\) is a sequence of positive real numbers such that \[\alpha_{n+1} \leq \alpha_n\alpha_{n-1},\] starting from \(\alpha_{-1},\alpha_0 < M\). Then \[\alpha_n \leq M^{F_n}\] where \(F_n\) is a shifted Fibonacci number, defined by the recurrence \(F_{n+1} = F_n + F_{n-1}\) from \(F_{-1} = F_0 = 1\).
Proof. We induct on \(n\). The base cases \(\alpha_{-1}, \alpha_0 < M = M^{F_{-1}} = M^{F_0}\) are exactly the hypotheses. For the inductive step, since all terms are positive, \[\alpha_{n+1} \leq \alpha_n\alpha_{n-1} \leq M^{F_n}M^{F_{n-1}} = M^{F_n + F_{n-1}} = M^{F_{n+1}}.\]
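As a quick sanity check of the proposition, a throwaway script can compare the extreme case \(\alpha_{n+1} = \alpha_n\alpha_{n-1}\) against the bound \(M^{F_n}\); all of the numbers below are arbitrary choices.

```python
M = 0.5
alphas = [0.4, 0.3]        # alpha_{-1}, alpha_0, both below M
fibs = [1, 1]              # F_{-1} = F_0 = 1 in our shifted indexing

for _ in range(10):
    alphas.append(alphas[-1] * alphas[-2])   # equality case of the recurrence
    fibs.append(fibs[-1] + fibs[-2])

assert all(a <= M ** F for a, F in zip(alphas, fibs))
print("bound holds for all computed terms")
```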
Corollary 11.1 Suppose \(e_n,C,\delta\) are as in Theorem 11.1, so that the sequence converges, and suppose the initial errors satisfy \(e_{-1},e_0 < \delta/C\). Then \[e_n \leq \frac 1C\,\delta^{F_n}.\]
Proof. Let \(a_n = C e_n\), so that \(a_{-1},a_0 < \delta\). The error bound of Theorem 11.1 becomes \[a_{n+1} = Ce_{n+1} \leq C^2 e_ne_{n-1} = a_n a_{n-1},\] and so Proposition 11.2, applied with \(M = \delta\), tells us \(Ce_n \leq \delta^{F_n}\), i.e. \[e_n \leq \frac 1C\,\delta^{F_n}.\]
Recall the Fibonacci example from our review of linear algebra, which connected the Fibonacci numbers to the golden ratio \(\phi = \frac{1+\sqrt 5}2\) and its conjugate \(\psi = \frac{1-\sqrt 5}2\). With our shifted indexing, \[F_n = \frac{\phi^{n+2} - \psi^{n+2}}{\sqrt 5}.\]
We could use this to rewrite the error bound as \[e_n \leq \frac 1C\,\delta^{\phi^{n+2}/\sqrt 5}\,\delta^{-\psi^{n+2}/\sqrt 5}.\]
Since \(|\psi| < 1\), the powers \(\psi^{n+2}\) tend to \(0\) as \(n\to \infty\), so the second factor is bounded above by a constant (for instance by \(\delta^{-1}\)). The dominant behavior comes from the first factor, whose exponent of \(\delta\) is multiplied by \(\phi\) at each step. In the sense of the bound \(e_{n+1} \leq Ce_n^\alpha\), then, the secant method converges with order \(\phi \approx 1.62\).
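We can also estimate the order empirically: if \(e_{n+1} \approx Ce_n^\alpha\), then the ratios \(\log(e_{n+1}/e_n)/\log(e_n/e_{n-1})\) approximate \(\alpha\). A rough sketch, reusing the same illustrative test function as above:

```python
import math

f = lambda x: x * x - 2
r = math.sqrt(2)

xs = [1.0, 2.0]
for _ in range(7):
    x0, x1 = xs[-2], xs[-1]
    xs.append(x1 - (x1 - x0) / (f(x1) - f(x0)) * f(x1))

errs = [abs(r - x) for x in xs if abs(r - x) > 1e-14]
for e_prev, e_curr, e_next in zip(errs, errs[1:], errs[2:]):
    # successive estimates of alpha; they wobble at first and
    # approach phi ~ 1.618 as the errors shrink
    print(math.log(e_next / e_curr) / math.log(e_curr / e_prev))
```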
Exercises
Exercise 11.1 Suppose the sequences \((a_n)\) and \((b_n)\) from the modified secant method converge to the same point. Prove that this point is a root.
Exercise 11.2 Take sequences \((a_n)\) and \((b_n)\) from the modified secant method. Prove that if one and only one of the sequences is eventually constant, then the other sequence converges to a root.
Hint: the function \(SM(a,b)\) is continuous in both variables if \(f(a)\neq f(b)\). By construction, \(f(a_k) \neq f(b_k)\) because their signs differ.
Exercise 11.3 Take sequences \((a_n)\) and \((b_n)\) from the modified secant method. Suppose that \(\lim b_n\) is not a root. Prove that \(\lim a_n\) is a root.
Exercise 11.4 Show that it is not possible for the sequences \((a_n)\) and \((b_n)\) to converge to distinct points unless one is eventually constant. Equivalently, if neither sequence is eventually constant then both converge to the same point, necessarily a root.
Exercise 11.5 Adapt the convergence proof for the easy case of Newton’s method to show that the secant method also always converges under the following assumptions about the function \(f\) on the interval \([a,b]\).
- \(f\) is twice continuously differentiable
- \(f' > 0\)
- \(f''> 0\)
- \(f\) has a root \(x\) in the interval
- the two initial guesses \(x_{-1},x_0\) are both to the right of the root.
Hint: you will have to use convexity in a slightly more interesting way than in Newton's method; the graph of \(f\) does not lie above the secant line everywhere, but enough of it does.