A parsimonious way of constructing quadratic models from values of the objective function in derivative-free optimization


Prof. M.J.D. Powell


University of Cambridge


Mon, 20/02/2012 - 11:00am to 12:00pm




Let the minimum of $F ( \underline{x} )$, $\underline{x} \!\in\! \mathcal{R}^n$, be required when first derivatives of $F$ are not available. We consider iterative methods that employ $( n \!+\! 1 )$ values of $F$ on each iteration, at points in $\mathcal{R}^n$ that allow the values to be interpolated uniquely by a linear polynomial. Let $k$ be the iteration number, let $F ( \underline{x}_k )$ be the least of the $(n \!+\! 1 )$ function values, let $Q_k ( \underline{x} )$, $\underline{x} \!\in\! \mathcal{R}^n$, be the linear polynomial, and let $\rho_k$ be a trust region radius. The basic trust region step from $\underline{x}_k$, say $\underline{d}_k$, is the vector $\underline{d}$ that minimizes $Q_k ( \underline{x}_k \!+\! \underline{d} )$ subject to $\| \underline{d} \| \!\leq\! \rho_k$. A trust region iteration calculates the new value $F ( \underline{x}_k \!+\! \underline{d}_k )$ of the objective function, the point $\underline{x}_k \!+\! \underline{d}_k$ replaces one of the current interpolation points, $\underline{x}_{k+1}$ is set to $\underline{x}_k$ or $\underline{x}_k \!+\! \underline{d}_k$ in the case $F ( \underline{x}_k \!+\! \underline{d}_k ) \!\geq\! F ( \underline{x}_k )$ or $F ( \underline{x}_k \!+\! \underline{d}_k ) \!

It may be more efficient, however, to let $Q_k$ be a quadratic polynomial instead of a linear one, but we wish to preserve the advantages in practice and in theory of retaining only $n \!+\! 1$ interpolation points. Our theory remains valid if the second derivative matrix $\nabla^2 Q_k$ is symmetric, bounded, and otherwise arbitrary for every $k$. Then, after choosing $\nabla^2 Q_k$, the current interpolation conditions supply the value $Q_k ( \underline{x}_k )$ and the gradient $\underline{\nabla} Q_k ( \underline{x}_k )$. We consider the following automatic way of obtaining some second derivative information. When an interpolation point is replaced, $n \!+\! 2$ values of $F$ are available. We require the new quadratic to interpolate these values, and the remaining freedom in the new quadratic model is taken up by minimizing the Frobenius norm of the change to $\nabla^2 Q_k$. Also we set $\nabla^2 Q_1$ to the zero matrix. Experiments show that this technique for constructing quadratic models provides major improvements over the use of linear models, both in the total number of function evaluations and in the final accuracy of the optimization calculation.

School Seminar Series: