Optimization

Optimization involves finding extreme values of functions or their roots. We have already seen the power of optimization in the curve-fitting arena, but it does not stop there. There are applications to virtually every single branch of engineering, and robust algorithms to perform these tasks are a must in every scientist's toolbox.

The curve_fit routine is actually syntactic sugar for the general algorithm that performs least-squares minimization, leastsq, with the imposing syntax:

leastsq(func, x0, args=(), Dfun=None, full_output=0,
        col_deriv=0, ftol=1.49012e-8, xtol=1.49012e-8,
        gtol=0.0, maxfev=0, epsfcn=0.0, factor=100, diag=None):

For instance, the curve_fit routine could have been called with a leastsq call instead:

leastsq(error_function,p0,args=(x,y))

Here, error_function is equal to lambda p,x,y: target_function(x,p[0],p[1],p[2])-y
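For a concrete (if simplified) picture of this equivalence, the following sketch fits a generic three-parameter model; the particular target_function and the synthetic data are placeholders for illustration only, not the ones used in the curve-fitting section:

import numpy
import scipy.optimize

# Placeholder three-parameter model; any target_function(x, a, b, c) works the same way.
def target_function(x, a, b, c):
    return a * numpy.exp(-b * x) + c

# Synthetic data, for illustration only.
x = numpy.linspace(0, 4, 50)
y = target_function(x, 2.5, 1.3, 0.5) + 0.05 * numpy.random.randn(x.size)

# The residual whose sum of squares leastsq minimizes.
error_function = lambda p, x, y: target_function(x, p[0], p[1], p[2]) - y

p0 = [1.0, 1.0, 1.0]   # initial guess for the three parameters
p_opt, ier = scipy.optimize.leastsq(error_function, p0, args=(x, y))
print(p_opt)           # should be close to [2.5, 1.3, 0.5]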

The implementation is given in the corresponding section of the IPython Notebook for this chapter. Most of the optimization routines in SciPy can be accessed either as native Python code or as wrappers for classical Fortran or C implementations of the corresponding algorithms. Technically, we are still using the same packages we did under Fortran or C, but from within Python. For instance, a truncated Newton-type minimization can be requested either with fmin_ncg (a pure Python implementation of the Newton conjugate gradient method, also known as truncated Newton) or with fmin_tnc (a wrapper around a C implementation of the truncated Newton algorithm).

Minimization

For general minimization problems, SciPy has many different algorithms. So far, we have covered the least-squares algorithm (leastsq), but we also have brute force (brute), simulated annealing (anneal), Brent or Golden methods for scalar functions (brent or golden), the downhill simplex algorithm (fmin), Powell's method (fmin_powell), nonlinear conjugate gradient or Newton's version of it (fmin_cg, fmin_ncg), and the BFGS algorithm (fmin_bfgs).

Constrained minimization is also possible computationally, and SciPy has routines that implement the L-BFGS-B algorithm (fmin_l_bfgs_b), the truncated Newton algorithm (fmin_tnc), COBYLA (fmin_cobyla), and sequential least-squares programming (fmin_slsqp).
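As a minimal illustration (the bounds below are arbitrary choices made only for this example), a box-constrained minimization of the Rosenbrock function with the L-BFGS-B wrapper could look like the following:

import scipy.optimize

# Minimize the Rosenbrock function subject to simple box constraints.
# rosen_der supplies the analytic gradient; one (low, high) pair per variable.
x_opt, f_opt, info = scipy.optimize.fmin_l_bfgs_b(
    scipy.optimize.rosen, [0.0, 0.0],
    fprime=scipy.optimize.rosen_der,
    bounds=[(-2.0, 2.0), (-2.0, 2.0)])
print(x_opt)   # close to [1., 1.], which lies inside the box
print(f_opt)   # value of the function at the reported minimizer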

The following code, for example, finds a local minimum of the Rosenbrock function, scipy.optimize.rosen, near the origin using the downhill simplex algorithm:

>>> import scipy.optimize
>>> scipy.optimize.fmin(scipy.optimize.rosen,[0,0])

The output is as follows:

Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 79
         Function evaluations: 146
array([ 1.00000439,  1.00001064])

Since Version 0.11 of SciPy, all minimization routines can be called from the generic scipy.optimize.minimize, with the method parameter pointing to one of the strings, such as Nelder-Mead (for the downhill simplex), Powell, CG, Newton-CG, BFGS, or anneal. For constrained minimization, the corresponding strings are L-BFGS-B, TNC (for the truncated Newton method), COBYLA, or SLSQP:

minimize(fun, x0, args=(), method='BFGS', jac=None, hess=None, hessp=None,
         bounds=None, constraints=(), tol=None, callback=None, options=None)
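For instance, the previous downhill simplex computation can be reproduced through this unified interface; the returned object bundles the minimizer together with diagnostic information (the exact digits may differ slightly between SciPy versions):

>>> import scipy.optimize
>>> result = scipy.optimize.minimize(scipy.optimize.rosen, [0,0], method='Nelder-Mead')
>>> print (result.x)
>>> print (result.success)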

Roots

For most special functions included in the scipy.special module, we have accurate algorithms that allow us to compute their zeros. For instance, for Bessel functions of the first kind with integer order, the routine jn_zeros offers as many roots as desired (in ascending order). We may obtain the first three roots of the Bessel J-function of order four by issuing the following command:

>>> import scipy.special
>>> print (scipy.special.jn_zeros(4,3))

The output is as follows:

[  7.58834243  11.06470949  14.37253667]
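As a quick sanity check, evaluating the Bessel function of order four at these points with scipy.special.jv gives back values that are zero up to rounding error:

>>> print (scipy.special.jv(4, scipy.special.jn_zeros(4,3)))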

For non-special scalar functions, the scipy.optimize module allows approximation of the roots through a great number of different algorithms. For scalar functions, we have the crude bisection method (bisect), the classical Newton-Raphson method (newton), which falls back to the secant method when no derivative is supplied, as well as more accurate and faster methods such as Ridders' algorithm (ridder) and two versions of the Brent method (brentq and brenth).
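As a minimal illustration (the function below, whose only real root is the cube root of 2, is chosen purely for this example), the bracketing methods take an interval on which the function changes sign, while newton takes a single starting guess:

>>> import scipy.optimize
>>> f = lambda x: x**3 - 2
>>> print (scipy.optimize.bisect(f, 1, 2))
>>> print (scipy.optimize.brentq(f, 1, 2))
>>> print (scipy.optimize.newton(f, 1.0))

All three calls approximate the same value, roughly 1.2599.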

Finding roots for functions of several variables is very challenging in many ways; the larger the dimension, the more difficult it is. The effectiveness of any of these algorithms depends on the problem, and it is a good idea to invest some time and resources in knowing them all. Since Version 0.11 of SciPy, it is possible to call any of the implemented methods through the same routine, scipy.optimize.root, which has the following syntax:

root(fun, x0, args=(), method='hybr', jac=None, tol=None, callback=None, options=None)

The different methods are obtained by changing the value of the method parameter to the corresponding method string. We may choose from methods such as 'hybr' for a modified hybrid Powell's method; 'lm' for a modified Levenberg-Marquardt least-squares method; 'broyden1' or 'broyden2' for Broyden's good and bad methods, respectively; 'diagbroyden' for the diagonal Broyden Jacobian approximation; 'anderson' for Anderson's extended mixing; 'krylov' for a Krylov approximation of the Jacobian; 'linearmixing' for a scalar Jacobian approximation; and 'excitingmixing' for a tuned diagonal Jacobian approximation.

For large-scale problems, both the Krylov approximation of the Jacobian or the Anderson extended mixing are usually the best options.

Let's present an illustrative example of the power of these techniques. Consider the following system of differential equations:

x' = x^2 - 2x - y + 0.5
y' = x^2 + 4y^2 - 4

We use the plot routine quiver from the matplotlib.pyplot library to visualize a slope field for values of x and y between -0.5 and 2.5, and hence identify the location of the possible critical points in that region:

>>> import numpy 
>>> import matplotlib.pyplot as plt 
>>> f=lambda x: [x[0]**2 - 2*x[0] - x[1] + 0.5, x[0]**2 + 4*x[1]**2 - 4]
>>> x,y=numpy.mgrid[-0.5:2.5:24j,-0.5:2.5:24j]
>>> U,V=f([x,y])
>>> plt.quiver(x,y,U,V,color='r',
...            linewidths=(0.2,), edgecolors=('k'),
...            headaxislength=5)
>>> plt.show()

The output is as follows:

[Quiver plot of the slope field for x and y between -0.5 and 2.5]

Note how there is a whole region of the plane in which the slopes are extremely small. Because of the degrees of the polynomials involved, there are at most four different critical points. In this area, we should be able to identify two such points (as a matter of fact, only two of the solutions are real). One of them seems to be near (0, 1), and the second one is near (2, 0). We use these two locations as initial guesses for our searches:

>>> import scipy.optimize
>>> f=lambda x: [x[0]**2 - 2*x[0] - x[1] + 0.5, x[0]**2 + 4*x[1]**2 - 4]
>>> scipy.optimize.root(f,[0,1])

The output is as follows:

  status: 1
 success: True
     qtf: array([ -4.81190247e-09,  -3.83395899e-09])
    nfev: 9
       r: array([ 2.38128242, -0.60840482, -8.35489601])
     fun: array([  3.59529073e-12,   3.85025345e-12])
       x: array([-0.22221456,  0.99380842])
 message: 'The solution converged.'
    fjac: array([[-0.98918813, -0.14665209],
                 [ 0.14665209, -0.98918813]])

Let's look at the second case:

>>> scipy.optimize.root(f,[2,0])

The output is as follows:

  status: 1
 success: True
     qtf: array([  2.08960516e-10,   8.61298294e-11])
    nfev: 12
       r: array([-4.56575336, -1.67067665, -1.81464307])
     fun: array([  2.44249065e-15,   1.42996726e-13])
       x: array([ 1.90067673,  0.31121857])
 message: 'The solution converged.'
    fjac: array([[-0.39612596, -0.91819618],
                 [ 0.91819618, -0.39612596]])

In the first case, we converged successfully to (-0.22221456, 0.99380842). In the second case, we converged to (1.90067673, 0.31121857). The routine gives us the details of the convergence and the properties of the approximation. For instance, nfev tells us the number of function evaluations performed, and fun gives the value of the function at the computed root. The other items in the output reflect the matrices used in the procedure, such as qtf, r, and fjac.
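Any of the other solvers can be applied to the same system just by changing the method string; for example, the second root can be double-checked with the 'lm' solver (the exact digits may differ slightly between SciPy versions):

>>> print (scipy.optimize.root(f, [2,0], method='lm').x)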
