Although the mathematical techniques described earlier, particularly in Chapters 3-13, can be used to solve different types of engineering optimization problems, the use of engineering judgment and approximations helps reduce the computational effort involved. In this chapter we consider several types of approximation techniques that can shorten the analysis time without introducing too much error [1]. The approximation methods include reduction of the size of an optimization problem, fast reanalysis techniques, and the use of derivatives of static displacements and stresses, eigenvalues and eigenvectors, and the transient response of mechanical and structural systems, both in gradient evaluations and in predicting the response in the neighborhood of a base design. These techniques are especially useful in finite element analysis-based optimization procedures, where they speed up both practical computation and the optimization process itself.
In the optimum design of certain practical systems involving a large number (n) of design variables, some feasible design vectors X 1, X 2, …, X r may be available to start with. These design vectors may have been suggested by experienced designers or may be available from the design of similar systems in the past. We can reduce the size of the optimization problem by expressing the design vector X as a linear combination of the available feasible design vectors as

$$\mathbf{X} = c_1 \mathbf{X}_1 + c_2 \mathbf{X}_2 + \cdots + c_r \mathbf{X}_r \tag{15.1}$$
where c 1, c 2, …, c r are the unknown constants. Then the optimization problem can be solved using c 1, c 2, …, c r as design variables. This problem will have a much smaller number of unknowns since r ≪ n. In Eq. (15.1), the feasible design vectors X 1, X 2, …, X r serve as the basis vectors. It can be seen that if c 1 = c 2 = ⋯ = c r = 1/r, then X denotes the average of the basis vectors.
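The change of variables in Eq. (15.1) can be sketched in a few lines of Python. The basis designs below are made-up numbers used only for illustration, and the helper name `design_from_basis` is an assumption, not something defined in this chapter.

```python
import numpy as np

def design_from_basis(basis, c):
    """Return X = c_1 X_1 + ... + c_r X_r for basis vectors stored as rows."""
    basis = np.asarray(basis, dtype=float)   # shape (r, n)
    c = np.asarray(c, dtype=float)           # shape (r,)
    return c @ basis

# Hypothetical data: r = 3 feasible designs of a system with n = 6 variables.
basis = np.array([
    [2.0, 2.0, 1.0, 1.0, 3.0, 3.0],
    [1.0, 3.0, 2.0, 1.0, 2.0, 4.0],
    [3.0, 1.0, 3.0, 1.0, 1.0, 2.0],
])

# With c_i = 1/r the design vector is the average of the basis vectors.
c_avg = np.full(3, 1.0 / 3.0)
X_avg = design_from_basis(basis, c_avg)
```

An optimizer would then search over the r coefficients in `c` rather than over the n components of X.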
When the number of elements or members in a structure is large, it is possible to reduce the number of design variables by using a technique known as design variable linking [15.17]. To illustrate this procedure, consider the 12-member truss structure shown in Figure 15.1. If the area of cross section of each member is varied independently, we will have 12 design variables. On the other hand, if symmetry of members about the vertical (Y) axis is required, the areas of cross section of members 4, 5, 6, 8, and 10 can be assumed to be the same as those of members 1, 2, 3, 7, and 9, respectively. This reduces the number of independent design variables from 12 to 7. In addition, if the cross-sectional area of member 12 is required to be three times that of member 11, we will have only six independent design variables:

$$\mathbf{X} = \{x_1, x_2, x_3, x_4, x_5, x_6\}^T \equiv \{A_1, A_2, A_3, A_7, A_9, A_{11}\}^T \tag{15.2}$$
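Once the vector X is known, the dependent variables can be determined as A 4 = A 1, A 5 = A 2, A 6 = A 3, A 8 = A 7, A 10 = A 9, and A 12 = 3A 11. This procedure of treating certain variables as dependent variables is known as design variable linking. By defining the vector of all variables as

$$\mathbf{Z} = \{z_1, z_2, \ldots, z_{12}\}^T \equiv \{A_1, A_2, \ldots, A_{12}\}^T \tag{15.3}$$

the relationship between Z and X can be expressed as

$$\mathbf{Z} = [T]\,\mathbf{X} \tag{15.4}$$

where the 12 × 6 matrix [T], with the independent variables ordered as A 1, A 2, A 3, A 7, A 9, A 11, is given by

$$[T] = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 3
\end{bmatrix} \tag{15.5}$$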
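As a minimal sketch, the linking relation Z = [T]X for the 12-member truss can be assembled from two lookup lists. The numerical areas are hypothetical, and the ordering of the independent variables (A 1, A 2, A 3, A 7, A 9, A 11) is the one implied by the linking rules stated above.

```python
import numpy as np

# For members 1..12: which independent variable drives the member (0-based
# column index into X = {A1, A2, A3, A7, A9, A11}) and with what multiplier.
parent = [0, 1, 2, 0, 1, 2, 3, 3, 4, 4, 5, 5]
scale = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3]

T = np.zeros((12, 6))
for member, (col, s) in enumerate(zip(parent, scale)):
    T[member, col] = s

X = np.array([2.0, 1.5, 1.0, 0.8, 0.6, 0.4])  # hypothetical independent areas
Z = T @ X                                      # all 12 member areas
```

Storing the linking as index and multiplier lists, rather than typing [T] by hand, makes it easy to change the linking rules without rebuilding the matrix manually.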
The concept can be extended to many other situations. For example, if the geometry of the structure is to be varied during optimization (configuration optimization) while maintaining (i) symmetry about the Y axis and (ii) alignment of the three nodes 2, 3, and 4 (and 6, 7, and 4), we can define the following independent and dependent design variables:
Independent variables: X 5, X 6, Y 6, Y 7, Y 4
Dependent variables:
Thus the design vector X is

$$\mathbf{X} = \{X_5, X_6, Y_6, Y_7, Y_4\}^T$$
The relationship between the dependent and independent variables can be defined more systematically, by defining a vector of all geometry variables, Z, as
which is related to X through the relations

$$z_i = f_i(\mathbf{X}) \tag{15.6}$$
where f i denotes a function of X.
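Let the displacement vector of the structure or machine, Y 0, corresponding to the load vector, P 0, be given by the solution of the equilibrium equations

$$[K_0]\,\mathbf{Y}_0 = \mathbf{P}_0 \tag{15.7}$$

or

$$\mathbf{Y}_0 = [K_0]^{-1}\,\mathbf{P}_0 \tag{15.8}$$

where [K 0] is the stiffness matrix corresponding to the design vector, X 0. When the design vector is changed to X 0 + ΔX, let the stiffness matrix of the system change to [K 0] + [ΔK], the displacement vector to Y 0 + ΔY, and the load vector to P 0 + ΔP. The equilibrium equations at the new design vector, X 0 + ΔX, can be expressed as

$$([K_0] + [\Delta K])(\mathbf{Y}_0 + \Delta\mathbf{Y}) = \mathbf{P}_0 + \Delta\mathbf{P} \tag{15.9}$$

or

$$[K_0]\mathbf{Y}_0 + [\Delta K]\mathbf{Y}_0 + [K_0]\Delta\mathbf{Y} + [\Delta K]\Delta\mathbf{Y} = \mathbf{P}_0 + \Delta\mathbf{P} \tag{15.10}$$

Subtracting Eq. (15.7) from Eq. (15.10), we obtain

$$[K_0]\Delta\mathbf{Y} + [\Delta K]\Delta\mathbf{Y} = \Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0 \tag{15.11}$$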
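By neglecting the term [ΔK]ΔY, Eq. (15.11) can be reduced to

$$[K_0]\,\Delta\mathbf{Y}_1 = \Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0 \tag{15.12}$$

which yields the first approximation to the increment in displacement vector ΔY as

$$\Delta\mathbf{Y}_1 = [K_0]^{-1}\left(\Delta\mathbf{P} - [\Delta K]\mathbf{Y}_0\right) \tag{15.13}$$

where [K 0]⁻¹ is available from the solution in Eq. (15.8). We can find a better approximation of ΔY by subtracting Eq. (15.12) from Eq. (15.11):

$$[K_0]\Delta\mathbf{Y} + [\Delta K]\Delta\mathbf{Y} - [K_0]\Delta\mathbf{Y}_1 = \mathbf{0} \tag{15.14}$$

or

$$[K_0]\left(\Delta\mathbf{Y} - \Delta\mathbf{Y}_1\right) = -[\Delta K]\Delta\mathbf{Y} \tag{15.15}$$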
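By defining

$$\Delta\mathbf{Y}_2 = \Delta\mathbf{Y} - \Delta\mathbf{Y}_1 \tag{15.16}$$

Eq. (15.15) can be expressed as

$$[K_0]\Delta\mathbf{Y}_2 = -[\Delta K]\Delta\mathbf{Y}_1 - [\Delta K]\Delta\mathbf{Y}_2 \tag{15.17}$$

Neglecting the term [ΔK] ΔY 2, Eq. (15.17) can be used to obtain the second approximation to ΔY, ΔY 2, as

$$\Delta\mathbf{Y}_2 = -[K_0]^{-1}\left([\Delta K]\Delta\mathbf{Y}_1\right) \tag{15.18}$$

From Eq. (15.16), ΔY can be written as

$$\Delta\mathbf{Y} = \Delta\mathbf{Y}_1 + \Delta\mathbf{Y}_2 \tag{15.19}$$

This process can be continued and ΔY can be expressed, in general, as

$$\Delta\mathbf{Y} = \sum_{i=1}^{\infty} \Delta\mathbf{Y}_i \tag{15.20}$$

where ΔY i is found by solving the equations

$$[K_0]\Delta\mathbf{Y}_i = -[\Delta K]\Delta\mathbf{Y}_{i-1}, \quad i = 2, 3, \ldots \tag{15.21}$$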
Note that the series given by Eq. (15.20) may not converge if the change in the design vector, ΔX, is not small. Hence, before using the procedure, it is important to establish its validity for each problem by determining the step sizes for which the series converges. The iterative process is usually stopped by specifying a maximum number of iterations, by prescribing a convergence criterion, or both. A typical convergence criterion is

$$\frac{\lVert \Delta\mathbf{Y}_i \rVert}{\left\lVert \sum_{j=1}^{i} \Delta\mathbf{Y}_j \right\rVert} \le \varepsilon \tag{15.22}$$
where ||ΔY i || is the Euclidean norm of the vector ΔY i and ε is a small number on the order of 0.01.
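The reanalysis series can be sketched as follows. This is an illustrative implementation under stated assumptions: the 2 × 2 matrices are made up, the function name `reanalyze` is hypothetical, and the stopping test compares the norm of the latest term with the norm of the accumulated sum. An explicit inverse is used only to mirror the notation; a production code would reuse a factorization of [K 0] instead.

```python
import numpy as np

def reanalyze(K0, dK, Y0, dP, eps=0.01, max_iter=50):
    """Approximate dY by summing dY_1 + dY_2 + ... without refactoring K0 + dK."""
    K0_inv = np.linalg.inv(K0)                 # available from the base analysis
    term = K0_inv @ (dP - dK @ Y0)             # first approximation dY_1
    total = term.copy()
    for _ in range(max_iter):
        term = -K0_inv @ (dK @ term)           # dY_i = -K0^{-1} dK dY_{i-1}
        total += term
        if np.linalg.norm(term) <= eps * np.linalg.norm(total):
            break
    return total

# Hypothetical base design and perturbation.
K0 = np.array([[4.0, -1.0], [-1.0, 3.0]])
P0 = np.array([1.0, 2.0])
Y0 = np.linalg.solve(K0, P0)
dK = np.array([[0.4, 0.0], [0.0, 0.1]])
dP = np.zeros(2)

dY = reanalyze(K0, dK, Y0, dP, eps=1e-10)
Y_exact = np.linalg.solve(K0 + dK, P0 + dP)    # reference solution for comparison
```

For this small perturbation the series converges quickly; with a large [ΔK] the loop may hit `max_iter` without converging, which is the situation the text warns about.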
In structural optimization involving a static response, it is possible to conduct an approximate analysis at modified designs based on a limited number of exact analysis results. This results in a substantial saving in computer time, since, in most problems, the number of design variables is far smaller than the number of degrees of freedom of the system. Consider the equilibrium equations of the structure in the form

$$[K]\,\mathbf{Y} = \mathbf{P} \tag{15.23}$$

where [K] is the stiffness matrix, Y the vector of displacements, and P the load vector. Let the structure have n design variables denoted by the design vector

$$\mathbf{X} = \{x_1, x_2, \ldots, x_n\}^T$$

If we find the exact solution at r basic design vectors X 1, X 2, …, X r , the corresponding solutions, Y i , are found by solving the equations

$$[K_i]\,\mathbf{Y}_i = \mathbf{P}, \quad i = 1, 2, \ldots, r \tag{15.24}$$

where the stiffness matrix, [K i ], is determined at the design vector X i . If we consider a new design vector, X N , in the neighborhood of the basic design vectors, the equilibrium equations at X N can be expressed as

$$[K_N]\,\mathbf{Y}_N = \mathbf{P} \tag{15.25}$$

where [K N ] is the stiffness matrix evaluated at X N . By approximating Y N as a linear combination of the basic displacement vectors Y i , i = 1, 2, …, r, we have

$$\mathbf{Y}_N \approx c_1\mathbf{Y}_1 + c_2\mathbf{Y}_2 + \cdots + c_r\mathbf{Y}_r = [Y]\,\mathbf{c} \tag{15.26}$$

where [Y] = [Y 1, Y 2, ⋯, Y r ] is an n × r matrix (n denoting here the number of displacement degrees of freedom) and c = {c 1, c 2, ⋯, c r }T is an r-component column vector. Substitution of Eq. (15.26) into Eq. (15.25) gives

$$[K_N][Y]\,\mathbf{c} = \mathbf{P} \tag{15.27}$$

By premultiplying Eq. (15.27) by [Y]T we obtain

$$[\tilde{K}]\,\mathbf{c} = \tilde{\mathbf{P}} \tag{15.28}$$

where

$$[\tilde{K}] = [Y]^T [K_N] [Y] \tag{15.29}$$

$$\tilde{\mathbf{P}} = [Y]^T \mathbf{P} \tag{15.30}$$
It can be seen that an approximate displacement vector Y N can be obtained by solving a smaller (r) system of equations, Eq. (15.28), instead of computing the exact solution Y N by solving a larger (n) system of equations, Eq. (15.25). The foregoing method is equivalent to applying the Ritz–Galerkin principle in the subspace spanned by the set of vectors Y 1, Y 2, …, Y r . The assumed modes Y i , i = 1, 2, …, r, can be considered to be good basis vectors since they are the solutions of similar sets of equations.
Fox and Miura [3] applied this method for the analysis of a 124‐member, 96‐degree‐of‐freedom space truss (shown in Figure 15.3). By using a 5‐degree‐of‐freedom approximation, they observed that the solution of Eq. (15.28) required 0.653 second while the solution of Eq. (15.25) required 5.454 seconds without exceeding 1% error in the maximum displacement components of the structure.
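The reduced-basis analysis of Eqs. (15.25) to (15.28) can be sketched as below. The three-degree-of-freedom spring chain, its stiffness values, and the function names are assumptions chosen to keep the example small; they are unrelated to the space truss of Figure 15.3.

```python
import numpy as np

def reduced_basis_solve(K_new, P, Y_basis):
    """Approximate the solution of K_new Y = P in the span of Y_basis columns."""
    K_red = Y_basis.T @ K_new @ Y_basis     # r x r reduced stiffness matrix
    P_red = Y_basis.T @ P                   # r-component reduced load vector
    c = np.linalg.solve(K_red, P_red)       # small system for the coefficients
    return Y_basis @ c                      # approximate displacement vector

def stiffness(x):
    """Hypothetical 3-DOF spring chain fixed at one end; x = (k1, k2, k3)."""
    k1, k2, k3 = x
    return np.array([[k1 + k2, -k2, 0.0],
                     [-k2, k2 + k3, -k3],
                     [0.0, -k3, k3]])

P = np.array([0.0, 0.0, 1.0])
designs = [np.array([10.0, 10.0, 10.0]), np.array([12.0, 9.0, 10.0])]

# Exact analyses at the r = 2 basic designs give the basis vectors.
Y_basis = np.column_stack([np.linalg.solve(stiffness(x), P) for x in designs])

x_new = np.array([11.0, 9.5, 10.0])
Y_approx = reduced_basis_solve(stiffness(x_new), P, Y_basis)
Y_exact = np.linalg.solve(stiffness(x_new), P)   # for comparison only
```

When the new design coincides with one of the basic designs, the reduced solution reproduces the exact displacement vector, because that vector lies in the subspace.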
Gradient-based optimization methods require the gradients of the objective and constraint functions. Thus the partial derivatives of the response quantities with respect to the design variables are required. Many practical applications require a finite-element analysis for computing the values of the objective function and/or constraint functions at any design vector. Since the objective and/or constraint functions are to be evaluated at a large number of trial design vectors during optimization, the computation of the derivatives of the response quantities requires substantial computational effort. It is possible to derive approximate expressions for the response quantities. The derivatives of static displacements, stresses, eigenvalues, eigenvectors, and transient response of structural and mechanical systems are presented in this and the following two sections. The equilibrium equations of a machine or structure can be expressed as

$$[K]\,\mathbf{Y} = \mathbf{P} \tag{15.31}$$

where [K] is the stiffness matrix, Y the displacement vector, and P the load vector. By differentiating Eq. (15.31) with respect to the design variable x i , we obtain

$$\frac{\partial [K]}{\partial x_i}\mathbf{Y} + [K]\frac{\partial \mathbf{Y}}{\partial x_i} = \frac{\partial \mathbf{P}}{\partial x_i} \tag{15.32}$$

where ∂[K]/∂x i denotes the matrix formed by differentiating the elements of [K] with respect to x i . Usually, the matrix is computed using a finite-difference scheme as

$$\frac{\partial [K]}{\partial x_i} \approx \frac{\Delta [K]}{\Delta x_i} = \frac{[K]_{\text{new}} - [K]}{\Delta x_i} \tag{15.33}$$

where [K]new is the stiffness matrix evaluated at the perturbed design vector X + ΔX i , where the vector ΔX i contains Δx i in the ith location and zero everywhere else:

$$\Delta\mathbf{X}_i = \{0, 0, \ldots, 0, \Delta x_i, 0, \ldots, 0\}^T \tag{15.34}$$

In most cases the load vector P is either independent of the design variables or a known function of the design variables, and hence the derivatives, ∂ P /∂x i , can be evaluated with no difficulty. Equation (15.32) can be solved to find the derivatives of the displacements as

$$\frac{\partial \mathbf{Y}}{\partial x_i} = [K]^{-1}\left(\frac{\partial \mathbf{P}}{\partial x_i} - \frac{\partial [K]}{\partial x_i}\mathbf{Y}\right) \tag{15.35}$$

Since [K]⁻¹ or its equivalent is available from the solution of Eq. (15.31), Eq. (15.35) can readily be solved to find the derivatives of static displacements with respect to the design variables.
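The stresses in a machine or structure (in a particular finite element) can be determined using the relation

$$\boldsymbol{\sigma} = [R]\,\mathbf{Y} \tag{15.36}$$

where [R] denotes the matrix that relates stresses to nodal displacements. The derivatives of stresses can then be computed as

$$\frac{\partial \boldsymbol{\sigma}}{\partial x_i} = [R]\frac{\partial \mathbf{Y}}{\partial x_i} \tag{15.37}$$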
where the matrix [R] is usually independent of the design variables and the vector ∂ Y /∂x i is given by Eq. (15.35).
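A minimal sketch of the displacement and stress sensitivities follows, assuming a load vector that does not depend on the design variables. The two-spring model, the matrix `R`, and the function names are hypothetical and serve only to make the steps concrete.

```python
import numpy as np

def stiffness(x):
    """Hypothetical 2-DOF spring chain fixed at one end; x = (k1, k2)."""
    k1, k2 = x
    return np.array([[k1 + k2, -k2], [-k2, k2]])

def displacement_sensitivity(x, P, i, dx=1e-6):
    """Return Y and dY/dx_i, with d[K]/dx_i from a forward finite difference."""
    x = np.asarray(x, dtype=float)
    K = stiffness(x)
    Y = np.linalg.solve(K, P)
    x_new = x.copy()
    x_new[i] += dx                               # perturb only the ith variable
    dK_dxi = (stiffness(x_new) - K) / dx         # finite-difference d[K]/dx_i
    dP_dxi = np.zeros_like(P)                    # assumption: P independent of X
    dY_dxi = np.linalg.solve(K, dP_dxi - dK_dxi @ Y)
    return Y, dY_dxi

x = np.array([10.0, 20.0])
P = np.array([0.0, 1.0])
R = np.array([[5.0, 0.0], [-5.0, 5.0]])          # hypothetical stress matrix

Y, dY_dk1 = displacement_sensitivity(x, P, i=0)
dsigma_dk1 = R @ dY_dk1                          # stress sensitivity
```

For this chain the tip load gives Y = {1/k 1, 1/k 1 + 1/k 2}, so the computed derivative with respect to k 1 can be checked by hand against −1/k 1² in both components.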
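Let the eigenvalue problem be given by [4,6,10]

$$[K]\,\mathbf{Y} = \lambda [M]\,\mathbf{Y} \tag{15.38}$$

where λ is the eigenvalue, Y the eigenvector, [K] the stiffness matrix, and [M] the mass matrix corresponding to the design vector X = {x 1, x 2, ⋯, x n }T. Let the solution of Eq. (15.38) be given by the eigenvalues λ i and the eigenvectors Y i , i = 1, 2, …, m:

$$[P_i]\,\mathbf{Y}_i = \mathbf{0} \tag{15.39}$$

where [P i ] is a symmetric matrix given by

$$[P_i] = [K] - \lambda_i [M] \tag{15.40}$$

Premultiplication of Eq. (15.39) by Y i T gives

$$\mathbf{Y}_i^T [P_i]\,\mathbf{Y}_i = 0 \tag{15.41}$$

Differentiation of Eq. (15.41) with respect to the design variable x j gives

$$\mathbf{Y}_{i,j}^T [P_i]\,\mathbf{Y}_i + \mathbf{Y}_i^T \frac{\partial [P_i]}{\partial x_j}\mathbf{Y}_i + \mathbf{Y}_i^T [P_i]\,\mathbf{Y}_{i,j} = 0 \tag{15.42}$$

where Y i,j = ∂ Y i /∂x j . In view of Eq. (15.39) and the symmetry of [P i ], Eq. (15.42) reduces to

$$\mathbf{Y}_i^T \frac{\partial [P_i]}{\partial x_j}\mathbf{Y}_i = 0 \tag{15.43}$$

Differentiation of Eq. (15.40) gives

$$\frac{\partial [P_i]}{\partial x_j} = \frac{\partial [K]}{\partial x_j} - \lambda_i \frac{\partial [M]}{\partial x_j} - \frac{\partial \lambda_i}{\partial x_j}[M] \tag{15.44}$$

where ∂[K]/∂x j and ∂[M]/∂x j denote the matrices formed by differentiating the elements of [K] and [M] matrices, respectively, with respect to x j . If the eigenvectors are normalized with respect to the mass matrix, we have [10]

$$\mathbf{Y}_i^T [M]\,\mathbf{Y}_i = 1 \tag{15.45}$$

Substituting Eq. (15.44) into Eq. (15.43) and using Eq. (15.45) gives the derivative of λ i with respect to x j as

$$\frac{\partial \lambda_i}{\partial x_j} = \mathbf{Y}_i^T \left[\frac{\partial [K]}{\partial x_j} - \lambda_i \frac{\partial [M]}{\partial x_j}\right]\mathbf{Y}_i \tag{15.46}$$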
It can be noted that Eq. (15.46) involves only the eigenvalue and eigenvector under consideration and hence the complete solution of the eigenvalue problem is not required to find the value of ∂λ i /∂x j .
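The eigenvalue sensitivity of Eq. (15.46) can be sketched as follows. The two-degree-of-freedom model in `matrices` is a made-up example in which one stiffness and one mass depend on a single design variable x; the helper `generalized_eigh` is an assumed utility that returns mass-normalized eigenvectors through a Cholesky transformation.

```python
import numpy as np

def generalized_eigh(K, M):
    """Solve K Y = lam M Y (K symmetric, M positive definite), Y^T M Y = I."""
    L = np.linalg.cholesky(M)                  # M = L L^T
    L_inv = np.linalg.inv(L)
    lam, V = np.linalg.eigh(L_inv @ K @ L_inv.T)
    return lam, L_inv.T @ V                    # mass-normalized eigenvectors

def eigenvalue_derivative(lam_i, Y_i, dK, dM):
    """d(lam_i)/dx_j = Y_i^T (dK/dx_j - lam_i dM/dx_j) Y_i."""
    return Y_i @ (dK - lam_i * dM) @ Y_i

def matrices(x):
    """Hypothetical model: first spring and first mass depend on x."""
    K = np.array([[x + 2.0, -2.0], [-2.0, 2.0]])
    M = np.diag([1.0 + 0.1 * x, 2.0])
    return K, M

x0 = 3.0
K, M = matrices(x0)
dK = np.array([[1.0, 0.0], [0.0, 0.0]])        # exact dK/dx for this model
dM = np.diag([0.1, 0.0])                       # exact dM/dx for this model

lam, Y = generalized_eigh(K, M)
dlam = np.array([eigenvalue_derivative(lam[i], Y[:, i], dK, dM)
                 for i in range(2)])
```

Each derivative uses only its own eigenpair, so a single mode can be differentiated without solving for the others.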
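The differentiation of Eqs. (15.39) and (15.45) with respect to x j results in

$$[P_i]\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\frac{\partial [P_i]}{\partial x_j}\mathbf{Y}_i \tag{15.47}$$

$$2\,\mathbf{Y}_i^T [M]\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\mathbf{Y}_i^T \frac{\partial [M]}{\partial x_j}\mathbf{Y}_i \tag{15.48}$$

where ∂[P i ]/∂x j is given by Eq. (15.44). Equations (15.47) and (15.48) can be shown to be linearly independent and can be written together as

$$\begin{bmatrix} [P_i] \\ 2\,\mathbf{Y}_i^T [M] \end{bmatrix} \frac{\partial \mathbf{Y}_i}{\partial x_j} = -\begin{bmatrix} \dfrac{\partial [P_i]}{\partial x_j} \\ \mathbf{Y}_i^T \dfrac{\partial [M]}{\partial x_j} \end{bmatrix} \mathbf{Y}_i \tag{15.49}$$

By premultiplying Eq. (15.49) by

$$\begin{bmatrix} [P_i] \\ 2\,\mathbf{Y}_i^T [M] \end{bmatrix}^T = \begin{bmatrix} [P_i] & 2[M]\mathbf{Y}_i \end{bmatrix}$$

we obtain

$$\left([P_i][P_i] + 4[M]\mathbf{Y}_i\mathbf{Y}_i^T[M]\right)\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\left([P_i]\frac{\partial [P_i]}{\partial x_j} + 2[M]\mathbf{Y}_i\mathbf{Y}_i^T\frac{\partial [M]}{\partial x_j}\right)\mathbf{Y}_i \tag{15.50}$$

The solution of Eq. (15.50) gives the desired expression for the derivative of the eigenvector, ∂ Y i /∂x j , as

$$\frac{\partial \mathbf{Y}_i}{\partial x_j} = -\left([P_i][P_i] + 4[M]\mathbf{Y}_i\mathbf{Y}_i^T[M]\right)^{-1}\left([P_i]\frac{\partial [P_i]}{\partial x_j} + 2[M]\mathbf{Y}_i\mathbf{Y}_i^T\frac{\partial [M]}{\partial x_j}\right)\mathbf{Y}_i \tag{15.51}$$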
Again it can be seen that only the eigenvalue and eigenvector under consideration are involved in the evaluation of the derivatives of eigenvectors.
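The eigenvector sensitivity can be sketched by stacking the differentiated eigenproblem and the differentiated mass normalization and solving the resulting normal equations. The model in `matrices` and all function names are assumptions for illustration; the sketch presumes a distinct eigenvalue and a mass-normalized eigenvector.

```python
import numpy as np

def generalized_eigh(K, M):
    """Solve K Y = lam M Y with mass-normalized eigenvectors (Y^T M Y = I)."""
    L = np.linalg.cholesky(M)
    L_inv = np.linalg.inv(L)
    lam, V = np.linalg.eigh(L_inv @ K @ L_inv.T)
    return lam, L_inv.T @ V

def eigenvector_derivative(K, M, lam_i, Y_i, dK, dM):
    """dY_i/dx_j from the normal equations of the stacked (n+1) x n system."""
    P_i = K - lam_i * M
    dlam = Y_i @ (dK - lam_i * dM) @ Y_i           # eigenvalue derivative
    dP_i = dK - dlam * M - lam_i * dM              # derivative of [P_i]
    MY = M @ Y_i
    A = P_i @ P_i + 4.0 * np.outer(MY, MY)
    b = -(P_i @ dP_i + 2.0 * np.outer(MY, Y_i) @ dM) @ Y_i
    return np.linalg.solve(A, b)

def matrices(x):
    """Hypothetical 2-DOF model: first spring and first mass depend on x."""
    K = np.array([[x + 2.0, -2.0], [-2.0, 2.0]])
    M = np.diag([1.0 + 0.1 * x, 2.0])
    return K, M

x0 = 3.0
K, M = matrices(x0)
dK = np.array([[1.0, 0.0], [0.0, 0.0]])
dM = np.diag([0.1, 0.0])
lam, Y = generalized_eigh(K, M)
dY0 = eigenvector_derivative(K, M, lam[0], Y[:, 0], dK, dM)
```

The coefficient matrix `A` is nonsingular for a distinct eigenvalue because the normalization row removes the null direction of [P i ].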
For illustration, a cylindrical cantilever beam is considered [4]. The beam is modeled with three finite elements with six degrees of freedom as indicated in Figure 15.4. The diameters of the beam are considered as the design variables, x i , i = 1, 2, 3. The first three eigenvalues and their derivatives are shown in Table 15.3 [4].
Table 15.3 Derivatives of eigenvalues [4].
i | Eigenvalue, λ i | ||||
1 | 24.66 | 0.3209 | −0.1582 | 1.478 | −2.298 |
2 | 974.7 | 3.86 | −0.4144 | 0.057 | −3.046 |
3 | 7782.0 | 23.5 | 21.67 | 0.335 | −5.307 |
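The equations of motion of an n-degree-of-freedom mechanical/structural system with viscous damping can be expressed as [10]

$$[M]\ddot{\mathbf{Y}} + [C]\dot{\mathbf{Y}} + [K]\mathbf{Y} = \mathbf{F}(t) \tag{15.52}$$

where [M], [C], and [K] are the n × n mass, damping, and stiffness matrices, respectively, F(t) is the n-component force vector, Y is the n-component displacement vector, and a dot over a symbol indicates differentiation with respect to time. Equation (15.52) denotes a set of n coupled second-order differential equations. In most practical cases, n will be very large and the equations are stiff; hence the numerical solution of Eq. (15.52) is tedious and yields an accurate solution only for the low-frequency components. To reduce the size of the problem, the displacement solution, Y, is expressed in terms of r basis functions Φ 1, Φ 2, …, and Φ r (with r ≪ n) as

$$\mathbf{Y} = [\Phi]\,\mathbf{q}, \quad \text{that is,} \quad y_j = \sum_{k=1}^{r} \Phi_{jk}\, q_k(t), \quad j = 1, 2, \ldots, n \tag{15.53}$$

where

$$[\Phi] = [\boldsymbol{\Phi}_1 \; \boldsymbol{\Phi}_2 \; \cdots \; \boldsymbol{\Phi}_r]$$

is the matrix of basis functions, Φ jk the element in row j and column k of the matrix [Φ], q an r-component vector of reduced coordinates, and q k (t) the kth component of the vector q. By substituting Eq. (15.53) into Eq. (15.52) and premultiplying the resulting equation by [Φ]T, we obtain a system of r differential equations:

$$[\tilde{M}]\ddot{\mathbf{q}} + [\tilde{C}]\dot{\mathbf{q}} + [\tilde{K}]\mathbf{q} = \tilde{\mathbf{F}}(t) \tag{15.54}$$

where

$$[\tilde{M}] = [\Phi]^T[M][\Phi] \tag{15.55}$$

$$[\tilde{C}] = [\Phi]^T[C][\Phi] \tag{15.56}$$

$$[\tilde{K}] = [\Phi]^T[K][\Phi] \tag{15.57}$$

$$\tilde{\mathbf{F}}(t) = [\Phi]^T\mathbf{F}(t) \tag{15.58}$$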
Note that if the undamped natural modes of vibration are used as basis functions and if [C] is assumed to be a linear combination of [M] and [K] (called proportional damping), Eq. (15.54) represent a set of r uncoupled second‐order differential equations which can be solved independently [10]. Once q(t) is found, the displacement solution Y(t) can be determined from Eq. (15.53).
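A minimal sketch of the reduction, using undamped, mass-normalized modes as the basis and proportional damping, is given below. The three-degree-of-freedom matrices and the damping coefficients `alpha` and `beta` are hypothetical values, and the function names are assumptions.

```python
import numpy as np

def mass_normalized_modes(K, M, r):
    """Lowest r eigenvalues and mass-normalized mode shapes of K Y = lam M Y."""
    L = np.linalg.cholesky(M)
    L_inv = np.linalg.inv(L)
    lam, V = np.linalg.eigh(L_inv @ K @ L_inv.T)
    return lam[:r], (L_inv.T @ V)[:, :r]

def reduce_system(M, C, K, F, Phi):
    """Project the n-DOF matrices and load onto the r basis vectors in Phi."""
    return Phi.T @ M @ Phi, Phi.T @ C @ Phi, Phi.T @ K @ Phi, Phi.T @ F

# Hypothetical 3-DOF system with proportional damping C = alpha M + beta K.
K = np.array([[20.0, -10.0, 0.0],
              [-10.0, 20.0, -10.0],
              [0.0, -10.0, 10.0]])
M = np.diag([1.0, 1.0, 0.5])
alpha, beta = 0.1, 0.002
C = alpha * M + beta * K
F = np.array([0.0, 0.0, 1.0])      # load amplitude at one instant of time

lam, Phi = mass_normalized_modes(K, M, r=2)
M_r, C_r, K_r, F_r = reduce_system(M, C, K, F, Phi)
```

With this choice of basis the reduced matrices come out diagonal, so each reduced coordinate satisfies its own second-order equation.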
In the formulation of optimization problems with restrictions on the dynamic response, the constraints are placed on selected displacement components as

$$|y_j(\mathbf{X}, t)| \le y_{\max} \tag{15.59}$$

where y j is the displacement at location j on the machine/structure and y max is the maximum permissible value of the displacement. Constraints on dynamic stresses are also stated in a similar manner. Since Eq. (15.59) is a parametric constraint in terms of the parameter time (t), it is enforced, for computational simplicity, only at a set of peak or critical values of y j . Once Eq. (15.59) is satisfied at the critical points, it will most likely be satisfied at all other values of t as well [11,12]. The values of y j at which dy j /dt = 0 and the values of y j at the end of the time interval denote local maxima and hence are to be considered as candidate critical points. Among the several candidate critical points, only a select number are considered to simplify the computations. For example, in the response shown in Figure 15.5, peaks a, b, c, …, j qualify as candidate critical points. However, peaks a, b, f, and j can be discarded because their magnitudes are considerably smaller (for example, less than 25%) than those of the other peaks. Noting that peaks d and e (or g and h) represent essentially a single large peak with high-frequency undulations, we can discard peak e (or g), which has a slightly smaller magnitude than d (or h). Thus, finally, only peaks c, d, h, and i need to be considered to satisfy the constraint, Eq. (15.59).
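The first screening step described above, collecting the peaks and dropping the small ones, can be sketched for a sampled response. The function name, the sample data, and the use of a 25% relative threshold on |y| are assumptions; merging neighboring peaks that belong to one undulating maximum is not attempted here.

```python
import numpy as np

def candidate_critical_points(t, y, rel_threshold=0.25):
    """Return times and values of the significant peaks of |y| in a sampled response."""
    mag = np.abs(y)
    # Interior samples where |y| has a local maximum (discrete dy/dt = 0).
    interior = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:]))[0] + 1
    # The end of the time interval is always kept as a candidate.
    idx = np.append(interior, len(y) - 1).astype(int)
    # Discard candidates much smaller than the largest one.
    keep = mag[idx] >= rel_threshold * mag[idx].max()
    return t[idx[keep]], y[idx[keep]]

# Hypothetical sampled response.
t = np.arange(10.0)
y = np.array([0.0, 1.0, 0.0, -0.1, 0.0, 3.0, 0.0, -2.0, 0.0, 0.2])
t_crit, y_crit = candidate_critical_points(t, y)
```

Here the small peak at t = 3 and the small end value at t = 9 are dropped, leaving the candidates at t = 1, 5, and 7.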
Once the critical points are identified at a reference design X, the sensitivity of the response, y j (X, t), with respect to the design variable x i at the critical point t = t c can be found using the total derivative of y j as

$$\frac{dy_j}{dx_i} = \frac{\partial y_j}{\partial x_i} + \frac{\partial y_j}{\partial t}\frac{dt_c}{dx_i} \tag{15.60}$$

The second term on the right-hand side of Eq. (15.60) is always zero since ∂y j /∂t = 0 at an interior peak (0 < t c < t max) and dt c /dx i = 0 at the boundary (t c = t max). The derivative, ∂y j /∂x i , can be computed using Eq. (15.53) as

$$\frac{\partial y_j}{\partial x_i} = \sum_{k=1}^{r} \Phi_{jk}\frac{\partial q_k}{\partial x_i} \tag{15.61}$$

where, for simplicity, the elements of the matrix [Φ] are assumed to be constants (independent of the design vector X). Note that for higher accuracy, the derivatives of Φ jk with respect to x i (sensitivity of eigenvectors, if the mode shapes are used as the basis vectors) obtained from an equation similar to Eq. (15.51) can be included in finding ∂y j /∂x i .

To find the values of ∂q k /∂x i required in Eq. (15.61), Eq. (15.54) is differentiated with respect to x i to obtain

$$[\tilde{M}]\frac{\partial \ddot{\mathbf{q}}}{\partial x_i} + [\tilde{C}]\frac{\partial \dot{\mathbf{q}}}{\partial x_i} + [\tilde{K}]\frac{\partial \mathbf{q}}{\partial x_i} = \frac{\partial \tilde{\mathbf{F}}}{\partial x_i} - \frac{\partial [\tilde{M}]}{\partial x_i}\ddot{\mathbf{q}} - \frac{\partial [\tilde{C}]}{\partial x_i}\dot{\mathbf{q}} - \frac{\partial [\tilde{K}]}{\partial x_i}\mathbf{q} \tag{15.62}$$

The derivatives of the matrices appearing on the right-hand side of Eq. (15.62) can be computed using formulas such as

$$\frac{\partial [\tilde{M}]}{\partial x_i} = [\Phi]^T\frac{\partial [M]}{\partial x_i}[\Phi] \tag{15.63}$$

where, for simplicity, [Φ] is assumed to be constant and ∂[M]/∂x i is computed using a finite-difference scheme. In most cases the forcing function F(t) will be known to be independent of X or an explicit function of X. Hence the quantity ∂F̃/∂x i can be evaluated without much difficulty. Once the right-hand side is known, Eq. (15.62) can be integrated numerically in time to find the values of ∂q̈/∂x i , ∂q̇/∂x i , and ∂ q /∂x i . Using the values of ∂ q /∂x i = {∂q k /∂x i } at the critical point t c , the required sensitivity of transient response can be found from Eq. (15.61).
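As an illustrative sketch, consider the smallest possible case of Eq. (15.62): a single reduced coordinate (r = 1) under a step load, with the reduced stiffness k itself taken as the design variable. Then the mass, damping, and load derivatives vanish and the sensitivity s = ∂q/∂k satisfies m s̈ + c ṡ + k s = −q. The parameter values, the fixed-step fourth-order Runge-Kutta integrator, and the function names are all assumptions made for this example.

```python
import numpy as np

def rhs(z, m, c, k, F):
    """State derivative for z = [q, q_dot, s, s_dot], where s = dq/dk."""
    q, qd, s, sd = z
    return np.array([
        qd,
        (F - c * qd - k * q) / m,      # reduced equation of motion
        sd,
        (-q - c * sd - k * s) / m,     # sensitivity equation (d k/d k = 1)
    ])

def integrate(m, c, k, F, t_end=2.0, steps=2000):
    """Integrate response and sensitivity together from rest with RK4."""
    z = np.zeros(4)                    # initial conditions do not depend on k
    h = t_end / steps
    for _ in range(steps):
        r1 = rhs(z, m, c, k, F)
        r2 = rhs(z + 0.5 * h * r1, m, c, k, F)
        r3 = rhs(z + 0.5 * h * r2, m, c, k, F)
        r4 = rhs(z + h * r3, m, c, k, F)
        z = z + (h / 6.0) * (r1 + 2.0 * r2 + 2.0 * r3 + r4)
    return z

m, c, k, F = 1.0, 0.4, 25.0, 1.0       # hypothetical reduced system
q_end, _, dq_dk, _ = integrate(m, c, k, F)
```

Integrating the response and its sensitivity in one pass avoids the repeated transient analyses that a finite-difference estimate of ∂q/∂k would require.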
Any optimum design problem involves a design vector and a set of problem parameters (or preassigned parameters). In many cases, we would be interested in knowing the sensitivities or derivatives of the optimum design (design variables and objective function) with respect to the problem parameters [17,18]. As an example, consider the minimum weight design of a machine component or structure subject to a constraint on the induced stress. After solving the problem, we may wish to find the effect of changing the material. This means that we would like to know the changes in the optimal dimensions and the minimum weight of the component or structure due to a change in the value of the permissible stress. Usually, the sensitivity derivatives are found by using a finite‐difference method. But this requires a costly reoptimization of the problem using incremented values of the parameters. Hence, it is desirable to derive expressions for the sensitivity derivatives from appropriate equations. In this section we discuss two approaches: one based on the Kuhn–Tucker conditions and the other based on the concept of feasible direction.
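The Kuhn–Tucker conditions satisfied at the constrained optimum design X * are given by [see Eqs. (2.73) and (2.74)]

$$\frac{\partial f(\mathbf{X})}{\partial x_i} + \sum_{j \in J_1} \lambda_j \frac{\partial g_j(\mathbf{X})}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n \tag{15.64}$$

$$g_j(\mathbf{X}) = 0, \quad j \in J_1 \tag{15.65}$$

$$\lambda_j > 0, \quad j \in J_1 \tag{15.66}$$

where J 1 is the set of active constraints and Eqs. (15.64) to (15.66) are valid with X = X * and λ j = λ j *. When a problem parameter changes by a small amount, we assume that Eqs. (15.64) to (15.66) remain valid. Treating f, g j , X, and λ j as functions of a typical problem parameter p, differentiation of Eqs. (15.64) and (15.65) with respect to p leads to

$$\sum_{k=1}^{n}\left[\frac{\partial^2 f}{\partial x_i\,\partial x_k} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j}{\partial x_i\,\partial x_k}\right]\frac{\partial x_k}{\partial p} + \sum_{j \in J_1}\frac{\partial \lambda_j}{\partial p}\frac{\partial g_j}{\partial x_i} + \frac{\partial^2 f}{\partial x_i\,\partial p} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j}{\partial x_i\,\partial p} = 0, \quad i = 1, 2, \ldots, n \tag{15.67}$$

$$\frac{\partial g_j}{\partial p} + \sum_{i=1}^{n}\frac{\partial g_j}{\partial x_i}\frac{\partial x_i}{\partial p} = 0, \quad j \in J_1 \tag{15.68}$$

Equations (15.67) and (15.68) can be expressed in matrix form as

$$\begin{bmatrix} [P] & [Q] \\ [Q]^T & [0] \end{bmatrix}\begin{Bmatrix} \dfrac{\partial \mathbf{X}}{\partial p} \\ \dfrac{\partial \boldsymbol{\lambda}}{\partial p} \end{Bmatrix} + \begin{Bmatrix} \mathbf{a} \\ \mathbf{b} \end{Bmatrix} = \begin{Bmatrix} \mathbf{0} \\ \mathbf{0} \end{Bmatrix} \tag{15.69}$$

where q denotes the number of active constraints, [P] is n × n, [Q] is n × q, a has n components, b has q components, and the elements of the matrices and vectors in Eq. (15.69) are given by

$$P_{ik} = \frac{\partial^2 f}{\partial x_i\,\partial x_k} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j}{\partial x_i\,\partial x_k}, \qquad Q_{ij} = \frac{\partial g_j}{\partial x_i}$$

$$a_i = \frac{\partial^2 f}{\partial x_i\,\partial p} + \sum_{j \in J_1}\lambda_j\frac{\partial^2 g_j}{\partial x_i\,\partial p}, \qquad b_j = \frac{\partial g_j}{\partial p}$$

$$\frac{\partial \mathbf{X}}{\partial p} = \left\{\frac{\partial x_1}{\partial p}, \ldots, \frac{\partial x_n}{\partial p}\right\}^T, \qquad \frac{\partial \boldsymbol{\lambda}}{\partial p} = \left\{\frac{\partial \lambda_1}{\partial p}, \ldots, \frac{\partial \lambda_q}{\partial p}\right\}^T$$

with all quantities evaluated at X = X * and λ j = λ j *.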
The following can be noted in Eq. (15.69):
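Once Eq. (15.69) is solved, the sensitivity of the optimum objective value with respect to p can be computed as

$$\frac{df(\mathbf{X})}{dp} = \frac{\partial f(\mathbf{X})}{\partial p} + \sum_{i=1}^{n}\frac{\partial f(\mathbf{X})}{\partial x_i}\frac{\partial x_i}{\partial p}$$

The changes in the optimum values of x i and f necessary to satisfy the Kuhn–Tucker conditions due to a change Δp in the problem parameter can be estimated as

$$\Delta x_i = \frac{\partial x_i}{\partial p}\,\Delta p, \qquad \Delta f = \frac{df}{dp}\,\Delta p$$

The changes in the values of Lagrange multiplier λ j due to Δp can be estimated as

$$\Delta \lambda_j = \frac{\partial \lambda_j}{\partial p}\,\Delta p \tag{15.77}$$

Equation (15.77) can be used to determine whether an originally active constraint becomes inactive due to the change, Δp. Since the value of λ j is zero for an inactive constraint, we have

$$\lambda_j + \Delta\lambda_j = \lambda_j + \frac{\partial \lambda_j}{\partial p}\,\Delta p = 0$$

from which the value of Δp necessary to make the jth constraint inactive can be found as

$$\Delta p = -\frac{\lambda_j}{\partial \lambda_j/\partial p}, \quad j \in J_1$$

Similarly, a currently inactive constraint will become critical due to Δp if the new value of g j becomes zero:

$$g_j(\mathbf{X}) + \frac{dg_j}{dp}\,\Delta p = 0, \qquad \frac{dg_j}{dp} = \frac{\partial g_j}{\partial p} + \sum_{i=1}^{n}\frac{\partial g_j}{\partial x_i}\frac{\partial x_i}{\partial p}$$

Thus, the change Δp necessary to make an inactive constraint active can be found as

$$\Delta p = -\frac{g_j(\mathbf{X})}{dg_j/dp}$$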
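The Kuhn–Tucker sensitivity calculation can be sketched on a problem small enough to verify by hand: minimize f = x 1² + x 2² subject to g = p − x 1 − x 2 ≤ 0, whose optimum is x 1 = x 2 = p/2 with multiplier λ = p. This test problem and the function name `kt_sensitivity` are assumptions made for illustration; the linear system is the matrix form of the differentiated Kuhn–Tucker conditions with the terms a and b moved to the right-hand side.

```python
import numpy as np

def kt_sensitivity(P, Q, a, b):
    """Solve [[P, Q], [Q^T, 0]] {dX/dp, dlam/dp} = -{a, b}."""
    n, q = Q.shape
    A = np.block([[P, Q], [Q.T, np.zeros((q, q))]])
    sol = np.linalg.solve(A, -np.concatenate([a, b]))
    return sol[:n], sol[n:]

# Known optimum of the hypothetical problem at p = 2.
p = 2.0
x_opt = np.array([p / 2.0, p / 2.0])
lam_opt = np.array([p])

P = 2.0 * np.eye(2)                 # Hessian of f; g has no second derivatives
Q = np.array([[-1.0], [-1.0]])      # dg/dx_i
a = np.zeros(2)                     # no mixed x-p second derivatives
b = np.array([1.0])                 # dg/dp

dx_dp, dlam_dp = kt_sensitivity(P, Q, a, b)

grad_f = 2.0 * x_opt
df_dp = 0.0 + grad_f @ dx_dp        # df/dp = (partial f/partial p) + grad f . dX/dp

# Change in p that would drive the multiplier to zero (constraint inactive).
dp_inactive = -lam_opt / dlam_dp
```

The results agree with the closed-form optimum: x i = p/2 gives ∂x i /∂p = 0.5, λ = p gives ∂λ/∂p = 1, and f* = p²/2 gives df*/dp = p.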
Here we treat the problem parameter p as a design variable so that the new design vector becomes

$$\mathbf{X} = \{x_1, x_2, \ldots, x_n, p\}^T \tag{15.82}$$
As in the case of the method of feasible directions (see Section 7.7), we formulate the direction finding problem as
subject to
where the gradients of f and g j (j ∈ J 1) can be evaluated in the usual manner. The set J 1 can also include nearly active constraints (along with the active constraints), so that we do not violate any constraint due to the change, Δp. The solution of the problem stated in Eq. (15.83) gives a usable feasible search direction, S. A new design vector along S can be expressed as
where λ is the step length and the components of S can be considered as
so that
If the vector S is normalized by dividing its components by s n+1, Eq. (15.86) gives λ = Δp and hence Eq. (15.85) gives the desired sensitivity derivatives as
Thus the sensitivity of the objective function with respect to p can be computed as
Note that unlike the previous method, this method does not require the values of λ* and the second derivatives of f and g j to find the sensitivity derivatives. Also, if sensitivities with respect to several problem parameters p 1, p 2, … are required, all we need to do is to add them to the design vector X in Eq. (15.82).
with
If A 1 = 2 in.², A 2 = 1 in.², E 1 = E 2 = 30 × 10⁶ psi, 2 l 1 = l 2 = 50 in., P 1 = 100 lb, and P 2 = 200 lb, determine
where ρ i , A i , and l i denote the mass density, area of cross section, and length of the segment i, and the stiffness matrix, [K], is given by Eq. (2) of Problem 9. If A 1 = 2 in.², A 2 = 1 in.², E 1 = E 2 = 30 × 10⁶ psi, 2 l 1 = l 2 = 50 in., and ρ 1 g = ρ 2 g = 0.283 lb/in.³, determine
where I is the area moment of inertia of the cross section, E is Young's modulus, and l the length. Determine the displacements, Y i , and the sensitivities of the deflections, ∂Y i /∂d and ∂Y i /∂t (i = 1, 2), for the following data: E = 30 × 10⁶ psi, l = 20 in., d = 2 in., t = 0.1 in., P 1 = 100 lb, and P 2 = 0.
where E is Young's modulus, I the area moment of inertia, l the length, ρ the mass density, A the cross-sectional area, λ the eigenvalue, and Y = {Y 1, Y 2}T the eigenvector. If E = 30 × 10⁶ psi, d = 2 in., t = 0.1 in., l = 20 in., and ρg = 0.283 lb/in.³, determine
where ω 1 and ω 2 are the natural frequencies of vibration of the system and c 1 and c 2 are constants. The stiffness of each helical spring is given by
where d is the wire diameter, D the coil diameter, G the shear modulus, and n the number of turns of the spring. Determine the values of ∂ω i /∂D and ∂ Y i /∂D for the following data: d = 0.04 in., G = 11.5 × 10⁶ psi, D = 0.4 in., n = 10, and m = 32.2 lb-s²/in. The stiffness and mass matrices of the system are given by