5 Advanced evolutionary approaches

This chapter covers

  • Considering options for the various steps in the genetic algorithm life cycle

  • Adjusting a genetic algorithm to solve varying problems

  • The advanced parameters for configuring a genetic algorithm life cycle based on different scenarios, problems, and datasets

NOTE Chapter 4 is a prerequisite to this chapter.

Evolutionary algorithm life cycle

The general life cycle of a genetic algorithm is outlined in chapter 4. In this chapter, we consider other problems that may be suitable to be solved with a genetic algorithm, why some of the approaches demonstrated thus far won’t work, and alternative approaches.

As a reminder, the general life cycle of a genetic algorithm is as follows:

  • Creating a population—Creating a random population of potential solutions.

  • Measuring fitness of individuals in the population—Determining how good a specific solution is. This task is accomplished by using a fitness function that scores solutions to determine how good they are.

  • Selecting parents based on their fitness—Selecting pairs of parents that will reproduce offspring.

  • Reproducing individuals from parents—Creating offspring from their parents by mixing genetic information and applying slight mutations to the offspring.

  • Populating the next generation—Selecting individuals and offspring from the population that will survive to the next generation.

Keep the life cycle flow (depicted in figure 5.1) in mind as we work through this chapter.

Figure 5.1 Genetic algorithm life cycle

This chapter starts by exploring alternative selection strategies; these individual approaches can be generically swapped in and out for any genetic algorithm. Then it follows three scenarios that are tweaks of the Knapsack Problem (chapter 4) to highlight the utility of the alternative encoding, crossover, and mutation approaches (figure 5.2).

Figure 5.2 The example Knapsack Problem

Alternative selection strategies

In chapter 4, we explored one selection strategy: roulette-wheel selection, which is one of the simplest methods for selecting individuals. The following three selection strategies help mitigate the problems of roulette-wheel selection; each has advantages and disadvantages that affect the diversity of the population, which ultimately affects whether an optimal solution is found.

Rank selection: Even the playing field

One problem with roulette-wheel selection is the vast differences in the magnitude of fitness between chromosomes. This heavily biases the selection toward choosing individuals with high fitness scores or giving poor-performing individuals a larger chance of selection than desired. This problem affects the diversity of the population. More diversity means more exploration of the search space, but it can also make finding optimal solutions take too many generations.

Rank selection aims to solve this problem by ranking individuals based on their fitness and then using each individual’s rank as the value for calculating the size of its slice on the wheel. In the Knapsack Problem, this value is a number between 1 and 16, because we’re choosing among 16 individuals. Although strong individuals are more likely to be selected and weaker ones are less likely to be selected even though they are average, each individual has a fairer chance of being selected based on rank rather than exact fitness. When 16 individuals are ranked, the wheel looks slightly different from roulette-wheel selection (figure 5.3).

Figure 5.3 Example of rank selection

Figure 5.4 compares roulette-wheel selection and rank selection. It is clear that rank selection gives better-performing solutions a better chance of selection.

Figure 5.4 Roulette-wheel selection vs. rank selection

Tournament selection: Let them fight

Tournament selection plays chromosomes against one other. Tournament selection randomly chooses a set number of individuals from the population and places them in a group. This process is performed for a predetermined number of groups. The individual with the highest fitness score in each respective group is selected. The larger the group, the less diverse it is, because only one individual from each group is selected. As with rank selection, the actual fitness score of each individual is not the key factor in selecting individuals globally.

When 16 individuals are allocated to four groups, selecting only 1 individual from each group results in the choice of 4 of the strongest individuals from those groups. Then the 4 winning individuals can be paired to reproduce (figure 5.5).

Figure 5.5 Example of tournament selection

Elitism selection: Choose only the best

The elitism approach selects the best individuals in the population. Elitism is useful for retaining strong-performing individuals and eliminating the risk that they will be lost through other selection methods. The disadvantage of elitism is that the population can fall into a local best solution space and never be diverse enough to find global bests.

Elitism is often used in conjunction with roulette-wheel selection, rank selection, and tournament selection. The idea is that several elite individuals are selected to reproduce, and the rest of the population is filled with individuals by means of one of the other selection strategies (figure 5.6).

Figure 5.6 Example of elitism selection

Chapter 4 explores a problem in which including items in or excluding items from the knapsack was important. A variety of problem spaces require a different encoding because binary encoding won’t make sense. The following three sections describe these scenarios.

Real-value encoding: Working with real numbers

Consider that the Knapsack Problem has changed slightly. The problem remains choosing the most valuable items to fill the weight capacity of the knapsack. But the choice involves more than one unit of each item. As shown in table 5.1, the weights and values remain the same as the original dataset, but a quantity of each item is included. With this slight adjustment, a plethora of new solutions are possible, and one or more of those solutions may be more optimal, because a specific item can be selected more than once. Binary encoding is a poor choice in this scenario. Real-value encoding is better suited to representing the state of potential solutions.

Table 5.1 Knapsack capacity: 6,404,180 kg

Item ID

Item name

Weight (kg)

Value ($)

Quantity

1

Axe

32,252

68,674

19

2

Bronze coin

225,790

471,010

14

3

Crown

468,164

944,620

2

4

Diamond statue

489,494

962,094

9

5

Emerald belt

35,384

78,344

11

6

Fossil

265,590

579,152

6

7

Gold coin

497,911

902,698

4

8

Helmet

800,493

1,686,515

10

9

Ink

823,576

1,688,691

7

10

Jewel box

552,202

1,056,157

3

11

Knife

323,618

677,562

5

12

Long sword

382,846

833,132

13

13

Mask

44,676

99,192

15

14

Necklace

169,738

376,418

8

15

Opal badge

610,876

1,253,986

4

16

Pearls

854,190

1,853,562

9

17

Quiver

671,123

1,320,297

12

18

Ruby ring

698,180

1,301,637

17

19

Silver bracelet

446,517

859,835

16

20

Timepiece

909,620

1,677,534

7

21

Uniform

904,818

1,910,501

6

22

Venom potion

730,061

1,528,646

9

23

Wool scarf

931,932

1,827,477

3

24

Crossbow

952,360

2,068,204

1

25

Yesteryear book

926,023

1,746,556

7

26

Zinc cup

978,724

2,100,851

2

Real-value encoding at its core

Real-value encoding represents a gene in terms of numeric values, strings, or symbols, and expresses potential solutions in the natural state respective to the problem. This encoding is used when potential solutions contain continuous values that cannot be encoded easily with binary encoding. As an example, because more than one item is available to be carried in the knapsack, each item index cannot indicate only whether the item is included; it must indicate the quantity of that item in the knapsack (figure 5.7).

Figure 5.7 Example of real-value encoding

Because the encoding scheme has been changed, new crossover and mutation options become available. The crossover approaches discussed for binary encoding are still valid options to real-value encoding, but mutation should be approached differently.

Arithmetic crossover: Reproduce with math

Arithmetic crossover involves an arithmetic operation to be computed by using each parent as variables in the expression. The result of applying an arithmetic operation using both parents is the new offspring. When we use this strategy with binary encoding, it is important to ensure that the result of the operation is still a valid chromosome. Arithmetic crossover is applicable to binary encoding and real-value encoding (figure 5.8).

NOTE Be wary: this approach can create very diverse offspring, which can be problematic.

Figure 5.8 Example of arithmetic crossover

Boundary mutation

In boundary mutation, a gene randomly selected from a real-value encoded chromosome is set randomly to a lower bound value or upper bound value. Given 26 genes in a chromosome, a random index is selected, and the value is set to either a minimum value or a maximum value. In figure 5.9, the original value happens to be 0 and will be adjusted to 6, which is the maximum for that item. The minimum and maximum can be the same for all indexes or set uniquely for each index if knowledge of the problem informs the decision. This approach attempts to evaluate the impact of individual genes on the chromosome.

Figure 5.9 Example of boundary mutation

Arithmetic mutation

In arithmetic mutation, a randomly selected gene in a real-value-encoded chromosome is changed by adding or subtracting a small number. Note that although the example in figure 5.10 includes whole numbers, the numbers could be decimal numbers, including fractions.

Figure 5.10 Example of arithmetic mutation

Order encoding: Working with sequences

We still have the same items as in the Knapsack Problem. We won’t be determining the items that will fit into a knapsack; instead, all the items need to be processed in a refinery in which each item is broken down to extract its source material. Perhaps the gold coin, silver bracelet, and other items are smelted to extract only the source compounds. In this scenario, items are not selected to be included, but all are included.

To make things interesting, the refinery requires a steady rate of extraction, given the extraction time and the value of the item. It’s assumed that the value of the refined material is more or less the same as the value of the item. The problem becomes an ordering problem. In what order should the items be processed to maintain a constant rate of value? Table 5.2 describes the items with their respective extraction times.

Table 5.2 Factory value per hour: 600,000

Item ID

Item name

Weight (kg)

Value ($)

Extraction time

1

Axe

32,252

68,674

60

2

Bronze coin

225,790

471,010

30

3

Crown

468,164

944,620

45

4

Diamond statue

489,494

962,094

90

5

Emerald belt

35,384

78,344

70

6

Fossil

265,590

579,152

20

7

Gold coin

497,911

902,698

15

8

Helmet

800,493

1,686,515

20

9

Ink

823,576

1,688,691

10

10

Jewel box

552,202

1,056,157

40

11

Knife

323,618

677,562

15

12

Long sword

382,846

833,132

60

13

Mask

44,676

99,192

10

14

Necklace

169,738

376,418

20

15

Opal badge

610,876

1,253,986

60

16

Pearls

854,190

1,853,562

25

17

Quiver

671,123

1,320,297

30

18

Ruby ring

698,180

1,301,637

70

19

Silver bracelet

446,517

859,835

50

20

Timepiece

909,620

1,677,534

45

21

Uniform

904,818

1,910,501

5

22

Venom potion

730,061

1,528,646

5

23

Wool scarf

931,932

1,827,477

5

24

Crossbow

952,360

2,068,204

25

25

Yesteryear book

926,023

1,746,556

5

26

Zinc cup

978,724

2,100,851

10

Importance of the fitness function

With the change in the Knapsack Problem to the Refinery Problem, a key difference is the measurement of successful solutions. Because the factory requires a constant minimum rate of value per hour, the accuracy of the fitness function used becomes paramount to finding optimal solutions. In the Knapsack Problem, the fitness of a solution is trivial to compute, as it involves only two things: ensuring that the knapsack’s weight limit is respected and summing the selected items’ value. In the Refinery Problem, the fitness function must calculate the rate of value provided, given the extraction time for each item as well as the value of each item. This calculation is more complex, and an error in the logic of this fitness function directly influences the quality of solutions.

Order encoding at its core

Order encoding, also known as permutation encoding, represents a chromosome as a sequence of elements. Order encoding usually requires all elements to be present in the chromosome, which implies that corrections might need to be made when performing crossover and mutation to ensure that no elements are missing or duplicated. Figure 5.11 depicts how a chromosome represents the order of processing of the available items.

Figure 5.11 Example of order encoding

Another example in which order encoding is sensible is representing potential solutions to route optimization problems. Given a certain number of destinations, each of which must be visited at least once while minimizing the total distance traveled, the route can be represented as a string of the destinations in the order in which they are visited. We will use this example when covering swarm intelligence in chapter 6.

Order mutation: Order/permutation encoding

In order mutation, two randomly selected genes in an order-encoded chromosome swap positions, ensuring that all items remain in the chromosome while introducing diversity (figure 5.12).

Figure 5.12 Example of order mutation

Tree encoding: Working with hierarchies

The preceding sections show that binary encoding is useful for selecting items from a set, real-value encoding is useful when real numbers are important to the solution, and order encoding is useful for determining priority and sequences. Suppose that the items in the Knapsack Problem are placed in packages to be shipped to homes around the town. Each delivery wagon can hold a specific volume. The requirement is to determine the optimal positioning of packages to minimize empty space in each wagon (table 5.3).

Table 5.3 Wagon capacity: 1000 wide × 1000 high

Item ID

Item name

Weight (kg)

Value ($)

W

H

1

Axe

32,252

68,674

20

60

2

Bronze coin

225,790

471,010

10

10

3

Crown

468,164

944,620

20

20

4

Diamond statue

489,494

962,094

30

70

5

Emerald belt

35,384

78,344

30

20

6

Fossil

265,590

579,152

15

15

7

Gold coin

497,911

902,698

10

10

8

Helmet

800,493

1,686,515

40

50

9

Ink

823,576

1,688,691

5

10

10

Jewel box

552,202

1,056,157

40

30

11

Knife

323,618

677,562

10

30

12

Long sword

382,846

833,132

15

50

13

Mask

44,676

99,192

20

30

14

Necklace

169,738

376,418

15

20

15

Opal badge

610,876

1,253,986

5

5

16

Pearls

854,190

1,853,562

10

5

17

Quiver

671,123

1,320,297

30

70

18

Ruby ring

698,180

1,301,637

5

10

19

Silver bracelet

446,517

859,835

10

20

20

Timepiece

909,620

1,677,534

15

20

21

Uniform

904,818

1,910,501

30

40

22

Venom potion

730,061

1,528,646

15

15

23

Wool scarf

931,932

1,827,477

20

30

24

Crossbow

952,360

2,068,204

50

70

25

Yesteryear book

926,023

1,746,556

25

30

26

Zinc cup

978,724

2,100,851

15

25

In the interest of simplicity, suppose that the wagon’s volume is a two-dimensional rectangle and that the packages are rectangular rather than 3D boxes.

Tree encoding at its core

Tree encoding represents a chromosome as a tree of elements. Tree encoding is versatile for representing potential solutions in which the hierarchy of elements is important and/or required. Tree encoding can even represent functions, which consist of a tree of expressions. As a result, tree encoding could be used to evolve program functions in which the function solves a specific problem; the solution may work but look bizarre.

Here is an example in which tree encoding makes sense. We have a wagon with a specific height and width, and a certain number of packages must fit in the wagon. The goal is to fit the packages in the wagon so that empty space is minimized. A tree-encoding approach would work well in representing potential solutions to this problem.

In figure 5.13, the root node, node A, represents the packing of the wagon from top to bottom. Node B represents all packages horizontally, similarly to node C and node D. Node E represents packages packed vertically in its slice of the wagon.

Figure 5.13 Example of a tree used to represent the Wagon Packing Problem

Tree crossover: Inheriting portions of a tree

Tree crossover is similar to single-point crossover (chapter 4) in that a single point in the tree structure is selected and then the parts are exchanged and combined with copies of the parent individuals to create an offspring individual. The inverse process can be used to make a second offspring. The resulting children must be verified to be valid solutions that obey the constraints of the problem. More than one point can be used for crossover if using multiple points makes sense in solving the problem (figure 5.14).

Figure 5.14 Example of tree crossover

Change node mutation: Changing the value of a node

In change node mutation, a randomly selected node in a tree-encoded chromosome is changed to a randomly selected valid object for that node. Given a tree representing an organization of items, we can change an item to another valid item (figure 5.15).

Figure 5.15 Change node mutation in a tree

This chapter and chapter 4 cover several encoding schemes, crossover schemes, and selection strategies. You could substitute your own approaches for these steps in your genetic algorithms if doing so makes sense for the problem you’re solving.

Common types of evolutionary algorithms

This chapter focuses on the life cycle and alternative approaches for a genetic algorithm. Variations of the algorithm can be useful for solving different problems. Now that we have a grounding in how a genetic algorithm works, we’ll look at these variations and possible use cases for them.

Genetic programming

Genetic programming follows a process similar to that of genetic algorithms but is used primarily to generate computer programs to solve problems. The process described in the previous section also applies here. The fitness of potential solutions in a genetic programming algorithm is how well the generated program solves a computational problem. With this in mind, we see that the tree-encoding method would work well here, because most computer programs are graphs consisting of nodes that indicate operations and processes. These trees of logic can be evolved, so the computer program will be evolved to solve a specific problem. One thing to note: these computer programs usually evolve to look like a mess of code that’s difficult for people to understand and debug.

Evolutionary programming

Evolutionary programming is similar to genetic programming, but the potential solution is parameters for a predefined fixed computer program, not a generated computer program. If a program requires finely tuned inputs, and determining a good combination of inputs is difficult, a genetic algorithm can be used to evolve these inputs. The fitness of potential solutions in an evolutionary programming algorithm is determined by how well the fixed computer program performs based on the parameters encoded in an individual. Perhaps an evolutionary programming approach could be used to find good parameters for an artificial neural network (chapter 9).

Glossary of evolutionary algorithm terms

Here is a useful glossary of evolutionary algorithms terms for future research and learning:

  • Allele—The value of a specific gene in a chromosome

  • Chromosome—A collection of genes that represents a possible solution

  • Individual—A single chromosome in a population

  • Population—A collection of individuals

  • Genotype—The artificial representation of the potential solution population in the computation space

  • Phenotype—The actual representation of the potential solution population in the real world

  • Generation—A single iteration of the algorithm

  • Exploration—The process of finding a variety of possible solutions, some of which may be good and some of which may be bad

  • Exploitation—The process of honing in on good solutions and iteratively refining them

  • Fitness function—A particular type of objective function

  • Objective function—A function that attempts to maximize or minimize

More use cases for evolutionary algorithms

Some of the use cases for evolutionary algorithms are listed in chapter 4, but many more exist. The following use cases are particularly interesting because they use one or more of the concepts discussed in this chapter:

  • Adjusting weights in artificial neural networks—Artificial neural networks are discussed later, in chapter 9, but a key concept is adjusting weights in the network to learn patterns and relationships in data. Several mathematical techniques adjust weights, but evolutionary algorithms are more efficient alternatives in the right scenarios.

  • Electronic circuit design—Electronic circuits with the same components can be designed in many configurations. Some configurations are more efficient than others. If two components that work together often are closer together, this configuration may improve efficiency. Evolutionary algorithms can be used to evolve different circuit configurations to find the most optimal design.

  • Molecular structure simulation and design—As in electronic circuit design, different molecules behave differently and have their own advantages and disadvantages. Evolutionary algorithms can be used to generate different molecular structures to be simulated and studied to determine their behavioral properties.

Now that we’ve been through the general genetic algorithm life cycle in chapter 4 and some advanced approaches in this chapter, you should be equipped to apply evolutionary algorithms in your contexts and solutions.

Summary of advanced evolutionary approaches

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.226.170.187