Studying of NP-complete problems is for simplicity restricted to the problems, where the answer can be yes or no. Because there are tasks with complicated outputs, a class of problems called NP-hard problems has been introduced. This class is not as limited as class of NP-complete problems.
For NP-problems is characteristic that some simple algorithm to find a solution is obvious at a first sight - just trying all possible solutions. But this algorithm is very slow (usually O(2^n)) and even for a bit bigger instances of the problems it is not usable at all.
Today nobody knows if some faster exact algorithm exists. Proving or disproving this remains as a big task for new researchers (and maybe you! :-)). Today many people think, that such an algorithm does not exist and so they are looking for some alternative methods - example of these methods are genetic algorithms.
Examples of the NP problems are satisfiability problem, travelling salesman problem or knapsack problem. Compendium of NP problems is .
III. Genetic Algorithm
Basic Description
Genetic algorithms are inspired by Darwin's theory about evolution. Solution to a problem solved by genetic algorithms is evolved.
Algorithm is started with a set of solutions (represented by chromosomes) called population. Solutions from one population are taken and used to form a new population. This is motivated by a hope, that the new population will be better than the old one. Solutions which are selected to form new solutions (offspring) are selected according to their fitness - the more suitable they are the more chances they have to reproduce.
This is repeated until some condition (for example number of populations or improvement of the best solution) is satisfied.
Outline of the Basic Genetic Algorithm
-
[Start] Generate random population of n chromosomes (suitable solutions for the problem)
-
[Fitness] Evaluate the fitness f(x) of each chromosome x in the population
-
[New population] Create a new population by repeating following steps until the new population is complete
-
[Selection] Select two parent chromosomes from a population according to their fitness (the better fitness, the bigger chance to be selected)
-
[Crossover] With a crossover probability cross over the parents to form a new offspring (children). If no crossover was performed, offspring is an exact copy of parents.
-
[Mutation] With a mutation probability mutate new offspring at each locus (position in chromosome).
-
[Accepting] Place new offspring in a new population
-
[Replace] Use new generated population for a further run of algorithm
-
[Test] If the end condition is satisfied, stop, and return the best solution in current population
-
[Loop] Go to step 2
IV. Operators of GA
Overview
As you can see from the , the crossover and mutation are the most important part of the genetic algorithm. The performance is influenced mainly by these two operators. Before we can explain more about crossover and mutation, some information about chromosomes will be given.
Encoding of a Chromosome
The chromosome should in some way contain information about solution which it represents. The most used way of encoding is a binary string. The chromosome then could look like this:
Each chromosome has one binary string. Each bit in this string can represent some characteristic of the solution. Or the whole string can represent a number - this has been used in the basic .
Crossover
After we have decided what encoding we will use, we can make a step to crossover. Crossover selects genes from parent chromosomes and creates a new offspring. The simplest way how to do this is to choose randomly some crossover point and everything before this point point copy from a first parent and then everything after a crossover point copy from the second parent.
Crossover can then look like this ( | is the crossover point):
Mutation
After a crossover is performed, mutation take place. This is to prevent falling all solutions in population into a local optimum of solved problem. Mutation changes randomly the new offspring. For binary encoding we can switch a few randomly chosen bits from 1 to 0 or from 0 to 1. Mutation can then be following:
V. Parameters of GA
Crossover and Mutation Probability
There are two basic parameters of GA - crossover probability and mutation probability.
Crossover probability says how often will be crossover performed. If there is no crossover, offspring is exact copy of parents. If there is a crossover, offspring is made from parts of parents' chromosome. If crossover probability is 100%, then all offspring is made by crossover. If it is 0%, whole new generation is made from exact copies of chromosomes from old population (but this does not mean that the new generation is the same!).
Crossover is made in hope that new chromosomes will have good parts of old chromosomes and maybe the new chromosomes will be better. However it is good to leave some part of population survive to next generation.
Mutation probability says how often will be parts of chromosome mutated. If there is no mutation, offspring is taken after crossover (or copy) without any change. If mutation is performed, part of chromosome is changed. If mutation probability is100%, whole chromosome is changed, if it is 0%, nothing is changed.
Mutation is made to prevent falling GA into local extreme, but it should not occur very often, because then GA will in fact change to random search.
There are also some other parameters of GA. One also important parameter is population size.
Population size says how many chromosomes are in population (in one generation). If there are too few chromosomes, GA have a few possibilities to perform crossover and only a small part of search space is explored. On the other hand, if there are too many chromosomes, GA slows down. Research shows that after some limit (which depends mainly on encoding and the problem) it is not useful to increase population size, because it does not make solving the problem faster.
VI. Selection
Introduction
As you already know from the , chromosomes are selected from the population to be parents to crossover. The problem is how to select these chromosomes. According to Darwin's evolution theory the best ones should survive and create new offspring. There are many methods how to select the best chromosomes, for example roulette wheel selection, Boltzman selection, tournament selection, rank selection, steady state selection and some others.
Roulette Wheel Selection
Parents are selected according to their fitness. The better the chromosomes are, the more chances to be selected they have. Imagine a roulette wheel where are placed all chromosomes in the population, every has its place big accordingly to its fitness function, like on the following picture.
Then a marble is thrown there and selects the chromosome. Chromosome with bigger fitness will be selected more times.
This can be simulated by following algorithm.
-
[Sum] Calculate sum of all chromosome fitnesses in population - sum S.
-
[Select] Generate random number from interval (0,S) - r.
-
[Loop] Go through the population and sum fitnesses from 0 - sum s. When the sum s is greater then r, stop and return the chromosome where you are.
Of course, step 1 is performed only once for each population.
Rank Selection
The previous selection will have problems when the fitnesses differs very much. For example, if the best chromosome fitness is 90% of all the roulette wheel then the other chromosomes will have very few chances to be selected.
Rank selection first ranks the population and then every chromosome receives fitness from this ranking. The worst will have fitness 1, second worst 2 etc. and the best will have fitness N (number of chromosomes in population).
You can see in following picture, how the situation changes after changing fitness to order number.
Situation before ranking (graph of fitnesses)
Situation after ranking (graph of order numbers)
After this all the chromosomes have a chance to be selected. But this method can lead to slower convergence, because the best chromosomes do not differ so much from other ones.
Steady-State Selection
This is not particular method of selecting parents. Main idea of this selection is that big part of chromosomes should survive to next generation.
GA then works in a following way. In every generation are selected a few (good - with high fitness) chromosomes for creating a new offspring. Then some (bad - with low fitness) chromosomes are removed and the new offspring is placed in their place. The rest of population survives to new generation.
Elitism
Idea of elitism has been already introduced. When creating new population by crossover and mutation, we have a big chance, that we will lose the best chromosome.
Elitism is name of method, which first copies the best chromosome (or a few best chromosomes) to new population. The rest is done in classical way. Elitism can very rapidly increase performance of GA, because it prevents losing the best found solution.
VII. Encoding
Introduction
Encoding of chromosomes is one of the problems, when you are starting to solve problem with GA. Encoding very depends on the problem.
In this chapter will be introduced some encodings, which have been already used with some success.
Binary Encoding
Binary encoding is the most common, mainly because first works about GA used this type of encoding.
In binary encoding every chromosome is a string of bits, 0 or 1.
Example of chromosomes with binary encoding
Binary encoding gives many possible chromosomes even with a small number of alleles. On the other hand, this encoding is often not natural for many problems and sometimes corrections must be made after crossover and/or mutation.
Example of Problem: Knapsack problem
The problem: There are things with given value and size. The knapsack has given capacity. Select things to maximize the value of things in knapsack, but do not extend knapsack capacity.
Encoding: Each bit says, if the corresponding thing is in knapsack.
Permutation Encoding
Permutation encoding can be used in ordering problems, such as travelling salesman problem or task ordering problem.
In permutation encoding, every chromosome is a string of numbers, which represents number in a sequence.
Example of chromosomes with permutation encoding
Permutation encoding is only useful for ordering problems. Even for this problems for some types of crossover and mutation corrections must be made to leave the chromosome consistent (i.e. have real sequence in it).
Example of Problem: Travelling salesman problem (TSP)
The problem: There are cities and given distances between them.Travelling salesman has to visit all of them, but he does not to travel very much. Find a sequence of cities to minimize travelled distance.
Encoding: Chromosome says order of cities, in which salesman will visit them.
Value Encoding
Direct value encoding can be used in problems, where some complicated value, such as real numbers, are used. Use of binary encoding for this type of problems would be very difficult.
In value encoding, every chromosome is a string of some values. Values can be anything connected to problem, form numbers, real numbers or chars to some complicated objects.
Example of chromosomes with value encoding
Value encoding is very good for some special problems. On the other hand, for this encoding is often necessary to develop some new crossover and mutation specific for the problem.
Example of Problem: Finding weights for neural network
The problem: There is some neural network with given architecture. Find weights for inputs of neurons to train the network for wanted output.
Encoding: Real values in chromosomes represent corresponding weights for inputs.
Tree Encoding
Tree encoding is used mainly for evolving programs or expressions, for genetic programming.
In tree encoding every chromosome is a tree of some objects, such as functions or commands in programming language.
Example of chromosomes with tree encoding
Tree encoding is good for evolving programs. Programing language LISP is often used to this, because programs in it are represented in this form and can be easily parsed as a tree, so the crossover and mutation can be done relatively easily.
Example of Problem: Finding a function from given values
The problem: Some input and output values are given. Task is to find a function, which will give the best (closest to wanted) output to all inputs.
Encoding: Chromosome are functions represented in a tree.
VIII. Crossover and Mutation
Introduction
Crossover and mutation are two basic operators of GA. Performance of GA very depends on them. Type and implementation of operators depends on encoding and also on a problem.
There are many ways how to do crossover and mutation. In this chapter are only some examples and suggestions how to do it for .
Binary Encoding
Crossover
Single point crossover - one crossover point is selected, binary string from beginning of chromosome to the crossover point is copied from one parent, the rest is copied from the second parent
11001011+11011111 = 11001111
Two point crossover - two crossover point are selected, binary string from beginning of chromosome to the first crossover point is copied from one parent, the part from the first to the second crossover point is copied from the second parent and the rest is copied from the first parent
11001011 + 11011111 = 11011111
Uniform crossover - bits are randomly copied from the first or from the second parent
11001011 + 11011101 = 11011111
Arithmetic crossover - some arithmetic operation is performed to make a new offspring
11001011 + 11011111 = 11001001 (AND)
Mutation
Bit inversion - selected bits are inverted
11001001 => 10001001
Permutation Encoding
Crossover
Single point crossover - one crossover point is selected, till this point the permutation is copied from the first parent, then the second parent is scanned and if the number is not yet in the offspring it is added
Note: there are more ways how to produce the rest after crossover point
(1 2 3 4 5 6 7 8 9) + (4 5 3 6 8 9 7 2 1) = (1 2 3 4 5 6 8 9 7)
Mutation
Order changing - two numbers are selected and exchanged
(1 2 3 4 5 6 8 9 7) => (1 8 3 4 5 6 2 9 7)
Value Encoding
Crossover
All crossovers from can be used
Mutation
Adding a small number (for real value encoding) - to selected values is added (or subtracted) a small number
(1.29 5.68 2.86 4.11 5.55) => (1.29 5.68 2.73 4.22 5.55)
Tree Encoding
Crossover
Tree crossover - in both parent one crossover point is selected, parents are divided in that point and exchange part below crossover point to produce new offspring
Mutation
Changing operator, number - selected nodes are changed