[PYTHON] Try to solve the traveling salesman problem with a genetic algorithm (Theory)

Introduction

I tried to make a lot of hits by googled with "Traveling salesman problem genetic algorithm".

-Theory -Python code -Execution result

Overview

Traveling Salesman Problem

The ** Traveling Salesman Problem ** is the problem of finding the shortest route back to the original point, going through all $ N $ points once. The total number of routes that pass through all $ N $ points and return is $ (N-1)! / 2 $. If there are 3 points, there is one way. If there are 4 points, there are 3 ways. If there are 5 points, there are 12 ways. As the number of points increases, the total number of routes will increase explosively. It takes an enormous amount of time to find the shortest route among them.

However, if conditional, you can find the route in a practical amount of time.

For example, if you know that a point is at the apex of a convex polygon, the shortest path is that convex polygon. In addition, there are various search methods such as genetic algorithms, annealing methods, and self-organizing maps, as long as they are ** almost ** shortest.

Genetic algorithm

** Genetic algorithm ** is one of the search algorithms for combinatorial optimization problems, and has the motif of the evolution of living things. Since the traveling salesman problem is also one of the combination optimization problems, a genetic algorithm can be used. The procedure for searching for a genetic algorithm is as follows.

  1. First, prepare a certain number of combinations.
  2. Choose a number of pairs from existing combinations.
  3. Pairs exchange some of their combinations to create a new combination.
  4. Measure the goodness of the new combination.
  5. Leave a certain number of good combinations and discard the rest.
  6. Select a pair, discard it, and repeat until you are satisfied.

The genetic algorithm is likened to a living thing and is explained using the following words.

-** Individual ** ... Each combination. -** Chromosome ** ... A combination is expressed in an appropriate way. -** Gene ** ... It is a basic component of the chromosome. -** Goodness of fit ** ... A numerical value that measures the goodness of each individual to the problem. -** Generation ** ... A set of individuals. -** Crossing ** ... This is the procedure to generate a new individual's chromosome from the existing individual's chromosome. -** Mutation ** ... 1 Changes the chromosome of an individual to generate a new individual. -** Crossing ** ... Combining the chromosomes of two individuals to create a new individual. -** Selection ** ... Select individuals to be used for crossing from generations according to their fitness. -** Selection ** ... From the set of crossed individuals and the parent generation, discarding the individuals with low fitness and creating a certain number of individuals for the next generation.

Commentary by Shizuoka Institute of Science and Technology Suganuma Laboratory Getting started with the Genetic Algorithm! --SlideShare

Traveling salesman problem with genetic algorithm

Arrange the genetic algorithm for the traveling salesman problem.

Individual (salesman)

Let $ p_0, p_1, \ dots, p_N $ be the $ N + 1 $ points that the salesman passes through. The path starting from $ p_0 $, passing through all the points in the numerical order of the points and returning to $ p_0 $ can be expressed as $ p_0, p_1, \ dots, p_N, p_0 $. $ p_1, p_2, \ dots, p_N, p_0, p_1 $ are routes that start the same circle from $ p_1 $. Since the length of the looped route is the same regardless of which point is selected as the starting point, the starting point is fixed to $ p_0 $ and the route is expressed as $ p_ {i_1}, \ dots, p_ {i_N} $. Also, even if the whole is translated, the length of the path is the same, so set $ p_0 = O $ and move the other points in parallel. The path created in this way is an individual.

Fitness

For fitness, use the length of the path represented by the individual.

When the distance between two points $ p, p'$ is $ d (p, p') $, the length of the route $ p_ {i_1}, \ dots p_ {i_N} $ is $ d (p_0, p_ {i_1}) + d (p_0, p_ {i_N} + \ sum_ {k = 1} ^ {N-1} d (p_ {i_k}, p_ {i_ {k + 1}}) $. The smaller the length, the better the route. The larger the number, the better the fitness, so use the reciprocal length or sign inversion. This time I used sign inversion.

If the smaller the number, the better the fitness, use the length as it is.

Choice

At the time of crossing, instead of generating child generations from all individuals of the parent generation, ** selection ** is made so that individuals with higher fitness are more likely to have offspring.

Also, during selection, selection is made so that individuals with higher fitness are more likely to remain in the offspring.

There are three main choices. All have advantages and disadvantages, so use them in combination.

Elite selection

Select in descending order of fitness. Since excellent individuals remain, it converges quickly. On the other hand, the influence of excellent individuals in the early stages is strong, and with each generation there is a possibility that diversity will decrease and we will fall into a dead end of evolution (= local solution).

Roulette selection

Select based on the probability calculated from fitness. Genes with low fitness remain (although less likely), so diversity does not decrease. On the other hand, genes with high fitness may be discarded and no solution may be reached.

There are various ways to calculate the probability.

--Roulette probability ... $ (individual fitness) \ div (sum of generation fitness) $. This is effective when the fitness is positive and the variance is large. --Ranking probability ... Associates the fitness rank and probability with a ratio of 1: 1. It can be used even if the fitness is negative. The difference in fitness no longer corresponds to the difference in probability. --Function conversion ... Converts fitness to probability with a sigmoid function. Individuals with a higher fitness than the roulette probability are more likely to be selected.

Tournament selection

Randomly extract $ n $ individuals from the population and leave the one with the highest fitness. Repeat this until the required number of individuals is collected. Genes with low fitness remain stochastically, so diversity does not decrease much. On the other hand, genes with high fitness may be discarded and no solution may be reached.

Genes and chromosomes

I will introduce two ways of expressing (coding) the route.

Permutation Encoding

This is an expression method in which the dot numbers are arranged as they are. The genes are the dot numbers $ 1, \ dots, N $. Chromosomes are columns $ g_1, \ dots, g_N $ in which genes are arranged without duplication. If $ g_i = k $, it means passing through $ p_k $ in the $ i $ th. Below is an example of a route through 5 points and its permutation coding.

\{ p_4, p_1, p_3, p_5, p_2 \} \rightarrow 4, 1, 3, 5, 2

This representation always keeps $ g_i \ neq g_j (i \ neq j) $.

Ordinal coding

Ordinal coding is an arbitrary name. I didn't know the official name.

This is an expression method that uses the ordinal number of the point sequence excluding the passed points. The $ i $ th gene $ g_i $ takes up to $ N-i + 1 $, the number of remaining points. When there is a sequence of points $ P_1 = p_1, p_2, \ dots, p_N $, the procedure for determining $ g_i $ is as follows.

  1. Let the point sequence $ 1, 2, \ dots, N $ be $ P_1 $.
  2. Find out what position the first passing point $ p_i $ is in the point sequence $ P_1 $. If it is in the $ i'$ position, then $ g_1 = i'$. Create a sequence of points $ P_2 $ from $ P_1 $ minus $ p_i $.
  3. Find out what position the second passing point $ p_j $ is in the point sequence $ P_2 $. If it is in the $ j'$ position, then $ g_2 = j'$. Create a sequence of points $ P_3 $ from $ P_2 $ minus $ p_3 $.
  4. Continue in the same way.

Below is an example of a path through 5 points and its ordinal coding.

\{ p_4, p_1, p_3, p_5, p_2 \} \rightarrow 4, 1, 2, 2, 1 \\
\begin{array}{lll}
p_1, p_2, p_3, p_4, p_5 & -, -, -, -, - &Remove points while recording the position in the sequence\\
p_1, p_2, p_3, p_5 & 4, -, -, -, - & p_4 is the 4th\\
p_2, p_3, p_5 & 4, 1, -, -, - & p_1 is the first\\
p_2, p_5 & 4, 1, 2, -, - & p_3 is the second\\
p_2 & 4, 1, 2, 2, - & p_5 is the second\\
 & 4, 1, 2, 2, 1 & p_2 is the first
\end{array}

With this representation, $ g_i \ le N-i + 1 $ is always kept. Instead, the same number may appear in the chromosome ($ g_i = g_j (i \ neq j) $). The correspondence between the gene $ g_i $ and the dots is not intuitive and can only be found by examining $ g_1, \ dots, g_ {i-1} $.

Crossbreed

In crossing of organisms, it is common to exchange paired chromosomes or exchange corresponding sections between chromosomes. In addition, there are various mutations because there are no restrictions on the position of genes and the length of chromosomes is flexible.

Since the salesman's chromosome has coding-dependent restrictions, we define crossovers and mutations within that range.

――Since the restriction of permutation coding is that each gene is once, it is necessary to operate so that there is no duplication in the crossing of chromosomes (fortunately, smart people have devised crossings for genes in order. ). On the other hand, exchanging genes in a chromosome is easy to describe because it does not break the constraints. ――In ordinal coding, restrictions are determined by the position of genes, so it is easier to exchange genes at the same position between chromosomes or simply change genes, rather than exchanging gene positions. Can be described.

Permutation coding crossover

Cycle Crossover (CX)

From the two chromosomes, find the group with the same set of points and the same set of positions, and exchange the groups.

  1. Select the random gene $ q_1 $.
  2. When $ q_i $ is fixed, find $ q_i $ from parent 1 and set its position to $ k_ {i + 1} $.
  3. Let $ q_ {i + 1} $ be the point represented by the $ k_ {i + 1} $ th gene $ h_ {k_ {i + 1}} $ of parent 2.
  4. Exit if $ q_ {i + 1} = q_1 $. Otherwise it repeats for $ q_ {i + 1} $.
\begin{array}{cc}
Parent 1& 1\acute{2}345678 \\
Parent 2& 6\grave{7}852341
\end{array}
\rightarrow
\begin{array}{c}
1\acute{2}3456\acute{7}8 \\
6\grave{7}8523\grave{4}1
\end{array}
\rightarrow \dots \rightarrow
\begin{array}{c}
1\acute{2}3\acute{4}\acute{5}6\acute{7}8 \\
6\grave{7}8\grave{5}\grave{2}3\grave{4}1
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 1\hat{7}3\hat{5}\hat{2}6\hat{4}8 \\
Child 2& 6\hat{2}8\hat{4}\hat{5}3\hat{7}1
\end{array}

Partial Crossover

Choose a random position $ r $ and look at the genes $ {g_1} _r, {g_2} _r $ at that position in parent 1 and parent 2. The genes $ {g_1} _r and {g_2} _r $ are exchanged between parent 1 and parent 2, respectively. Repeat this several times to generate the chromosomes for child 1 and child 2.

\begin{array}{cc}
Parent 1& 12\acute{3}4567\grave{8} \\
Parent 2& 67\grave{8}52\acute{3}41
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 12\acute{8}4567\acute{3} \\
Child 2& 67\grave{3}52\grave{8}41
\end{array}

Oder Crossover (OX)

Parent 1 inherits the random section of the chromosome as is to the child. Parent 2 fills in the blanks by taking over the missing genes in the offspring so that they are not out of order.

\begin{array}{cc}
Parent 1& \acute{1}\acute{2}\acute{3}|456|\acute{7}\acute{8} \\
Parent 2& 6\grave{7}\grave{8}5\grave{2}\grave{3}4\grave{1}
\end{array}
\rightarrow
\begin{array}{cc}
Child& \grave{7}\grave{8}\grave{2}|456|\grave{3}\grave{1}
\end{array}

Uniform Based Crossover (OX2)

Determine some random positions $ r_1, \ dots, r_k $ and look at the genes $ g_ {r_1}, \ dots, g_ {r_k} $ at those positions in parent 1. Sort those genes in parent 2 $ g_ {r_1}, \ dots, g_ {r_k} $ in the order in parent 1 to generate chromosomes in child 1. Swap parent 1 and parent 2 and do the same to generate child 2.

\begin{array}{cc}
Parent 1& 1\acute{2}3\acute{4}\acute{5}67\acute{8} \\
Parent 2& 67\grave{8}\grave{5}\grave{2}3\grave{4}1
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 67\acute{2}\acute{4}\acute{5}3\acute{8}1
\end{array}

Uniform Based Crossover (POX)

Determine some random positions $ r_1, \ dots, r_k $. Child 1 inherits the genes $ g_ {r_1}, \ dots, g_ {r_k} $ at those positions from parent 1 as is. The missing genes are inherited from parent 2 in a consistent manner. Swap parent 1 and parent 2 and do the same to generate child 2.

\begin{array}{cc}
Parent 1& 1\acute{2}3\acute{4}\acute{5}67\acute{8} \\
Parent 2& \grave{6}\grave{7}852\grave{3}4\grave{1}
\end{array}
\rightarrow
\begin{array}{cc}
Child& \grave{6}\acute{2}\grave{7}\acute{4}\acute{5}\grave{3}\grave{1}\acute{8}
\end{array}

If there is only one child, uniform order crossing and uniform position crossing are equivalent.

Sub tour Exchange Crossover (SXX)

It searches two chromosomes for a section in which the same set of genes is lined up, and exchanges the sections to generate children 1 and 2.

\begin{array}{cc}
Parent 1& 1\acute{2}\acute{3}\acute{4}\acute{5}678 \\
Parent 2& 678\grave{5}\grave{3}\grave{2}\grave{4}1
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& 1\grave{5}\grave{3}\grave{2}\grave{4}678 \\
Child 2& 678\acute{2}\acute{3}\acute{4}\acute{5}1
\end{array}

Other crossovers

--Partial Mapped Crossover (PMX) --Edge Recombination Crossover (ERX) --Edge Assembly Crossover (EAX)

Permutation coding mutations

Swap

Genes are exchanged for each interval of two randomly selected intervals.

\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}|56|\grave{7}\grave{8}|
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1|\grave{7}\grave{8}|56|\acute{2}\acute{3}\acute{4}|
\end{array}

Translocation

Moves a randomly selected section to another position.

\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}\acute{5}|678
\end{array}
\rightarrow
\begin{array}{cc}
Child& 167|\acute{2}\acute{3}\acute{4}\acute{5}|8
\end{array}

Move position

Randomly select two genes, $ g_1, g_2 $, and move $ g_1 $ before $ g_2 $. A special form of translocation.

\begin{array}{cc}
parent& 1\acute{2}3456\grave{7}8
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1\grave{7}\acute{2}34568
\end{array}

Inversion

Inverts the order of genes for randomly selected intervals.

\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}|5678
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1|\acute{4}\acute{3}\acute{2}|5678
\end{array}

Scramble

Randomly rearrange the order of genes for randomly selected intervals.

\begin{array}{cc}
parent& 1|\acute{2}\acute{3}\acute{4}\acute{5}|678
\end{array}
\rightarrow
\begin{array}{cc}
Child& 1|\acute{4}\acute{2}\acute{5}\acute{3}|678
\end{array}

Ordinal coding crossover

Single Point Crossover

Two chromosomes are cut and exchanged at one random location.

\begin{array}{cc}
Parent 1& g_1 g_2 g_3 | g_4 g_5 g_6 \\
Parent 2& h_1 h_2 h_3 | h_4 h_5 h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 g_2 g_3 | h_4 h_5 h_6 \\
Child 2& h_1 h_2 h_3 | g_4 g_5 g_6
\end{array}

2 Points Crossover

Swap the sections where two chromosomes are cut at two random points.

\begin{array}{cc}
Parent 1& g_1 g_2 | g_3 g_4 g_5 | g_6 \\
Parent 2& h_1 h_2 | h_3 h_4 h_5 | h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 g_2 | h_3 h_4 h_5 | g_6 \\
Child 2& h_1 h_2 | g_3 g_4 g_5 | h_6
\end{array}

Multi Points Crossover

Swap the sections where two chromosomes are cut at several random points.

\begin{array}{cc}
Parent 1& g_1 | g_2 g_3 | g_4 g_5 | g_6 \\
Parent 2& h_1 | h_2 h_3 | h_4 h_5 | h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 | h_2 h_3 | g_4 g_5 | h_6 \\
Child 2& h_1 | g_2 g_3 | h_4 h_5 | g_6
\end{array}

Uniform Crossover

The genes of each of the two chromosomes are exchanged with an appropriate probability.

\begin{array}{cc}
Parent 1& g_1 g_2 g_3 g_4 g_5 g_6 \\
Parent 2& h_1 h_2 h_3 h_4 h_5 h_6
\end{array}
\rightarrow
\begin{array}{cc}
Child 1& g_1 h_2 g_3 h_4 h_5 g_6 \\
Child 2& h_1 g_2 h_3 g_4 g_5 h_6
\end{array}

Ordinal coding mutations

Substitution

Replaces a gene at a random position with another gene.

\begin{array}{cc}
parent& g_1 g_2 g_3 g_4 g_5 g_6
\end{array}
\rightarrow
\begin{array}{cc}
Child& g_1 h_2 g_3 h_4 g_5 g_6
\end{array}

Newly generated

It produces entirely new individuals, not crossovers or mutations.

in conclusion

I was planning to publish the program by briefly introducing the genetic algorithm / traveling salesman problem, but since the introduction of the genetic algorithm has become a long sentence, I will write the program in a separate article.

Recommended Posts

Try to solve the traveling salesman problem with a genetic algorithm (Theory)
Try to solve the traveling salesman problem with a genetic algorithm (Python code)
I tried "Implementing a genetic algorithm (GA) in python to solve the traveling salesman problem (TSP)"
Finding a solution to the N-Queen problem with a genetic algorithm (2)
Finding a solution to the N-Queen problem with a genetic algorithm (1)
Solve the traveling salesman problem with OR-Tools
Try to solve the fizzbuzz problem with Keras
Try to solve the internship assignment problem with Python
The story of the algorithm drawing a ridiculous conclusion when trying to solve the traveling salesman problem properly
Solving the traveling salesman problem with the genetic algorithm (GA) and its library (vcopt)
Solve the Python asymmetric traveling salesman problem with a branch and bound method
Solving the nurse scheduling problem (shift optimization) with a genetic algorithm
Try to solve the N Queens problem with SA of PyQUBO
I wanted to solve the ABC164 A ~ D problem with Python
Try to solve the Python class inheritance problem
Solve a simple traveling salesman problem using a Boltzmann machine with simulated annealing
Try to solve the man-machine chart with Python
How to try the friends-of-friends algorithm with pyfof
How to fix the initial population with a genetic algorithm using DEAP
Try to solve a set problem of high school math with Python
I tried to implement the traveling salesman problem
I tried to solve the problem with Python Vol.1
About the traveling salesman problem
[Python] Try optimizing FX systole parameters with a genetic algorithm
I tried to solve a combination optimization problem with Qiskit
A story about how to deal with the CORS problem
Try to model a multimodal distribution using the EM algorithm
About the Ordered Traveling Salesman Problem
Let's write a program to solve the 4x4x4 Rubik's Cube! 2. Algorithm
Find the optimal value of a function with a genetic algorithm (Part 2)
Solve the Python knapsack problem with a branch and bound method
Try to solve the shortest path with Python + NetworkX + social data
Solve the subset sum problem with a full search in Python
Try to solve the function minimization problem using particle swarm optimization
Want to solve a simple classification problem?
Stack problem: Try to solve "20. Valid Parentheses"
Try the Variational-Quantum-Eigensolver (VQE) algorithm with Blueqat
Search the maze with the python A * algorithm
[AtCoder] Solve ABC1 ~ 100 A problem with Python
Python: I tried the traveling salesman problem
The 19th offline real-time how to write reference problem to solve with Python
I want to solve the problem of memory leak when outputting a large number of images with Matplotlib
[AtCoder] Solve A problem of ABC101 ~ 169 with Python
Try to calculate a statistical problem in Python
A simple workaround for bots to try to post tweets with the same content
Try to make a "cryptanalysis" cipher with Python
Save the object to a file with pickle
Try to make a dihedral group with Python
Try creating a FizzBuzz problem with a shell program
Solving the Python knapsack problem with the greedy algorithm
I tried to solve the virtual machine placement optimization problem (simple version) with blueqat
A memo on how to overcome the difficult problem of capturing FX with AI
Try to create a waveform (audio spectrum) that moves according to the sound with python
The 15th offline real-time I tried to solve the problem of how to write with python
[Python] Try to read the cool answer to the FizzBuzz problem
Try to make a command standby tool with python
How to create a submenu with the [Blender] plugin
Try to solve the problems / problems of "Matrix Programmer" (Chapter 1)
Try to visualize the room with Raspberry Pi, part 1
I tried to solve the soma cube with python
[GCP] Try a sample to authenticate users with Firebase