Claus

It is time machines learned something and helped humanity - don’t you think? Luckily, there are ways to teach them. We’ll be talking about:

  • Genetic Algorithms
  • The travelling salesman problem and how to solve it
  • PMX - the partially mapped crossover

Engineers traditionally don’t understand how salespeople operate - even though we share a fundamental problem: the travelling salesman problem.

The Travelling Salesman

Salesmen have always been on the road, it seems - long enough for this fairly old (mathematical) problem to exist. It all starts with a simple premise that is computationally very expensive to answer: “Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?” (from Wikipedia).

With a map you can figure this out in minutes, or even seconds - right? Computers are not as smart though. Mathematically speaking, in order to find the best possible solution one would have to evaluate every possible route - effectively running the distance calculation from each city to each remaining city. At every stop, the remaining cities multiply the number of possible continuations, so the number of routes grows factorially.

Therefore the computational complexity, a measure of how expensive an algorithm is in terms of its “main growth factor” (in this case the number of cities), is O(n!). These are the numbers of distance calculations required:

Cities | # of calculations
-------|------------------
1      | 1
2      | 2
3      | 6
4      | 24
5      | 120
6      | 720
7      | 5040
8      | 40320
9      | 362880
10     | 3628800
11     | 39916800
12     | 479001600
13     | 6227020800
14     | 87178291200
15     | 1307674368000

It would take about 42 years if each calculation took a millisecond - clearly too long for only 15 cities! The solution is to do fewer calculations, and metaheuristics are a group of algorithms that let you skip the majority of them, intelligently.
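As a quick sanity check on that number, here is a sketch of the arithmetic, assuming exactly one millisecond per calculation:

```javascript
// 15 cities means 15! distance calculations, at one millisecond each
function factorial(n) {
  let result = 1;
  for (let i = 2; i <= n; i++) result *= i;
  return result;
}

const ms = factorial(15);                     // 1,307,674,368,000 calculations
const years = ms / 1000 / 60 / 60 / 24 / 365; // milliseconds -> years
console.log(years.toFixed(1));                // roughly 41.5 years
```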

The problem itself is considered NP-hard and can be an abstraction for the likes of campaign planning, data processing order, and vehicle or graph routing. Each of these can have its own special cases, like missing or bad links between cities or one-way streets, making them very interesting problems to solve.

Best Mates

Genetic Algorithms are a problem-agnostic approach to randomly but systematically creating solutions, which are evaluated using a fitness function (see below). As previously explained, Genetic Algorithms produce good solutions by minimizing the cost of a tour with a creation-evaluation-selection cycle.

Genetic Code

For the permutation to work, the representation needs to be efficient so that the thousands of copies that will be created can be processed as quickly as possible. Hence the genome (representing a tour) is an array of indices into a global “city” array: [1, 2, 3, 4, 5] means to visit cities 1, 2, 3, 4, and 5 in exactly this order, with the underlying cities being this dictionary/object of tuples: { 1: (1, 1), 2: (2, 2), 3: (3, 3), 4: (4, 4), 5: (5, 5) }. Each genome is ranked by the number the fitness function generates.
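As a small sketch (with made-up coordinates), the representation could look like this in JavaScript:

```javascript
// The genome is an array of keys into a global city table;
// the coordinates here are placeholders for illustration only
const cities = { 1: [1, 1], 2: [2, 2], 3: [3, 3], 4: [4, 4], 5: [5, 5] };
const genome = [1, 2, 3, 4, 5]; // visit city 1, then 2, 3, 4, 5 - in this order

console.log(genome.map((key) => cities[key])); // the tour as coordinate pairs
```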

Fitness

The fitness is evaluated on the sum of the distances in a tour, calculated between each city and the next.


// City is assumed to be a tuple struct of two f32 coordinates,
// e.g. struct City(f32, f32)
fn distance(p1: &City, p2: &City) -> f32 {
    ((p1.0 - p2.0).powi(2) + (p1.1 - p2.1).powi(2)).sqrt()
}

This is a simplistic assumption of the absolute shortest distance between two cities (aka the Euclidean distance) and should not be relied upon in the real world. Reality is much more complex (unless you have a helicopter).
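For reference, here is a minimal JavaScript sketch of the same distance function plus the full tour fitness, assuming cities are [x, y] pairs and the tour returns to its origin as the problem statement requires:

```javascript
// Euclidean distance between two cities given as [x, y] pairs
function distance(p1, p2) {
  return Math.hypot(p1[0] - p2[0], p1[1] - p2[1]);
}

// Fitness of a tour: the sum of all leg distances, including the return leg
function tourLength(genome, cities) {
  let total = 0;
  for (let i = 0; i < genome.length; i++) {
    const from = cities[genome[i]];
    const to = cities[genome[(i + 1) % genome.length]]; // wraps back to the origin
    total += distance(from, to);
  }
  return total;
}

// A 3-4-5 triangle: the full tour is 3 + 4 + 5 = 12 units long
const cities = { 1: [0, 0], 2: [3, 0], 3: [3, 4] };
console.log(tourLength([1, 2, 3], cities)); // 12
```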

PMX - Partially Mapped Crossover

Although the TSP is a basic problem, it requires a special crossover operator. Any simple 1- or 2-point crossover produces duplicates, and that is not an option for our salesman. Therefore an ordered, permutation-preserving crossover is required, like the PMX crossover operator.

The goal is to create non-repeating permutations: two tours [1, 2, 3, 4, 5] and [2, 3, 4, 5, 1] cannot produce the offspring [1, 2, 3, 5, 1] (the result of a 1-point crossover). PMX solves that by choosing two crossover points at random (x1 and x2) that mark the area to be inherited. Values within that area are taken from one parent, the rest from the other. This initially produces duplicates - just like a 2-point crossover would.

In order to remove those duplicates, the values have to be replaced - but with what? PMX uses a “map” to find the replacements: every value inside the crossover zone becomes a key, and the value it displaced becomes that key’s value.
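To make the map concrete, here is a small worked example with hand-picked parents and a crossover zone from index 1 (inclusive) to 4 (exclusive):

```javascript
// Two parent tours
const s = [1, 2, 3, 4, 5];
const t = [3, 4, 5, 1, 2];

// Copy s, then overwrite the zone with t's values and record what they displaced
const offspring = [...s];
const map = {};
for (let i = 1; i < 4; i++) {
  map[t[i]] = s[i]; // the value t brought in -> the value it replaced
  offspring[i] = t[i];
}
// offspring is now [1, 4, 5, 1, 5] (duplicates!), map is { 4: 2, 5: 3, 1: 4 }

// Outside the zone, follow the map until the value is no longer a key
for (const i of [0, 4]) {
  while (offspring[i] in map) offspring[i] = map[offspring[i]];
}
console.log(offspring); // [2, 4, 5, 1, 3] - a valid permutation again
```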

When scanning for duplicates (outside the crossover zone) we can now check the “map” and see what value should be there - repeating until there is no more matching key. I realize that this is complex, so check out this video as well:

… or go all in on the original paper 😁.

Implementing PMX In Rust & JavaScript

Enough of the abstract concepts - let’s focus on explaining the algorithm in simpler terms. Instead of highly academic papers and more or less readable code, I present you with a finished solution and walk you through it. This should only take a couple of minutes out of your day, and you can optimize it further afterwards.

JavaScript

First, let’s talk about JavaScript and a framework called genometrics. The framework is quite simple and therefore easy to use, which was just what I needed: perfect for an understandable implementation of the PMX crossover.

This version produces two offspring, hence two maps are required. First, these maps are declared.

function x(s, t) {
  // create two maps, not necessary, but simpler
  let _map1 = {};
  let _map2 = {};

Next, these maps have to be filled - but only after the two crossover points have been chosen. As mentioned before, PMX places values from one parent into an offspring. The exchanged alleles lie between the two points x1 and x2.

  // choose two crossover points.
  const x1 = Math.floor(Math.random() * (s.length - 1));
  const x2 = x1 + Math.floor(Math.random() * (s.length - x1));

Afterwards the offspring arrays are created and the values set. Remember, s and t are parent 1 and 2.

  let offspring = [Array.from(s), Array.from(t)];

  for (let i = x1; i < x2; i++) {
    offspring[0][i] = t[i];
    _map1[t[i]] = s[i];

    offspring[1][i] = s[i];
    _map2[s[i]] = t[i];
  }

After this part, a child might look like this: [1, 1, 4, 5, 6, 2] - let’s fix that. The maps (as previously defined) provide a path for replacing the duplicates. Starting with the first part (0 -> x1) …

  for (let i = 0; i < x1; i++) {
    while (offspring[0][i] in _map1) {
      offspring[0][i] = _map1[offspring[0][i]];
    }
    while (offspring[1][i] in _map2) {
      offspring[1][i] = _map2[offspring[1][i]];
    }
  }

… and the second part (x2 -> end). Again, this is done for both offspring.

  for (let i = x2; i < s.length; i++) {
    while (offspring[0][i] in _map1) {
      offspring[0][i] = _map1[offspring[0][i]];
    }
    while (offspring[1][i] in _map2) {
      offspring[1][i] = _map2[offspring[1][i]];
    }
  }
  return offspring;
}

Afterwards it’s all tied together by a calling function. pop is genometrics’ representation of a population. In this case, each pair of adjacent population members acts as parents and produces two offspring.

function PMXCrossover(pop) {
  const p = pop.members;
  const rounds = p.length % 2 ? p.length - 1 : p.length;
  for (let i = 1; i < rounds; i += 2) {

    // send two parents (s & t) into the crossover
    const s = p[i].genome;
    const t = p[i - 1].genome;
    
    x(s, t).forEach((offspring) => {
      pop.members.push(new Citizen(offspring));
    });
  }
}

Rust

Now for my favorite language, Rust. The basic mechanism is the same - but two parents produce only a single offspring 😔. The framework I am using here is called RsGenetic, a great and easy-to-use crate.

After declaring the crossover function, I - for academic reasons, I guess - renamed the parents to s and t.

fn crossover(&self, other: &TspTour) -> TspTour {
    // PMX crossover
    let s = &self.tour;   
    let t = &other.tour;

Quick detour: rand doesn’t support fast cross-platform WASM as of now, which is why I modified RsGenetic to accept a custom random generator in a RefCell, in order to be able to use XorShiftRng, a pseudo-random number generator, with WASM.

    let mut rng = self.rng_cell.borrow_mut();

From there, x1 and x2 are generated, which are the crossover points for the offspring. Additionally, the mapping is initialized as a vector: since the genome’s values are indices, a vector works as a more efficient HashMap in this case.


    // The crossover points
    //   x1 |     | x2 (excl)
    // [ 1, 2, 3, 4, 5]
    let x1 = rng.gen_range(0, s.len() - 1);
    let x2 = rng.gen_range(x1, s.len());

    let mut offspring = vec![0; s.len()];

    // Thanks to an index-based genome we can use a vector here.
    // Use a HashMap when the genome is a string or object 
    let mut mapping: Vec<Option<usize>> = vec![None; s.len()];

Then the offspring’s inherited DNA is added and - as above - the map is filled and later applied. By using Rust’s Option type we can keep walking the map until no mapping is left.

    // Inheritance! This sets the values within the crossover zone
    for i in x1..x2 {
        offspring[i] = t[i];

        // Put the values that "should have been there" into the map.
        mapping[t[i]] = Some(s[i]);
    }

    // Scan for duplicates left of the crossover zone
    for i in 0..x1 {
        let mut o = mapping[s[i]];
        let mut last = None;

        // Keep iterating until there is no more mapping
        while o.is_some() {
            last = o;
            o = mapping[o.unwrap()];
        }
        offspring[i] = if let Some(v) = last { v } else { s[i] };
    }

The same goes for the part after the crossover section.

    // Scan for duplicates right of the crossover zone
    for i in x2..s.len() {
        let mut o = mapping[s[i]];
        let mut last = None;

        // same as above
        while o.is_some() {
            last = o;
            o = mapping[o.unwrap()];
        }
        offspring[i] = if let Some(v) = last { v } else { s[i] };
    }

    TspTour {
        tour: offspring,
        cities: self.cities.clone(),
    }
}

This is it! PMX isn’t too complicated, but when I searched for it there was no proper explanation to be found. I hope I could save you some time reading up on the specifics - you can download the entire code for Rust and JavaScript.

Thank you for your time!

This is another part of a larger theme that continues the WASM on Azure Functions series. Check out the other parts, and follow the blog’s RSS feed to receive updates directly.

Never miss any of my posts and follow me on Twitter, or even better, add the RSS feed to your reader!