Traveling Salesman Problem Solver

CS Fundamentals

Implement multiple algorithms to solve the Traveling Salesman Problem, demonstrating different approaches to finding optimal routes.

Mar 2023
University of Michigan

Objective

The objective of this project is to design and implement an efficient solution for the Traveling Salesman Problem (TSP) using an exact algorithm. The goal is to find the shortest possible route that visits a set of points exactly once and returns to the starting point, while ensuring computational feasibility for small to mid-sized datasets.

Key Achievements

  • Implemented three distinct TSP algorithms with different complexity-optimality trade-offs
  • Achieved optimal solutions for small instances using sophisticated branch-and-bound pruning
  • Developed efficient heuristic solutions for larger problem instances
  • Applied graph theory concepts including MST relationships and edge cost analysis
  • Optimized memory usage and computational efficiency for large-scale problems

Technologies Used

C++AlgorithmsData StructuresGraph TheoryDynamic ProgrammingBranch and BoundMinimum Spanning TreesHeuristics

Theory & Approach

The goal of this project was to solve the Traveling Salesman Problem (TSP) — finding the shortest possible tour that visits each point exactly once and returns to the start. Given the TSP's NP-hard nature, especially in larger instances, I combined exact methods with heuristic preprocessing for both correctness and performance.<br/> I implemented: - An exact solution using Branch-and-Bound with a lower bound based on Minimum Spanning Trees (MST). - A fast heuristic estimator (Arbitrary Insertion TSP) to provide an initial upper bound. - A priority-queue optimized Prim's MST algorithm for efficient lower-bound calculation. This hybrid approach allowed the solver to prune large parts of the search space and compute optimal solutions for reasonably sized inputs while maintaining numerical stability and modular design.

Technical Deep Dive

**Branch and Bound**: Recursively permutes the path, fixing the first node and exploring all valid tours. A partial tour is expanded only if its cost plus a lower-bound estimate is less than the current best cost. **Lower Bound Calculation**: - The algorithm computes a **partial tour cost**. - Then adds the **MST of unvisited nodes** for a tight admissible estimate. - Additional cost is added via shortest edges from the current path's head and tail to the unvisited set (`arm1 + arm2`). **Prim’s Algorithm (Optimized)**: Used to calculate the MST lower bound for the unvisited nodes efficiently using a priority queue (`O(n log n)`). **Initial Upper Bound**: A fast **arbitrary insertion heuristic** provides a tour close to optimal in linearithmic time, helping prune early branches. **Euclidean Distance Matrix**: All pairwise distances are precomputed and stored in a matrix for O(1) access during the algorithm.

Challenges & Solutions

**Efficient Pruning**: Initially, too many permutations were being explored. I resolved this by tightening the lower bound using MSTs and connection heuristics. **Floating Point Instability**: Summing many small distances led to slight rounding errors in deeper recursive branches. I addressed this by consistently using `double` precision and carefully managing when square roots were applied (e.g., only at total weight stages, not MST internals). **Insertion Heuristic Bias**: The naive greedy insertion sometimes produced high-cost tours. I improved it by testing different insertion points and preserving the best position at each step, reducing total bias. **Performance Bottlenecks**: In larger input sets (V > 11), MST computation became the bottleneck. I used a priority queue version of Prim’s algorithm to reduce unnecessary scans.

Review

**Branch-and-bound becomes powerful only with a good bounding strategy**. The MST + arm-edge trick was critical in reducing the exponential search space. **Hybrid algorithms beat pure ones in practice**. Using a heuristic to guide exact search strikes a strong balance between speed and correctness. **Algorithm design is as much about memory layout and access patterns** as it is about theoretical structure. Precomputing and caching distances gave significant speedups. **Clean structuring of code into modular functions (MST, bounding, permutation)** made debugging and optimization much easier later on.

Future Improvements

**Parallelization**: The permutation generation could be parallelized at the first few levels of recursion to leverage multi-core processors. **Better Heuristics**: Replace arbitrary insertion with **2-opt** or **Christofides algorithm** for tighter initial bounds and improved early pruning. **Visualization**: Adding graphical output (SVG or live canvas) would make debugging and interpreting the tour much easier for end-users. **Input Generalization**: Currently assumes Euclidean distances — it could be extended to work with road networks, weighted graphs, or real-world data. **Scalability**: For very large datasets (n > 50), integrate metaheuristics like Genetic Algorithms or Simulated Annealing as fallbacks.

Code Snippets

TSP Solver: Branch-and-Bound with MST Heuristic

These excerpts showcase the key components of an exact Traveling Salesman Problem (TSP) solver using Branch-and-Bound with MST-based pruning.

void prim_mst(std::vector<Coordinate_MST>& coordinates, double& weight_tot) {
    coordinates[0].distance = 0; // Start with first vertex

    std::priority_queue<std::pair<double, size_t>,
                        std::vector<std::pair<double, size_t>>,
                        Compare> min_heap;

    min_heap.emplace(0.0, 0); // Push the starting vertex

    while (!min_heap.empty()) {
        auto [curr_dist, curr_idx] = min_heap.top();
        min_heap.pop();

        if (coordinates[curr_idx].visited) continue;

        coordinates[curr_idx].visited = true;
        weight_tot += sqrt(curr_dist); // Accumulate MST weight (sqrt if using squared distance)

        // Try to update distances to unvisited neighbors
        for (size_t i = 0; i < coordinates.size(); i++) {
            if (!coordinates[i].visited) {
                double weight = euclidean_distance_MST(coordinates[curr_idx], coordinates[i]);
                if (coordinates[i].distance > weight && travel(coordinates[curr_idx], coordinates[i])) {
                    coordinates[i].distance = weight;
                    coordinates[i].prev = static_cast<long>(curr_idx);
                    min_heap.emplace(weight, i);
                }
            }
        }
    }
}

std::vector<size_t> arbitrary_insertion_tsp(std::vector<Coordinate_FAST>& coords, double& weight_tot) {
    std::vector<size_t> path;
    path.reserve(coords.size() + 1); // Reserve space to avoid reallocations

    // Start path with nodes 0 → 1
    path.push_back(0);
    weight_tot += euclidean_distance_FAST(coords[0], coords[1]);
    path.push_back(1);

    // Iteratively insert remaining nodes at the position with least increase in total distance
    for (size_t i = 2; i < coords.size(); ++i) {
        const auto& curr = coords[i];
        size_t best_pos = 0;
        double best_dist = std::numeric_limits<double>::max();

        // Try all insertion positions between existing path points
        for (size_t j = 0; j < path.size() - 1; ++j) {
            double dist1 = euclidean_distance_FAST(coords[path[j]], curr);
            double dist2 = euclidean_distance_FAST(curr, coords[path[j + 1]]);
            double dist3 = euclidean_distance_FAST(coords[path[j]], coords[path[j + 1]]);
            double delta = dist1 + dist2 - dist3; // Insertion cost

            if (delta < best_dist) {
                best_dist = delta;
                best_pos = j + 1;
            }
        }

        path.insert(std::next(path.begin(), best_pos), i);
        weight_tot += best_dist;
    }

    // Close the loop by returning to the start
    weight_tot += euclidean_distance_FAST(coords[path.back()], coords[path[0]]);
    path.push_back(path[0]);

    return path;
}

class OptTSP {
public:
    OptTSP(const std::vector<Point>& pts) : points(pts), V(pts.size()) {
        best_cost = std::numeric_limits<double>::max();
        path.resize(V);
        for (size_t i = 0; i < V; ++i) path[i] = i; // Initial path: [0, 1, 2, ..., V-1]
        compute_dist_matrix(); // Fill distance matrix
        estimate_upper_bound(); // Initial guess (linear path)
    }

    void solve() {
        branch_and_bound(1); // Start branch and bound from second position (first fixed at 0)
    }

    double get_cost() const { return best_cost; }
    const std::vector<size_t>& get_path() const { return best_path; }

private:
    std::vector<Point> points;
    std::vector<size_t> path, best_path;
    std::vector<std::vector<double>> dist_matrix;
    double best_cost;
    size_t V;

    // Precompute pairwise distances
    void compute_dist_matrix() {
        dist_matrix.resize(V, std::vector<double>(V));
        for (size_t i = 0; i < V; ++i)
            for (size_t j = 0; j < V; ++j)
                dist_matrix[i][j] = (i == j) ? 0 : dist(points[i], points[j]);
    }

    // Use naive path as an initial upper bound (e.g., 0 → 1 → ... → n-1 → 0)
    void estimate_upper_bound() {
        best_cost = 0;
        for (size_t i = 0; i < V - 1; ++i)
            best_cost += dist_matrix[i][i + 1];
        best_cost += dist_matrix[V - 1][0];
    }

    // Prune using MST + closest connections heuristic
    bool promising(size_t perm_len) {
        double partial = 0;
        for (size_t i = 0; i < perm_len - 1; ++i)
            partial += dist_matrix[path[i]][path[i + 1]];

        double mst_weight = mst(perm_len); // MST of unvisited
        double connect_start = min_edge(path[0], perm_len); // From start to unvisited
        double connect_end = min_edge(path[perm_len - 1], perm_len); // From current end to unvisited

        return partial + mst_weight + connect_start + connect_end < best_cost;
    }

    // Find shortest edge from a fixed node to unvisited nodes
    double min_edge(size_t from, size_t start) {
        double min_w = std::numeric_limits<double>::max();
        for (size_t i = start; i < V; ++i)
            min_w = std::min(min_w, dist_matrix[from][path[i]]);
        return min_w;
    }

    // Compute MST weight for unvisited nodes using Prim's algorithm
    double mst(size_t start) {
        std::vector<double> key(V, std::numeric_limits<double>::max());
        std::vector<bool> in_mst(V, false);
        key[path[start]] = 0;
        double total = 0;

        for (size_t count = start; count < V; ++count) {
            double min_k = std::numeric_limits<double>::max();
            size_t u = -1;

            // Select the minimum weight unvisited vertex
            for (size_t i = start; i < V; ++i) {
                if (!in_mst[path[i]] && key[path[i]] < min_k) {
                    min_k = key[path[i]];
                    u = path[i];
                }
            }

            in_mst[u] = true;
            total += key[u];

            // Update keys
            for (size_t i = start; i < V; ++i) {
                size_t v = path[i];
                if (!in_mst[v] && dist_matrix[u][v] < key[v])
                    key[v] = dist_matrix[u][v];
            }
        }
        return total;
    }

    // Recursive backtracking with branch-and-bound pruning
    void branch_and_bound(size_t perm_len) {
        if (perm_len == V) {
            // Full path constructed; compute tour cost
            double total = 0;
            for (size_t i = 0; i < V - 1; ++i)
                total += dist_matrix[path[i]][path[i + 1]];
            total += dist_matrix[path[V - 1]][path[0]]; // Return to start

            // Update best solution if better
            if (total < best_cost) {
                best_cost = total;
                best_path = path;
            }
            return;
        }

        if (!promising(perm_len)) return; // Prune

        // Generate permutations recursively
        for (size_t i = perm_len; i < V; ++i) {
            std::swap(path[perm_len], path[i]);
            branch_and_bound(perm_len + 1);
            std::swap(path[perm_len], path[i]); // Backtrack
        }
    }
};

TSP Algorithm Visualizer

TSP Algorithm Visualizer
💡 Tip: Use 5 cities to see the clearest difference between FASTTSP (heuristic) and OPTTSP (optimal) algorithms!

Algorithm Comparison

MST (Approximation)
• Builds tree, not tour
• Fast: O(n²)
• Not optimal
FASTTSP (Heuristic)
• Builds complete tour
• Medium: O(n²)
• Good approximation
OPTTSP (Optimal)
• Finds best solution
• Slow: O(n!)
• Guaranteed optimal
MST Approximation
Approximation

Builds a tree connecting all cities (not a complete tour)

O(n²)
FASTTSP
Heuristic

Greedy insertion builds complete tour step by step

O(n²)
OPTTSP
Optimal

Systematically explores all possible permutations

O(n!)