TSP Resolver

TSP Resolver is an operations research project that solves the Traveling Salesman Problem with real-world constraints, implements 8 metaheuristic algorithms, and provides statistical comparison of their performance across 24 instance sizes (N from 5 to 3000 nodes). It supports forbidden nodes, precedence constraints, and variable toll costs, with results validated on multiple random seeds.

Problem Statement

The TSP consists of finding the shortest route that visits a set of cities exactly once and returns to the starting point. This project extends the classical TSP with three real-world constraints: forbidden vertices (roads under construction), precedence constraints (delivery dependencies), and toll costs (highway fees). The goal is to minimize total cost = distance + tolls + penalties.

Architecture

The project is organized into three layers: data layer (graph generation and instance management), algorithm layer (8 metaheuristic implementations), and evaluation layer (cost calculation and constraint validation).

Graph generation produces random instances with configurable parameters: number of nodes (5-3000), forbidden vertex ratio (5%), precedence constraint ratio (15%), and toll edge ratio (20%). Each instance is validated to ensure feasibility before being passed to solvers.

My contribution: I implemented the Taboo Search family (both standard and 2-opt variants), Multi-start Simulated Annealing, and the statistical analysis framework for comparing algorithms across instance sizes.

Eight algorithms were implemented:

Nearest Neighbor (Plus Proche Voisin): Greedy construction heuristic, builds tour by always picking the closest unvisited node. Fast but greedy, does not explore alternatives.
Taboo Search with 2-opt: Local search with memory plus edge-swap improvement. Most scalable: gap drops from 250% (N=10) to 250% (N=1000) as instance size grows.
Taboo Search: Local search with memory, avoids revisiting recently explored solutions via a tabu list. Good quality but plateaus at 500% gap.
Simulated Annealing (Recuit Simulé): Stochastic relaxation, accepts worse solutions with declining probability. Peaks at 700% gap (N=200), then improves to 500%.
Multi-start Simulated Annealing: Runs SA multiple times with different seeds. Best scalability: gap drops from 250% to 150% by N=3000.
Genetic Algorithm (Algorithme Génétique): Evolutionary approach, populations evolve through selection, crossover, and mutation. Does not scale well: gap exceeds 2000% for N > 200.
Ant Colony (Colonie de Fourmis): Swarm intelligence algorithm, agents deposit pheromones on edges. Gap increases to 750% at N=200.
Multi-start Hill Climbing: Local improvement from multiple random starts. Peaks early at 700% gap (N=50), does not improve for larger instances.

Design patterns: Strategy for algorithm selection, Factory for instance generation, Observer for real-time progress tracking, Template Method for shared evaluation logic.

Comparative Analysis

The experiment runs across 24 instance sizes from N=5 to N=3000, with 5 random seeds per instance. Results show gap (percentage above lower bound) and execution time (seconds per instance). Total: approximately 2400 solutions evaluated per algorithm, 19000+ total solutions.

Algorithm	Gap at N=10 to 75	Gap at N=200 to 3000	Execution Time Range	Scalability
Nearest Neighbor	300-700%	700%+	0.00008s to 100s	Poor
Taboo Search 2-opt	250-550%	250% (at N=1000)	0.015s to 250s	Excellent
Taboo Search	250-500%	500% (plateau)	0.018s to 50s	Good
Simulated Annealing	250-700%	500% (at N=250)	0.08s to 70s	Moderate
Multi-start SA	250-650%	150% (at N=3000)	0.12s to 100s	Best
Genetic Algorithm	350% to 2000%+	2000%+	0.5s to 150s	Poor
Ant Colony	250-750%	750%+ (at N=200)	0.25s to 60s	Poor
Multi-start HC	350-700%	N/A (peaks at N=50)	0.0035s to 70s	Poor

Multi-start Simulated Annealing and Taboo Search with 2-opt are the only algorithms that improve relative to the lower bound as instance size grows. The genetic algorithm and ant colony do not scale beyond small instances.

What I Learned

Algorithm selection depends on instance size: For N smaller than 50, nearest neighbor is a reasonable baseline. For N larger than 100, only Multi-start SA and Taboo 2-opt improve.
Parameter tuning matters more than algorithm choice: Simulated annealing sensitivity to cooling schedule had more impact than switching algorithms.
Genetic algorithm needs better operators: The ordered crossover did not preserve good sub-tours. Would try partially mapped crossover (PMX) instead.
Multi-start is underused: Running N times with different seeds costs little but significantly reduces variance.

Key Tradeoffs

Quality vs Time: Taboo Search 2-opt achieves the best quality-to-time ratio for large instances (N > 100). Nearest neighbor is fastest but quality degrades rapidly.

Scalability: Only Multi-start SA and Taboo Search 2-opt show improving gap as N increases. All other algorithms degrade or plateau.

Memory: Ant Colony stores a full pheromone matrix (O(N squared) memory). Taboo Search maintains a tabu list that grows over time. For large instances, memory becomes a bottleneck.

Parameter Sensitivity: Simulated annealing cooling schedule dramatically affects results. Implemented adaptive cooling that adjusts based on instance size.

If I Redid This

Parallelize across cores: Ant colony and genetic algorithm are embarrassingly parallel — would use multiprocessing.Pool
Add branch-and-bound for ground truth: Small instances (N smaller than 15) need optimal solutions to validate algorithm quality
Auto-select algorithm: Train a simple model to predict best algorithm based on instance features (N, edge density, constraint ratios)
Add hybrid approach: Use nearest neighbor for initial solution, then refine with Taboo 2-opt

Limitations and Future Work

Exact solver not implemented: Branch and bound would guarantee optimal solutions for small instances, useful as ground truth for algorithm comparison.

No parallelization: All algorithms run single-threaded. Ant colony and genetic algorithm embarrassingly parallelize across multiple cores.

Static instances: Graphs are generated once before solving. Dynamic graph updates (traffic, road closures) would better model real-world logistics.

Hey , I'm Clément.