# Additional Dimensions to the Study of Funnels in Combinatorial Landscapes

The dataset contains landscape data for "Additional Dimensions to the Study of Funnels in Combinatorial Landscapes", G. Ochoa, N. Veerapen. The 2016 Genetic and Evolutionary Computation Conference (GECCO 2016), 20-24 July 2016, Denver, Colorado, USA. The dataset describes the network structure of the local optima networks for the eight Traveling Salesman Problem instances that are sampled in the paper.

The data are organised into eight folders, one for each of the eight instances analysed in the paper: att532, d657, gr666 and u574 (from TSPLIB, http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/) and C570.0, C670.0, E570.0 and E670.0 (included in the zip file and generated using the DIMACS TSP instance generator, http://dimacs.rutgers.edu/Challenges/TSP/download.html).

## How to interpret the data

The networks for each instance are described by five plain text files with the sols, nodes, edges, edge_history and node_history file name extensions. 

The letter and number pairs in the file names indicate the flags that were used to run the linkern program provided in Concorde (http://www.math.uwaterloo.ca/tsp/concorde.html). Flag s is the random seed used. R defines the number of kicks. r defines the number of runs. Flag z is a custom flag that defines how many double bridge kicks are applied in succession (in the paper this corresponds to parameter p).

Some flags are not included in the file names. The initial solution alternates from one run to the next between a random solution and a solution built with the quick Boruvka method. This is equivalent to alternating between the existing 0 (random) and 4 (quick Boruvka) options for the I flag. A kick is computed using the random walk method and corresponds to flag K with value 3.

Only solutions that strictly improved on the previous solution in the run were recorded; failed escape attempts were discarded.

The sols file contains the list of the solutions and each line corresponds to a unique solution. The file is split into two tab-delimited columns:
ID - an integer ID of the solution (same as in the nodes file)
SOLUTION - a comma separated representation of the solution where the cities are indexed according to their order of appearance in the TSPLIB instance starting from 0
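
As an illustration of the sols layout described above, the following Python sketch reads the two columns into a dictionary and checks that each solution is a permutation of the 0-based city indices. The function names `read_sols` and `is_valid_tour` are hypothetical, the sample data is made up, and the code assumes that a header row, if present, can be recognised by a non-numeric first field.

```python
import io

def read_sols(stream):
    """Parse sols rows (ID <tab> SOLUTION) into {solution_id: [city, ...]}."""
    tours = {}
    for line in stream:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2 or not fields[0].strip().isdigit():
            continue  # assumption: blank lines and any header row are skipped
        cities = [int(c) for c in fields[1].split(",") if c.strip()]
        tours[int(fields[0])] = cities
    return tours

def is_valid_tour(tour):
    """A tour over n cities should visit each index 0..n-1 exactly once."""
    return sorted(tour) == list(range(len(tour)))

# Made-up sample with four cities; real files hold one line per local optimum.
sample = io.StringIO("0\t0,2,1,3\n1\t0,1,2,3\n")
tours = read_sols(sample)
print(all(is_valid_tour(t) for t in tours.values()))
```

To read a real file, pass an open file object instead of the in-memory sample.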

The nodes file contains the list of the nodes in the local optima network and each line corresponds to a unique solution. The file is split into two tab-delimited columns:
ID - an integer ID of the solution in the network (same as in the sols file)
FITNESS - the fitness of the solution

The edges file contains the list of directed edges in the local optima network and each line corresponds to a unique edge. The file is split into three tab-delimited columns:
ID_START - the ID of the node starting the edge
ID_END - the ID of the node ending the edge
COUNT - the number of times this edge was observed during the sampling procedure
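
The nodes and edges files together define a weighted directed graph. A minimal Python sketch of loading both is given below; the helper names (`data_rows`, `read_nodes`, `read_edges`) and the sample values are hypothetical, fitness is parsed as a float as a precaution, and rows whose first field is not an integer are assumed to be headers and skipped.

```python
import io
from collections import defaultdict

def data_rows(stream):
    """Yield the tab-split fields of every row whose first field is an integer."""
    for line in stream:
        fields = line.rstrip("\r\n").split("\t")
        if fields[0].strip().lstrip("-").isdigit():
            yield fields

def read_nodes(stream):
    """Map node ID -> FITNESS."""
    return {int(f[0]): float(f[1]) for f in data_rows(stream)}

def read_edges(stream):
    """Map ID_START -> {ID_END: COUNT}."""
    out = defaultdict(dict)
    for f in data_rows(stream):
        out[int(f[0])][int(f[1])] = int(f[2])
    return out

# Made-up sample data, including a header row that is skipped.
nodes = read_nodes(io.StringIO("ID\tFITNESS\n0\t27700\n1\t27686\n2\t27690\n"))
edges = read_edges(io.StringIO("0\t1\t3\n0\t2\t1\n2\t1\t2\n"))

# Total number of observed transitions leaving each node.
out_strength = {n: sum(edges.get(n, {}).values()) for n in nodes}
print(out_strength)
```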

The edge_history file contains a summary of each iteration of the sampling procedure. The file is split into five tab-delimited columns:
RUN - the number of the run
ITER - the number of the iteration within the run
ID_START - the ID of the node starting the edge (the run)
ID_END - the ID of the node ending the edge (the run), the value is -1 if the solution does not have a strictly improved fitness
NUM_KICKS - the number of double bridge kicks used during this iteration
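
Because ID_END is -1 for iterations that did not strictly improve the fitness, the edge_history file can be used to separate improving steps from failed ones. The Python sketch below shows one way to do this; `read_edge_history` is a hypothetical helper, the sample rows are made up, and all five columns are assumed to hold integers.

```python
import io
from collections import Counter

def read_edge_history(stream):
    """Return (RUN, ITER, ID_START, ID_END, NUM_KICKS) tuples of integers."""
    rows = []
    for line in stream:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 5:
            continue  # blank or incomplete line
        try:
            rows.append(tuple(int(f) for f in fields[:5]))
        except ValueError:
            continue  # assumption: a non-numeric row is a header
    return rows

# Made-up sample: two runs with two iterations each.
sample = io.StringIO(
    "0\t0\t5\t-1\t1\n"
    "0\t1\t5\t8\t1\n"
    "1\t0\t9\t-1\t1\n"
    "1\t1\t9\t-1\t1\n"
)
history = read_edge_history(sample)

# Iterations per run that did not lead to a strictly better solution.
failed = Counter(run for run, _it, _start, end, _kicks in history if end == -1)
# (ID_START, ID_END) pairs of the improving iterations.
improving = [(start, end) for _run, _it, start, end, _kicks in history if end != -1]
print(failed, improving)
```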

The node_history file contains a summary of each run in the sampling procedure. The file is split into at least four tab-delimited columns:
RUN - the number of the run
START - the ID of the first node in the run
GLOBAL_FOUND - 1 if at least one global optimum was found in the run, 0 otherwise
BEST_FITNESS - fitness of the best solution found in the run
NUM_BEST - number of solutions that have the best fitness found in the run
BEST_NODES - comma separated IDs of the nodes that have the best fitness found in the run
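
The per-run summary in node_history can be read as sketched below in Python. Since the file is described as having at least four columns, this sketch treats NUM_BEST and BEST_NODES as optional trailing fields, which is an assumption; `read_node_history`, the dictionary keys and the sample rows are likewise hypothetical.

```python
import io

def read_node_history(stream):
    """Return one dictionary per run of the sampling procedure."""
    runs = []
    for line in stream:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 4 or not fields[0].strip().isdigit():
            continue  # assumption: blank lines and any header row are skipped
        best_nodes = []
        if len(fields) > 5:
            best_nodes = [int(n) for n in fields[5].split(",") if n.strip()]
        runs.append({
            "run": int(fields[0]),
            "start": int(fields[1]),
            "global_found": fields[2].strip() == "1",
            "best_fitness": float(fields[3]),
            "num_best": int(fields[4]) if len(fields) > 4 else None,
            "best_nodes": best_nodes,
        })
    return runs

# Made-up sample: run 0 misses the global optimum, run 1 finds it.
sample = io.StringIO("0\t5\t0\t27690\t1\t2\n1\t9\t1\t27686\t2\t1,7\n")
runs = read_node_history(sample)

# Fraction of runs in which at least one global optimum was found.
success_rate = sum(r["global_found"] for r in runs) / len(runs)
print(success_rate)
```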