Please use this identifier to cite or link to this item: http://hdl.handle.net/11667/108
Full metadata record
DC FieldValueLanguage
dc.contributorBrownlee, Alexander-
dc.contributor.otherEPSRC - Engineering and Physical Sciences Research Councilen_GB
dc.creatorBrownlee, Alexander E I-
dc.creatorWoodward, John R-
dc.creatorVeerapen, Nadarajen-
dc.date.accessioned2018-04-05T07:39:26Z-
dc.date.available2018-04-05T07:39:26Z-
dc.date.created2017-12-
dc.identifier.urihttp://hdl.handle.net/11667/108-
dc.description.abstractAutomatic Design of Algorithms (ADA) shifts the burden of algorithm choice and design from developer to machine. Constructing an appropriate solver from a set of problem instances becomes a machine learning problem, with instances as training data. An efficient solver is trained for unseen problem instances with similar characteristics to those in the training set. However, this paper reveals that, as with classification and regression, for ADA not all training sets are equally valuable. We apply a typical genetic programming ADA approach for bin packing problems to several new and existing public benchmark sets. Algorithms trained on some sets are general and apply well to most others, whereas some training sets result in highly specialised algorithms that do not generalise. We relate these findings to features (simple metrics) of instances. Using instance sets with narrowly-distributed features for training results in highly specialised algorithms, whereas those with well-spread features result in very general algorithms. We show that variance in certain features has a strong correlation with the generality of the trained policies. Our results provide further grounding for recent work using features to predict algorithm performance, and show the suitability of particular instance sets for training in ADA for bin packing. This dataset includes the raw experimental data, new benchmark instances and figures for all experiments in the paper.en_GB
dc.description.tableofcontentsData sets and figures for the paper "Relating Training Instances to Automatic Design of Algorithms for Bin Packing via Features"; Alexander E.I. Brownlee, John R. Woodward, Nadarajen Veerapen; Companion Proceedings of GECCO 2018, Kyoto Japan. figures/ is the full set of figures for the instance sets and footprints stirling-instances/ contains the generated instances mentioned in the paper. Dir names are stir_generated_binCapacity_itemSizeLowerBound_itemSizeUpperBound. Filenames are stir_binCapacity_numberOfItems_itemSizeLowerBound_itemSizeUpperBound Format of files are the same as the Falkenauer benchmarks: line1:number of items line2:volume of bins any addition lines: item size data-wide/ contains the results of GP runs. There are three types of files in here: - waescher_instances.txt is just a list of the instance files in a given set (Waescher in this case) - waescher_performance_0.txt is the output from a GP run for the named instance set (Waescher here). Line 1 is the best GP program found. The remainder is tab-separated data for each bin packing instance in the set; instanceSet instanceName(usually automatically generated) numberOfBins(required when using the GP evolved packing policy) - waescher_29_on_stir_generated_150_76_100_performance.txt is the result when running the policy that was trained on waescher (run 29), on the instances in the stir_generated_150_76_100 set. content is same as line above. metricsAndRawBinCountsAll-allbp-trained-footprints - comma separated file with all data used for the analysis. Each row is the data for bin bin packing instance. Columns are: - format: the file format - set: the instance set the instance belongs to - id: a unique ID - file: filename containing the instance - numberInFile: some of the files have more than one instance. This is the index within the file. - isVolumesAreInts-ItemListCompressionRatio: the static features for the instance - ScaledFit0_raw-ScaledFit10_raw: the performance features for the instance (number of bins required by the scaled fit policies) - ScaledFit0_rank-ScaledFit10_dist: scaled variants of the performance features (unused in the paper) - 2cbp_0-stir_generated_150_76_100_29: number of bins required for this instance when using the policy trained on the named set, in GP run number x - 2cbp_footprintCount-stir_generated_150_76_100_footprintCount: number of times this instances appears in the footprint of policies trained on the names instance set. Full details are given in the README.txt file. Dedicated UnZip software is recommended for accessing the dataset, for example, IZArc.en_GB
dc.language.isoengen_GB
dc.publisherUniversity of Stirling. Faculty of Natural Sciences.en_GB
dc.relationBrownlee, AEI; Woodward, JR; Veerapen, N (2018): Data for the paper "Relating training instances to Automatic Design of Algorithms for bin packing via features". University of Stirling. Faculty of Natural Sciences. Dataset. http://hdl.handle.net/11667/108en_GB
dc.relation.isreferencedbyBrownlee, A.E.I, Woodward, J.R. and Veerapen, N. (2018) Relating Training Instances to Automatic Design of Algorithms for Bin Packing via Features. In: Proceedings of GECCO 2018. Genetic and Evolutionary Computation Conference 2018, 15.07.2018-19.07.2018. New York: ACM, pp. 135-136. DOI: https://doi.org/10.1145/3205651.3205748. Available from: http://hdl.handle.net/1893/26957 and http://hdl.handle.net/1893/27082en_GB
dc.rightsRights covered by the standard CC-BY 4.0 licence: https://creativecommons.org/licenses/by/4.0/en_GB
dc.subjectAutomatic design of algorithmsen_GB
dc.subjectfeaturesen_GB
dc.subjectbin packingen_GB
dc.subject.classification::Information and communication technologies::Artificial Intelligence Technologiesen_GB
dc.titleData for the paper "Relating training instances to Automatic Design of Algorithms for bin packing via features"en_GB
dc.typedataseten_GB
dc.contributor.emailalexander.brownlee@stir.ac.uken_GB
dc.identifier.rmsid1855en_GB
dc.identifier.rmsid1067en_GB
dc.identifier.projectidEP/N002849/1en_GB
dc.identifier.projectidEP/J017515/1en_GB
dc.title.projectFAIME: A Feature based Framework to Automatically Integrate and Improve Metaheuristics via Examplesen_GB
dc.title.projectDAASE: Dynamic Adaptive Automated Software Engineeringen_GB
dc.contributor.affiliationUniversity of Stirling (Computing Science - CSM Dept)en_GB
dc.contributor.affiliationQueen Mary University of Londonen_GB
dc.date.publicationyear2018en_GB
Appears in Collections:University of Stirling Research Data

Files in This Item:
File Description SizeFormat 
data.zip29.86 MBUnknownView/Open
Readme.txt2.46 kBTextView/Open


This item is protected by original copyright



Items in DataSTORRE are protected by copyright, with all rights reserved, unless otherwise indicated.