Please use this identifier to cite or link to this item: http://hdl.handle.net/11667/226
Appears in Collections: University of Stirling Research Data
Title: Data and processing scripts for the paper "Comparing Apples and Oranges? Investigating the Consistency of CPU and Memory Profiler Results Across Multiple Java Versions"
Creator(s): Brownlee, Alexander E I
Watkinson, Myles
Contact Email: alexander.brownlee@stir.ac.uk
Date Available: 7-Feb-2024
Citation: Brownlee, AEI; Watkinson, M (2024): Data and processing scripts for the paper "Comparing Apples and Oranges? Investigating the Consistency of CPU and Memory Profiler Results Across Multiple Java Versions". University of Stirling. Dataset, Image, Software. http://hdl.handle.net/11667/226
Publisher: University of Stirling
Dataset Description (Abstract): Data, code, processing scripts, and visualisations for the paper "Comparing Apples and Oranges? Investigating the Consistency of CPU and Memory Profiler Results Across Multiple Java Versions", published in the journal Automated Software Engineering. Results correspond to running different profilers under different Java versions, for several open source projects.
Dataset Description (TOC): **Reproducing the results**

Each project was run by cloning the relevant branch of Gin (as specified in the paper) and running the profiler in a shell script. For example, for disruptor (on Java 9):

```
export JAVA_HOME=/usr/java/jdk-9.0.4/
git clone git@github.com:gintool/gin.git
cd gin
./gradlew clean build
cd ..
git clone git@github.com:LMAX-Exchange/disruptor.git
cd disruptor
git checkout 3.4.2
projectnameforgin='disruptor'
for i in {1..20}
do
  $JAVA_HOME/bin/java -Dtinylog.level=trace -cp ../../../gin-jdk9/gin/build/gin.jar gin.util.Profiler -r 1 -h ~/.sdkman/candidates/maven/current/ -p $projectnameforgin -d . -v 4.10.2 -o $projectnameforgin.Profiler_output_$i.csv &> $projectnameforgin.Profiler_stdoutstderr_$i.txt
  # or for memory:
  # $JAVA_HOME/bin/java -Dtinylog.level=trace -cp ../../../gin-jdk9/gin/build/gin.jar gin.util.MemoryProfiler -r 1 -h ~/.sdkman/candidates/maven/current/ -p $projectnameforgin -d . -v 4.10.2 -o $projectnameforgin.MemoryProfiler_output_$i.csv &> $projectnameforgin.MemoryProfiler_stdoutstderr_$i.txt
done
```

*Note: gson and jcodec require the Java target version in pom.xml to be amended to 1.7 rather than 1.6. For opennlp, the argLine elements must be removed from pom.xml.*

Exports of each branch are also included, under gin-hprof, gin-jfr8, gin-jfr9, and gin-jfr17 respectively.

**Results**

results/projectname* contains the CSVs generated by the relevant profiler for a given project. These are organised into directories for each profiler+JDK+memory/CPU combination. For example, disruptorHPROFMemory contains the profiling results for "disruptor" when running on HPROF, profiling memory; gsonJFR17CPU contains the results for gson, running JFR on JDK17, profiling CPU use.

results/scripts/ contains the scripts used to process the results and generate the figures.
- Compute_Stats.py contains the implementation of WRBO, and computes most of the stats and plots in the paper
  - run runForAll.sh from the directory where you want your figures to be generated. It will call Compute_Stats as needed.
- Compute_History.py generates the heatmap plots
  - run runForAllHistories.sh from the directory where you want your figures to be generated. It will call Compute_History as needed.

**All Figures and Stats**

results/figures/ contains all the plots from the experiments, and text files containing the raw statistics. The conventions we've used in naming the figures and output files are as follows.

The *.txt files are the statistics for each comparison. These are named projectname_type_profilers.txt, so junit4_cpu_jfr8vsjfr17.txt is the comparison of CPU profiling on JFR8 vs CPU profiling on JFR17 for JUnit4. Types are "cpu", "memory" and "cpuvsmem". These files start with a list of counts of methods (Table 3 in the paper); then WRBO (referred to as RBO in the files) for the repeat runs of each profiler (Tables 4 and 7) and WRBO for comparisons between the two (Tables 5, 8, and 10). Then follow the Spearman correlations (Tables 6, 9, and 11). Finally, there are summary stats for the test appearances.

All plots follow this naming convention: project_plottype_plotname_profiler(s)_{CPU,MEM} and are provided as eps and png files.
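The naming scheme for the statistics files described above can be unpacked programmatically when collecting results across projects. The following is a minimal sketch, not part of the replication package: the names `parse_stats_filename` and `StatsFileName` are hypothetical, and it assumes the profilers field has the form `AvsB` (as in the junit4 example), falling back to a single profiler name if no `vs` is present.

```python
from pathlib import PurePath
from typing import NamedTuple, Optional

# The three comparison types listed in the description.
KNOWN_TYPES = {"cpu", "memory", "cpuvsmem"}


class StatsFileName(NamedTuple):
    """Hypothetical container for the parts of a statistics file name."""
    project: str
    comparison_type: str
    first_profiler: str
    second_profiler: Optional[str]


def parse_stats_filename(filename: str) -> StatsFileName:
    """Split 'projectname_type_profilers.txt' into its components.

    Splitting from the right keeps any underscores inside the project
    name intact (an assumption; the listed projects may not need it).
    """
    stem = PurePath(filename).stem  # drop directory and '.txt'
    parts = stem.rsplit("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected statistics file name: {filename!r}")
    project, comparison_type, profilers = parts
    if comparison_type not in KNOWN_TYPES:
        raise ValueError(f"unknown comparison type: {comparison_type!r}")
    # Assumed form 'AvsB'; partition splits on the first 'vs' only.
    first, sep, second = profilers.partition("vs")
    return StatsFileName(project, comparison_type, first, second if sep else None)


if __name__ == "__main__":
    print(parse_stats_filename("junit4_cpu_jfr8vsjfr17.txt"))
```

For the example file name in the description, the sketch yields project `junit4`, type `cpu`, and profilers `jfr8` and `jfr17`.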
The plottype_plotname combinations (with the Figure number in the paper that was drawn from them) are:

Fig 3/8:
- histogram_methodCounts - frequency of individual method appearances across all repeat runs of profiler

Fig 4:
- heatmap_methodRanks - the heatmap showing how method ranks change over the repeat runs

Fig 5:
- histogram_rboWithinRepeatsForUnionMethods - distribution of rbo values (comparing method rankings) computed among all pairs of repeat runs of profiler
- histogram_rboWithinRepeatsForIntersectionMethods - as above, but with rankings filtered to only methods appearing in every repeat run of profiler
- histogram_rboWithinRepeatsForMedianMethodsHPROF - as above, but with rankings filtered to only methods appearing in at least half of the repeat runs of profiler

Fig 6/9/10:
- scatter_meanRanksIntersection - mean ranks of methods appearing in all runs of first profiler vs their rank according to second profiler
- scatter_meanRanksMedian - as above, but mean ranks of methods appearing in at least half of all runs of first profiler
- scatter_top10MeanRanksMedian - as above but limited to the top 10 (might be fewer than 10 points because not all methods ranked by first profiler were also ranked by second)
- scatter_top30MeanRanksMedian - as above but limited to the top 30

Fig 7:
- histogram_testCountsForMethodX - frequency of individual test appearances among tests identified for method ranked X

Not appearing in the paper; histograms for the data in Tables 5, 8, and 10:
- histogram_rboComparingProfilersForUnionMethods - distribution of rbo values (comparing method rankings) computed among all pairs of repeat runs of profiler1 and profiler2
- histogram_rboComparingProfilersForIntersectionMethods - as above, but with rankings filtered to only methods appearing in every repeat run of profiler
- histogram_rboComparingProfilersForMedianMethods - as above, but with rankings filtered to only methods appearing in at least half of the repeat runs of profiler

For example, disruptor_histogram_rboComparingProfilersForIntersectionMethods_JFR9_CPUvsJFR9_Memory is the plot for disruptor, showing the distribution of WRBO values when comparing repeat runs of the JFR9 CPU profiler and the JFR9 memory profiler, with results filtered to intersection methods (i.e., those appearing in all repeat runs).
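The three filters named in the plot descriptions (union, intersection, and "median", meaning methods that appear in at least half of the repeat runs) can be illustrated as follows. This is a sketch under stated assumptions rather than the code in Compute_Stats.py: `filter_methods` and `within_profiler_pairs` are hypothetical names, each run is assumed to be a ranked list of method names, and the similarity measure itself (WRBO) is deliberately left out.

```python
from collections import Counter
from itertools import combinations


def filter_methods(runs, mode):
    """Filter each ranked list to methods seen in enough repeat runs.

    runs: list of ranked lists of method names, one per repeat run.
    mode: 'union' (seen in any run), 'intersection' (seen in every run),
          or 'median' (seen in at least half of the runs).
    The order of methods within each run is preserved.
    """
    n = len(runs)
    # Count in how many runs each method appears (once per run).
    counts = Counter(m for run in runs for m in set(run))
    thresholds = {"union": 1, "intersection": n, "median": n / 2}
    if mode not in thresholds:
        raise ValueError(f"unknown mode: {mode!r}")
    keep = {m for m, c in counts.items() if c >= thresholds[mode]}
    return [[m for m in run if m in keep] for run in runs]


def within_profiler_pairs(runs):
    """All unordered pairs of repeat runs of one profiler.

    Assumption: each pair would then be scored with a rank-similarity
    measure (WRBO in the paper) to build the histogram of values.
    """
    return list(combinations(runs, 2))


if __name__ == "__main__":
    # Toy rankings, invented purely for illustration.
    runs = [["a", "b", "c"], ["a", "c", "d"], ["b", "a"], ["a", "b", "c"]]
    print(filter_methods(runs, "intersection"))
    print(filter_methods(runs, "median"))
    print(len(within_profiler_pairs(runs)))
```

With the 20 repeat runs used in the reproduction script, enumerating unordered pairs in this way gives 190 pairs per profiler.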
Type: dataset
image
software
Funder(s): University of Stirling
URI: http://hdl.handle.net/11667/226
Rights: Rights covered by the standard CC-BY 4.0 licence: https://creativecommons.org/licenses/by/4.0/
Affiliation(s) of Dataset Creator(s): University of Stirling (Computing Science - CSM Dept)
University of Adelaide

Files in This Item:
File: replication_package.zip
Size: 348.93 MB
Format: ZIP

