
Running Benchmarks

The LF repository contains a series of benchmarks in the benchmark directory. There is also a flexible benchmark runner that automates the process of running benchmarks with various settings and collecting the results. It is located in benchmark/runner. The runner is written in Python and is based on hydra, a tool for dynamically creating hierarchical configurations by composition.
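
As a rough illustration of how such a hydra-based script is structured (this is only a sketch with made-up file and configuration names, not the actual code in benchmark/runner), a minimal hydra application looks like this:

import hydra
from omegaconf import DictConfig, OmegaConf

# hydra composes the final configuration from YAML files in the given
# config directory plus any key=value overrides passed on the command line.
@hydra.main(config_path="conf", config_name="default")
def main(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))  # show the fully composed configuration

if __name__ == "__main__":
    main()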

Prerequisites

Install Python dependencies

The benchmark runner is written in Python and requires a working Python 3 installation. It also requires a few Python packages to be installed, namely hydra-core, cog, and pandas.

It is recommended to install the dependencies and run the benchmark runner in a virtual environment. For instance, such an environment can be set up with virtualenv:

virtualenv ~/virtualenvs/lfrunner -p python3
source ~/virtualenvs/lfrunner/bin/activate

Then the dependencies can be installed by running:

pip install -r benchmark/runner/requirements.txt

Compile lfc

For running LF benchmarks, the command-line compiler lfc needs to be built. Simply run

bin/build-lfc

in the root directory of the LF repository.

Also, the environment variable LF_PATH needs to be set to point to the location of the LF repository.

export LF_PATH=/path/to/lf

Setup Savina

For running Akka benchmarks from the original Savina benchmark suite, the suite needs to be downloaded and compiled. Note that we require a modified version here that expects a parameter specifying the number of worker threads.

git clone https://github.com/tud-ccc/savina.git
cd savina
mvn install

Building Savina requires a Java 8 JDK. Depending on the local setup, JAVA_HOME might need to be adjusted before running mvn in order to point to the correct JDK.

export JAVA_HOME=/path/to/jdk8

Also, the environment variable SAVINA_PATH needs to be set to point to the location of the Savina repository.

export SAVINA_PATH=/path/to/savina

Running a benchmark

A benchmark can simply be run by specifying a benchmark and a target. For instance,

cd benchmark/runner
./run_benchmark.py benchmark=savina/micro/pingpong target=lf-c

runs the Ping Pong benchmark from the Savina suite using the C target of LF. Currently, the supported targets are lf-c, lf-cpp, and akka, where akka corresponds to the Akka implementation in the original Savina suite.

The benchmarks can also be configured. The threads and iterations parameters apply to every benchmark and specify the number of worker threads as well as how many times the benchmark should be run. Most benchmarks allow additional parameters. For instance, the Ping Pong benchmark sends a configurable number of pings that can be set via the benchmark.params.messages configuration key. Running the Akka version of the Ping Pong benchmark with 1000 messages, 1 thread, and 12 iterations could be done like this:

./run_benchmark.py benchmark=savina/micro/pingpong target=akka threads=1 iterations=12 benchmark.params.messages=1000

Each benchmark run produces an output directory of the form outputs/<date>/<time>/ (e.g. outputs/2020-12-17/16-46-16/). This directory contains a file results.csv, which records the measured execution time of each iteration as well as all the parameters used for this particular benchmark run. The CSV file contains precisely one row per iteration.
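
For quick inspection, results.csv can be loaded with pandas (already installed as a runner dependency). The following is only a sketch: the path has to be adapted to an actual output directory, and apart from there being one row per iteration, the exact set of columns is not spelled out here:

import pandas as pd

# Load the per-iteration results of a single benchmark run.
# Replace the path with the output directory of your own run.
df = pd.read_csv("outputs/2020-12-17/16-46-16/results.csv")

print(df.columns.tolist())  # parameter and timing columns of this run
print(df.describe())        # basic statistics over the iterations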

Running a series of benchmarks (multirun)

The runner can also automatically run a single benchmark or a series of benchmarks with a range of settings. This multirun feature is enabled with the -m switch. For instance:

./run_benchmark.py benchmark=savina/micro/pingpong target="glob(*)" threads=1,2,4 iterations=12 benchmark.params.messages="range(1000000,10000000,1000000)"

runs the Ping Pong benchmark for all targets using 1, 2 and 4 threads and for a number of messages ranging from 1M to 10M (in 1M steps).

This mechanism can also be used to run multiple benchmarks. For instance,

./run_benchmark.py benchmark="glob(*)" target="glob(*)" threads=4 iterations=12

runs all benchmarks for all targets using 4 threads and 12 iterations.

The results of a multirun are written to directories of the form multirun/<date>/<time>/<n> (e.g. multirun/2020-12-17/17-11-03/0/), where <n> denotes the particular run. Each of the <n> subdirectories contains a results.csv for that run.
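
To illustrate this layout, the individual results.csv files of a multirun could also be gathered by hand with pandas; this is only a sketch, and collect_results.py (described below) automates the job:

from pathlib import Path
import pandas as pd

# Gather every per-run results.csv below a multirun directory
# (replace the date and time with those of your own multirun).
base = Path("multirun/2020-12-17/17-11-03")
frames = [pd.read_csv(p) for p in sorted(base.glob("*/results.csv"))]
merged = pd.concat(frames, ignore_index=True)
print(merged)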

Collecting results from multirun

A second script called collect_results.py provides a convenient way to collect the results of a multirun and merge them into a single CSV file. Simply running

./collect_results.py multirun/<date>/<time>/ out.csv

collects all results from that particular multirun and stores the merged data in out.csv. collect_results.py not only merges the results, it also calculates the minimum, maximum, and median execution time for each individual run. The resulting CSV no longer contains the measured values of individual iterations and contains only a single row per run. This behavior can be disabled with the --raw command line flag. With the flag set, the results from all runs are merged as they are and keep the data of the individual iterations, but no minimum, maximum, or median values.
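
As an illustration of the aggregation step, the per-run minimum, maximum, and median could be computed from a raw merge with pandas roughly as follows; the column names used here ("run" and "time_ms") are assumptions and may not match the actual CSV layout produced by the runner:

import pandas as pd

# Sketch only: assumes one row per iteration in the raw CSV, a "run"
# column identifying the run, and a "time_ms" column with the measured
# execution time; the real column names may differ.
df = pd.read_csv("out.csv")
summary = df.groupby("run")["time_ms"].agg(["min", "max", "median"])
print(summary)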

Adding new benchmarks

TODO