Welcome to NumIO ensembles!

ensembles is a helper for the NumIO benchmarking tool. It aims to be quick to learn and easy to use, and it provides a set of standardized tests.

When running applications in a cluster computing environment, there are often other jobs and applications running in the background. ensembles replicates an environment like this using background daemons. These generate background noise that emulates things like grep or a file download.

Features

This is a list of the features currently available in NumIO ensembles. If you're not sure what it can do, check these:

  • friendly command line interface
  • install by downloading the single file release build
  • add easy-to-customize background workloads to your benchmarks to simulate real cluster computing environments
  • different modes allow standardized tests, which can still be customized to better fit your needs

Setup

dependencies

The script needs srun and NumIO installed. Make sure you have these first!

the binary

ensembles itself is conveniently bundled into a single executable file. Just download the latest release from GitHub and you will be ready to run it on your system.

Background Daemons

Types of Daemon

Besides the main benchmark, NumIO ensembles also spawns some daemons to create background noise. These create a more realistic environment to run the benchmark in. There are a few premade daemons:

Chatty

This daemon creates a lot of network usage. A single program usually does not use this much network bandwidth.

It is still useful to simulate a high combined network load.

Disk

This daemon creates a lot of disk I/O. It can be used to simulate something like a grep task.

CPU

This daemon creates a lot of load on the CPU. You can use it to simulate a compute task or similar.
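To make the idea of background noise concrete, here is a minimal sketch of what a CPU-bound and a disk-bound noise generator could look like. It is not the code in daemon.py; the function names cpu_noise and disk_noise and their parameters are assumptions made up for this illustration.

```python
import os
import tempfile
import time


def cpu_noise(duration_s: float) -> int:
    """Busy-loop for roughly duration_s seconds and return the iteration count."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


def disk_noise(directory: str, chunks: int, chunk_size: int = 1 << 20) -> int:
    """Write `chunks` blocks of random bytes to a temp file, then read them back.

    Returns the number of bytes read. The temporary file is removed on exit.
    """
    payload = os.urandom(chunk_size)
    bytes_read = 0
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        for _ in range(chunks):
            handle.write(payload)
        # Push the data out of Python's and the OS's buffers.
        handle.flush()
        os.fsync(handle.fileno())
        handle.seek(0)
        while True:
            block = handle.read(chunk_size)
            if not block:
                break
            bytes_read += len(block)
    return bytes_read
```

A real daemon would presumably run such a loop until the benchmark finishes instead of for a fixed duration or chunk count.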

Daemons In Advanced Mode

In advanced mode you will find the custom subcommand. This command lets you decide how many of the different daemons you want to run in the background. Use this feature to simulate a workload that is not standardized, but realistic for your system.

Running

You can run ensembles by starting the script. Get a list of subcommands by running ensembles --help. You can also get help for a subcommand by running ensembles ${subcommand} --help. For example, for help with the advanced command run ensembles advanced --help, and for the custom subcommand of advanced run ensembles advanced custom --help.

simple mode

The most basic run looks like ensembles simple. It will run the standard benchmark.

In most cases you will have to specify the path that NumIO is installed in: ensembles --numio-path /home/${yn}/IO-partdiff/numio-posix simple. If you have NumIO installed globally you can skip this.
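If you launch ensembles from another script, the invocation above can be assembled as an argument list. The sketch below only builds the list. The helper name build_command and the path /opt/numio/numio-posix are placeholders invented for this example, and the only thing it takes from the text is that --numio-path goes before the subcommand.

```python
import shlex
from typing import List, Optional


def build_command(mode: List[str], numio_path: Optional[str] = None) -> List[str]:
    """Return the argv list for an ensembles run.

    mode is the subcommand chain, e.g. ["simple"] or ["advanced", "custom"].
    """
    argv = ["ensembles"]
    if numio_path is not None:
        # Global option: placed before the subcommand, as in the example above.
        argv += ["--numio-path", numio_path]
    return argv + list(mode)


# Placeholder path, replace it with your own NumIO location.
print(shlex.join(build_command(["simple"], "/opt/numio/numio-posix")))
```

The resulting list can be handed to subprocess.run without going through a shell, which avoids quoting problems with unusual paths.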

advanced mode

ensembles also has an advanced mode. This mode provides more fine-grained control.

empty

A benchmark with little background noise.

balanced

A benchmark with a medium amount of background noise.

peak

A benchmark with a lot of background noise.

custom

A benchmark with customizable background daemons. You will have to specify the number of background daemons for this one.
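Since custom asks for a number of daemons per type, one way to picture such a request is a small record with one count per daemon type. This is only a sketch: the class DaemonCounts and its field names are hypothetical and simply mirror the three premade daemons (Chatty, Disk, CPU). It does not reflect how ensembles stores these values or what its command line options are called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DaemonCounts:
    """Hypothetical container for a custom background workload."""

    chatty: int = 0
    disk: int = 0
    cpu: int = 0

    def __post_init__(self) -> None:
        # Reject negative or non-integer counts early.
        for name in ("chatty", "disk", "cpu"):
            value = getattr(self, name)
            if not isinstance(value, int) or value < 0:
                raise ValueError(
                    f"{name} must be a non-negative integer, got {value!r}"
                )

    @property
    def total(self) -> int:
        """Total number of background daemons requested."""
        return self.chatty + self.disk + self.cpu
```

Validating the counts in one place keeps a typo such as a negative number from reaching the code that launches the daemons.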

Development

You want to work on this codebase? Great! Just make sure you follow the Black coding style.

The following sections guide you through setting up your environment and the project structure.

Setting up your Environment

If you want to work on this program the setup is different from just running it, which should ideally only require an executable file.

Get python working

There is a good chance this is already working on your system. Check whether python --version and pip --version run without an error. If both work you can skip this step.

If you have access to Spack you can get a Python installation by running spack load --first python. Otherwise you can use your package manager of choice.

Setting up the venv

It is generally not recommended to install Python dependencies globally, since this might mess up your environment. Create a virtual environment instead:

python -m venv /path/to/new/virtual/environment
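The same step can also be done from Python with the standard library venv module, which is what python -m venv uses. The helper name make_venv below is made up for this sketch.

```python
import venv
from pathlib import Path


def make_venv(target: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at target and return the path to its pyvenv.cfg."""
    # with_pip=True bootstraps pip into the new environment via ensurepip.
    venv.create(target, with_pip=with_pip)
    return Path(target) / "pyvenv.cfg"
```

After creating the environment, activate it in your shell (on Linux: source /path/to/new/virtual/environment/bin/activate) before installing the dependencies listed below.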

Dependencies

NumIO ensembles uses a few python libraries.

python 3.9

NumIO ensembles is currently built against Python 3.9, but later versions should work too.
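If you are unsure which interpreter your environment picks up, a quick check against the 3.9 minimum might look like this. The names MINIMUM and is_supported are assumptions for this sketch and are not part of the ensembles codebase.

```python
import sys
from typing import Sequence

# Lowest Python version the project is built against, per the text above.
MINIMUM = (3, 9)


def is_supported(version: Sequence[int] = sys.version_info) -> bool:
    """Return True if the (major, minor) part of version is at least MINIMUM."""
    return tuple(version[:2]) >= MINIMUM


if not is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is older than 3.9")
```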

typer

NumIO ensembles uses typer for parsing CLI arguments and pretty printing help pages.

rich

rich is used for pretty printing the output. It is also used in the background by typer.

typing extensions

typing extensions provide better type hints.

python dateutil

python dateutil is used to convert times to human readable formats.

pyinstaller

PyInstaller is used to generate the executable folder.

Project Structure

The project structure for NumIO ensembles is not very complicated and you should be able to get started easily.

main.py

This serves as the entrypoint to the application.

When developing it is recommended to start the program like so: ./src/main.py [args]

advanced.py

This file contains the definitions for the advanced command line mode with its fine-grained configuration.

batch.py

This file can launch batch jobs.

config.py

This handles general config like logging.

daemon.py

The premade background daemons are stored in this file.

global_vars.py

This stores all global variables.

mpirun.py

This provides classes to interact with srun and mpirun.
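As a rough illustration of what such a wrapper does, the sketch below assembles an srun command line. It is a simplified stand-in and not the classes in mpirun.py: the function name srun_command is invented, and the program name in the usage note is a placeholder. The --ntasks and --nodes options are standard srun options.

```python
from typing import List, Optional


def srun_command(
    program: List[str], ntasks: int, nodes: Optional[int] = None
) -> List[str]:
    """Build an srun argv list that launches program with ntasks tasks."""
    if ntasks < 1:
        raise ValueError("ntasks must be at least 1")
    argv = ["srun", f"--ntasks={ntasks}"]
    if nodes is not None:
        # Only constrain the node count when the caller asks for it.
        argv.append(f"--nodes={nodes}")
    return argv + list(program)
```

For example, srun_command(["numio-posix"], ntasks=4) yields ["srun", "--ntasks=4", "numio-posix"], which can be passed to subprocess.run.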

numio.py

This provides classes to interact with numio.

pretty_print.py

This provides classes to display more complicated things on the command line.

Building the docs

The docs are built with mdBook. Check the mdBook documentation if you want to edit them. A flake.nix and a justfile are provided in the git repo. If you have nix with flakes set up, you can work on the docs by running nix develop. This makes all dependencies available for as long as you are in the shell it creates. Run just -l to list all relevant commands for development.