gjbex/mem_io

mem_io

Lots of I/O, really, really fast.

What is it?

Some HPC workloads are embarrassingly parallel in the sense that they consist of a large number of independent computations performed by coordinated, but non-interacting processes. Each of these processes performs I/O operations, potentially on a shared, parallel file system such as Lustre or GPFS.

Many small, independent write operations result in a high load on the metadata servers, which may degrade performance for all jobs on an HPC cluster.

Although it is possible to tune parallel file systems for such workloads, doing so may degrade performance for other I/O patterns. Hence it is useful to reduce the number of metadata I/O operations as much as possible.

mem_io aims to do this without requiring any code modifications to the applications, provided they write to standard output. To achieve this, it uses a high-performance in-memory database to store the output generated by the applications.
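The idea of collecting many small outputs in memory and writing them to the file system in one go can be sketched in plain C. This is not mem_io's implementation: mem_io stores the output in a Redis database, whereas the sketch below uses a growable in-process buffer as a stand-in. The type `mem_buf` and the functions `mem_buf_append`, `mem_buf_flush` and `mem_buf_free` are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the in-memory store: a growable byte buffer.
 * mem_io itself uses a Redis database for this purpose. */
typedef struct {
    char *data;   /* collected output */
    size_t len;   /* number of bytes in use */
    size_t cap;   /* number of bytes allocated */
} mem_buf;

/* Append n bytes of output to the buffer without touching the file system.
 * Returns 0 on success, -1 if memory allocation fails. */
int mem_buf_append(mem_buf *b, const char *src, size_t n) {
    assert(b != NULL);
    if (n == 0)
        return 0;
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        while (cap < b->len + n)
            cap *= 2;
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Write everything collected so far to `out` with a single fwrite call and
 * reset the buffer. Returns the number of bytes written. */
size_t mem_buf_flush(mem_buf *b, FILE *out) {
    assert(b != NULL && out != NULL);
    size_t written = b->len ? fwrite(b->data, 1, b->len, out) : 0;
    b->len = 0;
    return written;
}

/* Release the memory held by the buffer. */
void mem_buf_free(mem_buf *b) {
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A caller would initialise a buffer with `mem_buf b = {0};`, call `mem_buf_append` for each piece of output, and call `mem_buf_flush` once at the end, so that the file system sees one large write instead of many small ones.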

Requirements

  1. A C99-capable C compiler to build the applications.
  2. The m4 macro processor.
  3. The hiredis C library, which implements an API to communicate with a Redis database (https://github.com/redis/hiredis).
  4. Redis (http://redis.io/), an open source data structure server.

For installation instructions, see INSTALL.md.

Documentation

See http://mem-io.readthedocs.org/