Distributed graph benchmark. Inspired by Graph500.
DGraphMark helps to test a parallel computer (a multicore machine, a cluster -- anything that supports MPI) on data-intensive tasks.
For now DGraphMark solves only the BFS (breadth-first search) task, but it can be extended to solve any task that can be solved on a distributed graph.
BFS was chosen as the first task to implement because it is the kernel algorithm of Graph500, the well-known data-intensive benchmark.
What is the difference? Why use DGraphMark?
- A different project model: much more extensible and transparent.
- Multiple implementations of the algorithms; you are not bound to a single one.
- Significantly improved performance: validating the BFS result takes almost as little time as running BFS itself.
- Many build parameters that provide full control.
- It is possible to run several benchmarks or tasks in one run to create complex benchmark builds.
- Results and run statistics are saved to a file in .properties format, which is readable by both machines and humans!
- Self-documenting, well-formatted code.
Tested in:
- Ubuntu-like Linux, GCC 4.6+, MPICH 3.*+
What is done now:
- creation and distribution of the graph
- generation of a directed graph with random (4.1) algorithms: uniform and Kronecker 2x2 (R-MAT);
- optional de-orientation of edges using MPI point-to-point communication;
- building a Compressed Sparse Row (CSR) graph from the initial edge list;
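The CSR-building step above can be sketched as follows. This is a minimal single-process illustration; the names (Edge, CSRGraph, buildCSR) are illustrative, not DGraphMark's actual classes, and the distributed version additionally exchanges edges between MPI ranks.

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch of building a CSR graph from an edge list.
struct Edge { uint64_t from, to; };

struct CSRGraph {
    std::vector<uint64_t> rowStarts; // size = vertexCount + 1
    std::vector<uint64_t> columns;   // size = edgeCount
};

CSRGraph buildCSR(uint64_t vertexCount, const std::vector<Edge>& edges) {
    CSRGraph g;
    g.rowStarts.assign(vertexCount + 1, 0);
    for (const Edge& e : edges)                  // count out-degree of each vertex
        ++g.rowStarts[e.from + 1];
    for (uint64_t v = 1; v <= vertexCount; ++v)  // prefix sums -> row offsets
        g.rowStarts[v] += g.rowStarts[v - 1];
    g.columns.resize(edges.size());
    std::vector<uint64_t> next(g.rowStarts.begin(), g.rowStarts.end() - 1);
    for (const Edge& e : edges)                  // place each edge in its row
        g.columns[next[e.from]++] = e.to;
    return g;
}
```

After this step the neighbours of vertex v are columns[rowStarts[v] .. rowStarts[v+1]), which is the layout BFS traverses.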
- search task (building a tree in the graph, starting from a root vertex)
- root vertices are chosen with the random (4.1) function;
- available BFS tasks:
- dgmark_p2p -- MPI P2P with lock;
- dgmark_rma -- MPI RMA with fetch locking;
- dgmark_p2p_nolock -- MPI P2P without locking. Buffered senders;
- Graph500 RMA. Uses bfs_run function from mpi/bfs_onesided.c (v2.1.4). Refactored and optimized a bit;
- Graph500 P2P. Uses bfs_run function from mpi/bfs_simple.c (v2.1.4);
- the BFS result is a distributed array holding the global parent of each local vertex;
- validation of built tree:
- range validation: make sure parents hold legal values (a global vertex or UNREACHED);
- parent validation: make sure that no vertex is its own parent (except for the root);
- depth validation: make sure that all visited vertices have a valid depth;
- depth-building algorithms (they also check for loops in the parent array):
- buffered -- buffered sending of the depths of visited local vertices (if not sent before), while there is anything to send.
- p2p_noblock -- tries to retrieve the depth of the parent for all visited vertices, while it is possible.
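The search task and the first two validation steps above can be sketched as follows. This is a hedged single-process illustration (DGraphMark distributes the parent array over MPI ranks, and the names bfsParents/validateParents are not the project's actual functions); UNREACHED is shown here as -1 by assumption.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

const int64_t UNREACHED = -1; // assumed sentinel for an unvisited vertex

// BFS builds a tree as a parents array: parent[v] is the vertex from
// which v was first reached. adjacency[v] lists the neighbours of v.
std::vector<int64_t> bfsParents(const std::vector<std::vector<int64_t>>& adjacency,
                                int64_t root) {
    std::vector<int64_t> parent(adjacency.size(), UNREACHED);
    parent[root] = root; // by convention, the root is its own parent
    std::queue<int64_t> q;
    q.push(root);
    while (!q.empty()) {
        int64_t v = q.front(); q.pop();
        for (int64_t u : adjacency[v])
            if (parent[u] == UNREACHED) {
                parent[u] = v;
                q.push(u);
            }
    }
    return parent;
}

// Range and parent validation, as described in the list above.
bool validateParents(const std::vector<int64_t>& parent, int64_t root) {
    int64_t n = (int64_t)parent.size();
    for (int64_t v = 0; v < n; ++v) {
        if (parent[v] != UNREACHED && (parent[v] < 0 || parent[v] >= n))
            return false; // range validation: legal global vertex or UNREACHED
        if (v != root && parent[v] == v)
            return false; // parent validation: no vertex is its own parent
    }
    return parent[root] == root;
}
```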
- generation of statistics:
- statistics consist of the initial data, the duration of several stages, and mathematical statistics about:
- bfs duration;
- validation duration;
- traversed edges count;
- mark value;
- math statistics parameters:
- mean (arithmetic);
- standard deviation;
- relative standard deviation (to mean);
- minimum;
- first quartile;
- median;
- third quartile;
- maximum;
- all statistics are printed to the screen and written to the file "./dgmarkStatistics/dgmark_stat_YYYY-MM-DDThh-mm-ss.properties" in machine-readable format;
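The math-statistics parameters listed above can be computed as in the sketch below. The quartile rule shown (index into the sorted sample) is an assumption for illustration; DGraphMark's exact interpolation rule may differ, and the names Stats/computeStats are not the project's actual identifiers.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative computation of the reported statistics parameters.
struct Stats {
    double mean, stdDev, relStdDev;
    double min, firstQuartile, median, thirdQuartile, max;
};

Stats computeStats(std::vector<double> values) {
    std::sort(values.begin(), values.end());
    std::size_t n = values.size();
    double sum = 0;
    for (double v : values) sum += v;
    Stats s;
    s.mean = sum / n;                          // arithmetic mean
    double sq = 0;
    for (double v : values) sq += (v - s.mean) * (v - s.mean);
    s.stdDev = std::sqrt(sq / n);              // population standard deviation
    s.relStdDev = s.stdDev / s.mean;           // relative to the mean
    s.min = values.front();
    s.max = values.back();
    s.firstQuartile = values[n / 4];           // simple nearest-index quartiles
    s.median = values[n / 2];
    s.thirdQuartile = values[(3 * n) / 4];
    return s;
}
```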
- random numbers generation
- a simple generation method is implemented: results of rand() (from cstdlib) are concatenated to produce a 64-bit random value;
- the seed is generated separately for each node (based on its rank);
- this method is used because it keeps compatibility with compilers without C++11 functions.
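The rand()-concatenation approach above can be sketched as follows. The function names and the base seed constant are illustrative assumptions, not DGraphMark's actual code; the point is that RAND_MAX is guaranteed to be at least 32767, so each rand() call contributes at least 15 random bits.

```cpp
#include <cstdint>
#include <cstdlib>

// Build a 64-bit random value by concatenating 15-bit chunks from rand().
// This stays compatible with pre-C++11 compilers (no <random> needed).
uint64_t rand64() {
    uint64_t value = 0;
    for (int bits = 0; bits < 64; bits += 15)
        value = (value << 15) | (uint64_t)(rand() & 0x7FFF);
    return value;
}

// Per-node seeding: each MPI process seeds from its own rank so that
// nodes draw different sequences. The base seed 12345 is arbitrary.
void seedForRank(int rank) {
    srand(12345u + (unsigned)rank);
}
```

Re-seeding with the same rank reproduces the same sequence, which makes runs repeatable per node.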
What is to be done:
- test on several machines with different architectures and MPI implementations to find problems.