pj_compensate is an experimental tool to compensate for tracer intrusion in pjdump traces.
None (except libc)
make
This file is written in Emacs’ org-mode and guides the user
step-by-step through the complete usage of pj_compensate, using the
example trace provided in this repository.
The FLAGS_OPT variable below should be adjusted to match the compile
flags of the application that was traced. Execute the elisp code block
below before proceeding.
(setq FLAGS_OPT "-O2 -march=native ")
(setq FLAGS_WARN "-Wall -Wextra -Wpedantic -Wformat-security -Wshadow -Wconversion -Wfloat-equal ")
(setq FLAGS_STD "-std=c99 ")
(setq FLAGS_EXTRA "-D_POSIX_C_SOURCE=200112L ")
(setq FLAGS_ALL (concat FLAGS_STD FLAGS_EXTRA FLAGS_OPT FLAGS_WARN))
First of all, given a Pajé trace file, we need to use pj_dump to
convert it to pj_compensate’s input format. Alternatively, given an
OTF trace file, there is an otf2pjdump script available with
Akypuera.
We’ll be using the supplied example trace
example/simple.paje
pj_dump -u -n -l 15 "$pajetrace"
The user should benchmark his tracer to get the mean execution time of the logging routine. This can be done by isolating the logging routine and taking the mean of its execution time, though this presents various issues and should be done with care. TODO: Link article explaining how to do this properly.
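A minimal sketch of the naive approach described above: call the isolated logging routine many times and average the elapsed time, after subtracting the mean cost of the timer calls themselves. `dummy_log` is a hypothetical stand-in for the tracer's real logging routine and `mean_call_time` is an illustrative helper; neither comes from pj_compensate or Akypuera. The sketch does not address the measurement issues mentioned above, so the result is only a rough estimate.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Seconds elapsed between two timespecs. */
static double elapsed_sec(struct timespec s, struct timespec e)
{
  return difftime(e.tv_sec, s.tv_sec) + (double)(e.tv_nsec - s.tv_nsec) * 1e-9;
}

/* Hypothetical stand-in for the tracer's logging routine. */
static volatile unsigned long dummy_log_calls = 0;
static void dummy_log(void)
{
  dummy_log_calls++;
}

/* Mean wall-clock time of one call to fn over iters calls, with the mean
   cost of an empty clock_gettime pair subtracted; clamped at 0. */
static double mean_call_time(void (*fn)(void), size_t iters)
{
  struct timespec s, e;
  double overhead = 0, total = 0;
  assert(fn != NULL);
  if (iters == 0)
    return 0;
  /* Estimate the cost of the timer itself. */
  for (size_t i = 0; i < iters; i++) {
    clock_gettime(CLOCK_REALTIME, &s);
    clock_gettime(CLOCK_REALTIME, &e);
    overhead += elapsed_sec(s, e);
  }
  /* Time the routine, one call per measurement. */
  for (size_t i = 0; i < iters; i++) {
    clock_gettime(CLOCK_REALTIME, &s);
    fn();
    clock_gettime(CLOCK_REALTIME, &e);
    total += elapsed_sec(s, e);
  }
  double mean = (total - overhead) / (double)iters;
  return mean > 0 ? mean : 0;
}
```

Calling `mean_call_time(dummy_log, 100000)` returns an estimate in seconds; replace `dummy_log` with a wrapper around the real logging routine to obtain a candidate value for the OVERHEAD argument.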
The user should benchmark message copy time on his architecture.
This section provides scripts to do that, assuming messages are copied
between buffers using memcpy. Notice that the results should be
written to a CSV file with a special format (see Running the
benchmark).
TODO: Should special care be taken for this also?
GNU coreutils
Take the mean of this number of iterations as the measured value
30
grep "$trace" -e ^Link | cut -d',' -f11 | perl -ne 'print unless $seen{$_}++'
f=`mktemp`
for i in `seq $byteiters`; do echo $uniquebytes | perl -pe 's/ /\n/g' >>"$f"; done
shuf "$f"
rm "$f"
echo $uniquebytes | perl -pe 's/ /\n/g' | sort -h | tail -n 1
Just execute the following code block
#define TIMESPEC2SEC(s, e)\
(difftime((e).tv_sec, (s).tv_sec) + (double)((e).tv_nsec - (s).tv_nsec) * 1e-9)
printf("%d\n", bytes_rows);
/* Twice max bytes: destination at buff, source at buff + max */
char *buff = malloc(2 * (size_t)max);
if (!buff)
exit(EXIT_FAILURE);
struct timespec s, e;
double timer_overhead = 0;
for (int i = 0; i < bytes_rows; i++) {
clock_gettime(CLOCK_REALTIME, &s);
clock_gettime(CLOCK_REALTIME, &e);
timer_overhead += TIMESPEC2SEC(s, e);
}
timer_overhead /= (double)bytes_rows;
for (int i = 0; i < bytes_rows; i++) {
clock_gettime(CLOCK_REALTIME, &s);
memcpy(buff, buff + max, (size_t)bytes[i][0]);
clock_gettime(CLOCK_REALTIME, &e);
double ans = TIMESPEC2SEC(s, e) - timer_overhead;
printf("%d %.15f\n", bytes[i][0], ans > 0 ? ans : 0);
}
free(buff);
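Each size is measured several times, and the mean over those iterations is taken as the measured value. The sketch below shows one way to compute that mean from the printed "size time" rows. `mean_time_for_size` is a hypothetical helper, not part of pj_compensate, and this section does not say whether the averaging happens in a later script or inside the tool itself.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of times[i] over all i with sizes[i] == size.
   Returns -1.0 when no measurement has that size. */
static double mean_time_for_size(const int *sizes, const double *times,
                                 size_t n, int size)
{
  assert(sizes != NULL && times != NULL);
  double sum = 0;
  size_t count = 0;
  for (size_t i = 0; i < n; i++) {
    if (sizes[i] == size) {
      sum += times[i];
      count++;
    }
  }
  return count > 0 ? sum / (double)count : -1.0;
}
```

Because the sizes are shuffled before measuring, rows for the same size are scattered through the output, which is why the helper scans the whole array instead of assuming contiguous runs.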
Now that we have all the data and the input trace in the correct format, all we need to do is compensate it.
./pj_compensate --help
Usage: pj_compensate [OPTION...] ORIGINAL-TRACE COPYTIME-DATA OVERHEAD SYNC-BYTES
Outputs a trace compensating for Aky's intrusion

  -l, --lower                Use a lower instead of upper bound for approximated communication times
  -?, --help                 Give this help list
      --usage                Give a short usage message
  -v, --version              Print version
Here SYNC-BYTES is the threshold above which messages should be treated
as synchronous. For instance, with the SM BTL for OpenMPI 1.6.5, MPI_Send
is synchronous if the message + header size is > 4096, the header size
being dependent on the byte transfer layer (see here for more).
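To make the rule concrete, the sketch below classifies a message by comparing its size against the threshold, and derives a threshold from a "message + header" limit as in the OpenMPI example. `is_synchronous` and `sync_bytes_for` are illustrative names, not part of pj_compensate, and the 64-byte header in the usage note is an arbitrary assumption; the real header size depends on the BTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A message is treated as synchronous when it is strictly larger
   than the SYNC-BYTES threshold. */
static bool is_synchronous(size_t msg_bytes, size_t sync_bytes)
{
  return msg_bytes > sync_bytes;
}

/* Largest payload that is not synchronous when "message + header"
   must not exceed limit_bytes: msg + header > limit  <=>  msg > limit - header. */
static size_t sync_bytes_for(size_t limit_bytes, size_t header_bytes)
{
  assert(header_bytes <= limit_bytes);
  return limit_bytes - header_bytes;
}
```

With a limit of 4096 and an assumed 64-byte header, `sync_bytes_for(4096, 64)` gives 4032: a 4032-byte message is not treated as synchronous, while a 4033-byte message is.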
This section describes the internals of pj_compensate and is
intended for developers interested in modifying it.
The file one is interested in is src/compensation.c. All others are
“means to an end”.
find -name '*.h' | xargs head -n 1
==> ./include/queue.h <==
/* State and link queue implementations (see also events.h) */

==> ./include/ref.h <==
/* Simple reference counting data structure for embedding, for internal use. */

==> ./include/prng.h <==
/* pseudo ranodm double between 0 and 1, uniformally distributed */

==> ./include/args.h <==
/* Argument parsing */

==> ./include/logging.h <==
/* A simple logging macro and some wrappers */

==> ./include/reader.h <==
/* Routines to read binaries generated by Aky and structs to store the data */

==> ./include/uthash.h <==
/* The famous uthash macro lib */

==> ./include/compensation.h <==
/* Routines to compensate event timestamps */

==> ./include/utlist.h <==
/* The famous utlist macro lib */

==> ./include/events.h <==
/* Ref counted event structs (States and Links) and associated routines */
find -name '*.c' | grep -v example | xargs head -n 1
==> ./src/reader.c <==
/* See the header file for contracts and more docs */

==> ./src/pj_compensate.c <==
/* Main application */

==> ./src/queue.c <==
/* See the header file for contracts and more docs */

==> ./src/compensation.c <==
/* See the header file for contracts and more docs */

==> ./src/pj_dump_read.c <==
/* Read a pj_dump trace file into the event queues */

==> ./src/events.c <==
/* See the header file for contracts and more docs */
There is currently no test suite. See TODO for metrics to compare results between compensation methods.
Send/Recv linking (see pj_compensate.c:link_sends_recvs):