Skip to content

EMSL-MSC/xltop

 
 

Repository files navigation

xltop is a continuous Lustre monitor with batch scheduler intergation.

The organization of xltop is shown below and consists of a single
master process (xltop-master), an ncurses based client (xltop),
process(es) to push batch scheduler job mappings (xltop-clusd), and for
each MDS/OSS a daemon (xltop-servd) to monitor Lustre export stats via
/proc.

                     xltop-master
                     /    |    \
                    /     |     \
                   /      |      \
              xltop       |      xltop-servd
        (ncurses client)  |  (MDS/OSS monitor daemon)
                          |
                     xltop-clusd
                (job mapping daemon)

xltop-master maintains a hierarchy of flows between consumers (hosts,
jobs, clusters, the universe) and providers (targets (TODO), servers,
filesystems, and the universe).

The design of xltop assumes that:
 0) hosts never run more than one job at a time,
 1) file systems may be mounted by multiple clusters,
 2) hosts may belong to multiple lnets,
 3) two servers may use different NIDs for the same host,
 4) the same lnet name (for example o2ib) may be used multiple times,

Client Usage: xltop [OPTIONS]... [EXPRESSION...]
Expression may be one of:
 owner=OWNER, clus[=CLUS], job[=JOB], host[=HOST], fs[=FS], serv[=SERV]
Types may be given by their first character; use 'u' and 'v' for the universes.

OPTIONS:
 -c, --conf-dir=DIR          read configuration from DIR
 -f, --full-names            show full host, job, and server names
 -h, --help                  display this help and exit
 -i, --interval=SECONDS      update every SECONDS sceonds
 -k, --key=KEY1[,KEY2...]    sort results by KEY1,...
 -l, --limit=NUM             limit responses to NUM results
 -p, --remote-port=PORT      connect to master at PORT
 -r, --remote-host=HOST      connect to master on HOST
 -s, --show-sum              show sums rather than rates
 -u, --ubuntu                look snazzy on my terminal (terrible on xterms)

SORTING:
 Sort keys are case insensitive.  Keys containing 'W' select bytes written
 (as a rate or sum according to use of -s, --show-sum).  Similarly keys
 containing 'R' and 'Q' are interpreted as bytes read and requests.

EXAMPLES:
 xltop job serv (or xltop j s) # Show traffic between jobs and servers
 xltop h=i101-101.ranger.tacc.utexas.edu # Show i101-101 to each filesystem
 xltop o=kbdcat h s # Show hosts in kbdcat's jobs to /scratch servers
 xltop h fs=ranger-scratch s # All hosts to /scratch servers
 xltop h s=oss23.ranger.tacc.utexas.edu # All hosts to oss23
 xltop j=2411369@ranger s # Job 2411369 to all servers

Client screenshot:
    FILESYSTEM      MDS/T  LOAD1  LOAD5 LOAD15  TASKS   OSS/T  LOAD1  LOAD5 LOAD15  TASKS  NIDS
    ranger-work       1/1   1.52   3.48   4.41    609   14/84   2.74   2.08   2.09   1347  4212
    ranger-scratch    1/1   0.13   0.20   0.54    584  50/300   2.52   1.94   1.52   1348  4213
    ranger-share      1/1   0.93   1.20   1.72    544    6/36   3.55   1.37   0.90   1203  3960
    JOB      FS                  WR_MB/S     RD_MB/S      REQS/S  OWNER     NAME        HOSTS
    2526717  ranger-scratch      321.557       5.994    3556.133  tg803155  NST3.28-r0     20
    login4   ranger-scratch       38.489      55.054     469.943  NONE      NONE            1
    2530927  ranger-scratch       16.526       0.000      39.942  dkcira    Parametric      1
    2529449  ranger-work          11.754       0.000      24.088  bealing   PE-OH           4
    2530975  ranger-work          11.108       0.007      23.620  vishnam2  batch          16
    2529064  ranger-scratch       10.823       0.091      30.231  jc64248   runwrf         12
    2530142  ranger-scratch        9.418       0.948      49.386  subin     hnbo           16
    2530928  ranger-scratch        7.991       0.000      18.092  dkcira    Parametric      1
    2530142  ranger-work           7.066       8.501     410.608  subin     hnbo           16
    2528099  ranger-scratch        6.775       0.000      69.664  dvt5076   wrf_nolan      32
    2529489  ranger-work           6.166       0.000      13.467  yhe       tenWcross_     32
    2529788  ranger-work           6.116       0.000      13.332  yhe       tenWcross_     32
    2528098  ranger-scratch        6.022       0.000      67.693  dvt5076   wrf_nolan      32
    2530176  ranger-scratch        5.972       0.000      33.241  tg458470  d2o_mol_32      8
    2530055  ranger-scratch        5.215       0.000      18.108  yping     bivkp           9
    2529790  ranger-work           5.125       0.000      12.169  yhe       tenWcross_     32
    2529920  ranger-work           5.085       0.009      12.057  hli       iascpl         16
    2531057  ranger-work           3.835       0.001       8.347  tibor     tt_thermo_     60
    2530526  ranger-scratch        2.572       0.000      20.855  pchen     mguadtpc       61
    2530179  ranger-scratch        2.554       0.000      24.010  tg458470  d2o_mol_32      8
    2526200  ranger-scratch        2.128       0.000       5.035  tg803418  ca0.1v0.75      1
    2529517  ranger-scratch        1.839       0.000       3.773  lianyuan  usf_fvcom      32
    2526222  ranger-scratch        1.751       0.000       4.738  tg803418  ca0.6v0.75      1
    2528401  ranger-scratch        1.638       0.000       3.397  gnelson   equil_outs      8
    2531086  ranger-work           1.626       0.340      13.370  cliu0001  schmIin_de     15
    2530273  ranger-work           1.563       0.000       5.624  amorris   Thrust4.00      3
    2528400  ranger-scratch        1.528       0.000       3.151  gnelson   equil_midd      8
    2522312  ranger-work           1.089       0.366       3.846  frederic  roms_nep6_     16
    2528755  ranger-scratch        1.003       0.000       2.530  tg458315  h1.5dy0.35      1
    2529262  ranger-scratch        0.992       0.000       2.532  tg459103  pip3            4
    2526210  ranger-scratch        0.973       0.006       2.319  tg803418  ca0.3v0.75      1
    2530934  ranger-scratch        0.913       0.000       5.108  pedram    Shear19        24
    2529294  ranger-scratch        0.895       0.000       2.798  tgaborek  wat-PLL-56      5
    2530294  ranger-share          0.848       0.000      10.359  psr53     CMS21st         4
    2529399  ranger-scratch        0.834       0.000       2.173  tg459103  pip2            4
    2531092  ranger-work           0.755       0.000       7.542  avm       aka2n_74        4
    2527596  ranger-scratch        0.721       0.000      36.401  gmcivor   uli_256        16
    1-37 out of 2730                                           xltop - Wed Apr 25 08:01:15 2012

Build requires ncurses, libcurl, libev-4.4, and libconfuse-2.7.

About

continuous Lustre load monitor

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Roff 98.1%
  • C 1.8%
  • Shell 0.1%