Skip to content

mdmohsinali/SGCT-Based-Fault-Tolerant-2D-SFI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

/************************************************************************************ 
 * File       : README                                                              *
 * Description: Contains detailed information about how to use the code             *
 *              to execute the 2D fault-tolerant Solid Fuel Ignition application    *
 * Author     : Md Mohsin Ali (mohsin.ali<AT>anu.edu.au)                            *
 * Created    : December 2015                                                       *
 * Updated    : September 2016                                                      *
 ************************************************************************************/

1.  HOW TO RUN?
    
    make clean; make; make run2d;

2.  WHAT ARE THE PARAMETERS OF THE PROGRAM?

    Execute the following to find out the parameters to be passed
    ./app_raijin_cluster -h

3.  WHAT IS THE TESTED SYSTEM CONFIGURATION?

    (a) Git revision icldistcomp-ulfm-46b781a8f170 (as of 13 December 2014)
    (b) gcc version 4.6.4 (GCC)
    (c) PETSc (version 3.5.2 and 3.6.4) was installed with the following configuration
        ./configure --prefix=PATH --with-mpi-dir=PATH PETSC_ARCH=linux-gnu \
        --download-fblaslapack=1 --with-shared-libraries=1 PETSC_DIR=PATH/petsc \
        PETSC_ARCH=linux-gnu --with-64-bit-indices=0
    (d) Head node of Raijin cluster (http://nci.org.au/~raijin/)

4.  IS THERE ANY ALTERNATIVE VERSION OF ULFM MPI TO TEST WITH?

    Yes. If the above git revision does not meet the requirements, there is
    another way as follows to try.

    This needs a combination of files from two git revisions. Install
    git revision icldistcomp-ulfm-3bc561b48416 (as of Mid-June 2013)
    with the following updates from git revision 
    icldistcomp-ulfm-c351e0e792a8 (as of 21 May 2014).

    (a) /ompi/mca/coll/ftbasic/coll_ftbasic_agreement_earlyterminating.c
    (b) /ompi/mca/coll/base/coll_tags.h
    (c) /ompi/mca/coll/ftbasic/coll_ftbasic.h
    (d) /ompi/mca/coll/ftbasic/coll_ftbasic_component.c
    (e) /ompi/mca/coll/ftbasic/coll_ftbasic_module.c
    (f) /ompi/communicator/comm_ft.c
    (g) /ompi/communicator/comm_ft_revoke.c
    (h) /ompi/errhandler/errhandler_rte.c
    (i) /ompi/mca/pml/ob1/pml_ob1.c

5.  WHAT CAN I DO IF THE EXECUTION HANGS ON?

    The sleep times in "ProcsFailureRecovery.cpp" file could be increased
    to fix this problem. The default sleep times are set as "sleep(2);"
    and "sleep(5);".

6.  WHAT CHANGES ARE NEEDED TO RUN ON COMPUTE NODES?

    Remove single-line comment from "#define RUN_ON_COMPUTE_NODES" 
    in "FailureRecovery.cpp" file and run program with a machinefile 
    named "hostfile"

7.  WHAT CHANGES ARE NEEDED TO SPAWN REPLACEMENT PROCESSES ON SPARE NODES TO HANDLE
    NODE FAILURES?

    Remove single-line comment from "#define RUN_ON_COMPUTE_NODES" and 
    "#define RECOV_ON_SPARE_NODES" in "FailureRecovery.cpp" file and run 
    program with a machinefile named "hostfile"

8.  IS THERE ANY LICENSE TO USE THE CODE?

    Yes. See the LICENSE file.

9.  IS THERE ANY OTHER WAY TO ACKNOWLEDGE THE USE OF THE CODE?

    Yes. Cite the following article and PhD Thesis.
    @article{ali-sgct-ft-app-ijhpca-2016,
        author = {Ali, Md Mohsin and Strazdins, Peter E and Harding, Brendan and Hegland, Markus},
        title = {Complex Scientific Applications Made Fault-Tolerant with the Sparse Grid Combination Technique},
        journal = {International Journal of High Performance Computing Applications (IJHPCA 2016)},
        volume = {30},
        number = {3},
        pages = {335--359},
        doi = {10.1177/1094342015628056},
        days = {},
        month = {},
        year = {2016},     
    }

    @PhdThesis{ali:2016:hpcftpdes,
        author = {Ali, Md Mohsin},
        title = {High Performance Fault-Tolerant Solution of PDEs 
                 using the Sparse Grid Combination Technique},
        school = {The Australian National University},
        year = {2016},
    }

10. HOW TO CONTACT IF THERE IS ANY PROBLEM?

    Send an email to mohsin.ali<AT>anu.edu.au 

About

C/C++ and Fortran 90 source code for the sparse grid combination technique based 2D fault-tolerant Solid Fuel Ignition application

Resources

License

Unknown, BSD-3-Clause licenses found

Licenses found

Unknown
LICENSE
BSD-3-Clause
LICENSE_FT_CODE

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published