A typical Unix-like environment with basic development tools is required; all other dependencies are fetched automatically.
- curl
- gzip
- gcc
- GNU make
These tools should be part of every Linux distribution.
- automatic dependency injection with Makefile rules, e.g. the dependency BWA is only downloaded & compiled if needed
- fully automated installation of required tools from their source
- where possible, wildcard pattern rules instead of explicit per-file rules (e.g. a .bwt target runs bwt)
- make will generate a dependency chain for our data and regenerate outdated files
- be able to run the latest software on any machine
- use versioning for all dependencies to allow reproducibility
- use make -j8 (will automatically parallelize wherever possible)
- we must be able to run the newest software on old clusters, hence everything needs to be built from scratch
- some software licenses don't allow binary releases (GATK)
- we need to be able to inspect & modify the source code
- we want to understand every bit of our pipeline
- if something doesn't build, it's outdated anyhow
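
The on-demand installation of tools described in the list above could look roughly like the following Makefile fragment. It is a sketch under stated assumptions, not this pipeline's actual rules: the version number, the download URL, and the variable names (`BWA_VERSION`, `BWA_URL`, `BWA_TARBALL`, `BWA_SRC`) are chosen for illustration. Only the `build` and `progs` folder names are taken from this README.

```makefile
# Hypothetical sketch: version, URL and variable names are assumptions.
BWA_VERSION := 0.7.17
BWA_URL     := https://github.com/lh3/bwa/archive/refs/tags/v$(BWA_VERSION).tar.gz
BWA_TARBALL := build/bwa-$(BWA_VERSION).tar.gz
BWA_SRC     := build/bwa-$(BWA_VERSION)

# Remove half-written targets if a recipe fails (e.g. an interrupted download).
.DELETE_ON_ERROR:

# Create the folders on demand.
build progs:
	mkdir -p $@

# Download the pinned source release, only if the tarball is missing.
$(BWA_TARBALL): | build
	curl -fsSL -o $@ $(BWA_URL)

# Unpack the sources; the archive is assumed to unpack into bwa-$(BWA_VERSION)/.
# Order-only prerequisite: directory timestamps must not trigger re-extraction.
$(BWA_SRC): | $(BWA_TARBALL)
	tar -xzf $(BWA_TARBALL) -C build

# Compile BWA and copy the executable into progs/.
progs/bwa: | $(BWA_SRC) progs
	$(MAKE) -C $(BWA_SRC)
	cp $(BWA_SRC)/bwa $@
```

Because every step is an ordinary target, nothing is downloaded or compiled until some other rule actually asks for `progs/bwa`, and pinning the version in one variable keeps the build reproducible.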
Dependencies are automatically generated, e.g.
- bwa -> bwt files
- mason_variator needs fai
- GATK needs fai and dict
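
The dependency chains listed above map naturally onto pattern rules. The fragment below is a minimal sketch: the tool locations under `progs/` and the use of `samtools` to create `.fai` files are assumptions, since this README does not show the exact commands the pipeline runs.

```makefile
# Hypothetical sketch: tool locations and the use of samtools are assumptions.

# Any <ref>.fa.bwt is produced by indexing <ref>.fa with BWA.
# The tool is an order-only prerequisite, so it is built first if it is missing.
%.fa.bwt: %.fa | progs/bwa
	progs/bwa index $<

# Any <ref>.fa.fai is produced by samtools faidx.
%.fa.fai: %.fa | progs/samtools
	progs/samtools faidx $<
```

With rules of this shape, requesting a hypothetical `data/ref.fa.bwt` would first build `progs/bwa` if needed and then create the index; a later request finds the file up to date and does nothing.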
build
: all required build tools and source code of required programs

data
: all experimental results

debug
: temporary folder that is used to print some debug information for tests

perm
: permanent data that shouldn't be deleted

progs
: all required executables of the pipeline

src
: this pipeline's source code
make all
: runs the entire pipeline

make test
: runs D unittests
Pick out any intermediate file in the pipeline and run make <file> to generate it.
- due to limitations in the Makefile, the working directory must be the directory containing this Makefile
- the folder names are hard-coded on purpose