Author: Bjoern Bastian
To read this file as HTML, call make doc.
This is not yet meant to be full documentation but rather a minimal guideline for typical usage of the provided scripts.
- Create a topic directory, e.g. topic.
- Copy makefiles/Makefile.template to topic/Makefile and set prefix.
- Make a copy of template.mk in makefiles, e.g. topic.mk.
- Make a first git commit with the message "topic: Basic structure".
- Adjust datadirs and include directives in topic/Makefile.
- Optionally write scripts, e.g. makefiles/scripts/scriptname.sh.
- Define targets, rules, macros and default settings in makefiles/topic.mk.
- In a makefile scripts are called as $(SCR)/scriptname.sh.
- In a script another script is called as ${scripts}/scriptname.sh.
- Update INFO, INFOend and INFOADD in makefiles/topic.mk.
- For each target in INFO or INFOend, define a description INFO_target.
- Set default targets and optionally define settings in topic/Makefile.
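The two copy steps at the start of the list above can be sketched as a small shell function. This is only an illustration under assumptions: the function name new_topic and its arguments are hypothetical and not part of the repository, and the topic directory is assumed to be created next to makefiles/. Setting prefix, the first git commit and all later steps remain manual.

```shell
# new_topic ROOT NAME  (hypothetical helper, not part of the repository)
# Creates ROOT/NAME/Makefile and ROOT/makefiles/NAME.mk from the templates.
new_topic() {
    root=$1
    name=$2
    # Refuse to overwrite files of an existing topic.
    if [ -e "$root/$name/Makefile" ] || [ -e "$root/makefiles/$name.mk" ]; then
        echo "topic '$name' already exists" >&2
        return 1
    fi
    mkdir -p "$root/$name" &&
    cp "$root/makefiles/Makefile.template" "$root/$name/Makefile" &&
    cp "$root/makefiles/template.mk" "$root/makefiles/$name.mk"
}

# Possible usage from the repository root:
#   new_topic . topic
#   (edit prefix in topic/Makefile, then commit with "topic: Basic structure")
```

The function only copies files; it deliberately does not edit prefix, since how that variable must be set depends on where the topic directory lives.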
In contrast to the workflow described here, subdirectories of standalone/ are supposed to be used to apply one analysis to data files as they are.
Obtain a new clone of the repository as main directory:
git clone git://github.com/basbjo/ddlangevin_make name_of_working_directory
- To verify the configuration and selected data, use the makefile targets show or showconf, showdata and showmacros in any directory.
- Get a target description with make info and use make -n [target] to see what make will do when calling a specific target.
Select a series of transformations by setting projtargets in config.mk. Currently, colselect, cossin, pca and tica are available. The first two may be considered as preprocessing, the latter two as final transformations. If projtargets is empty, further analysis is applied directly on the data. To select tica, consider calling:
git merge origin/tica
which adds a few further changes instead of setting projtargets manually.
For better performance, tica in fact only applies a time-delayed principal component analysis on normalized data, so it requires the principal component analysis results as input data. To apply TICA, you must select pca tica.
Target     Description                                                       Suffix
colselect  Select a range of columns.                                        .ic
cossin     Select a range of columns and write out cos- and sin-transforms.  .cs
pca        Apply principal component analysis (PCA).                         .pca
tica       Apply time-lagged independent component analysis (TICA).          .tica
To apply a dihedral PCA (dPCA), set projtargets = cossin pca. The suffix of the dPCA projected data will then be .cs.pca.
- Put source data (dihedral angles) into main directory and define TIME_UNIT, the wildcard RAWDATA for source data and the IF_FUTURE value in config.mk as described there.
- For multi-trajectory data files (IF_FUTURE = 1), colselect or cossin is required to avoid taking into account the follower column. E.g. to obtain TICA projections, select projtargets = colselect pca tica.
- For colselect and cossin, select the first and last column of the source data to be considered as MIN_COL and MAX_COL in config.mk. The defaults are the first column and the last data column.
- For tica, select LAG_TIMES (unit: time frames) in config.mk.
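To make the cossin step and the role of MIN_COL and MAX_COL more concrete, here is a minimal awk sketch of such a transformation. It is not the repository's script: the function name cossin_sketch is hypothetical, and it assumes whitespace-separated columns, angles in degrees, and an output order of cos then sin for each selected column.

```shell
# cossin_sketch MIN_COL MAX_COL  (hypothetical name; reads stdin, writes stdout)
# Assumptions: angles in degrees, output is "cos sin" per selected column.
cossin_sketch() {
    awk -v min="$1" -v max="$2" '
        BEGIN { deg2rad = atan2(0, -1) / 180 }   # pi / 180
        /^#/  { next }                           # skip comment lines
        {
            line = ""
            for (i = min; i <= max; i++)
                line = line sprintf("%s%.6f %.6f", (i > min ? " " : ""),
                                    cos($i * deg2rad), sin($i * deg2rad))
            print line
        }'
}

# Demo: two angle columns plus a third follower-like column that is
# excluded by selecting columns 1 to 2 only.
printf '0 60 1\n30 120 0\n' | cossin_sketch 1 2
```

Restricting the column range is what keeps a trailing follower column out of the transformed data, which is the reason colselect or cossin is needed for multi-trajectory files.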
Perform transformations to obtain projected data to work with:
make
Split projected trajectories before calculating histograms/correlations/etc.:
make split
Subdirectories besides histogram and correlation may be used likewise.
To generate histograms, you may first calculate and then plot them:
cd histogram/
make calc
make plot
If this does not work, you probably have to call the split or minmax target in the main directory (the minmax file is used to define comparable bins).
To generate correlations, you may first calculate and then plot them:
cd ../correlation/
make estim
make calc
#alternatively make plot
make plot_all
If this does not work, you probably have to call the split target in the main directory. Note that the target estim must be finished before calling calc and the latter before calling plot_all.
To recreate plots after changes in config.mk in the main directory, call:
make del_plots; make plot_all
For convenience, the plot_all target should always exist even if it is equivalent to the plot target.
You can project data and (partially) calculate results in the subdirectories histogram and correlation with a one-liner:
make; make split; make correlation histogram
where it may be convenient to use -j [number] for parallelization. The default make target is called in each subdirectory. If plots and maybe other targets shall be created with the same call, add the desired targets to the variable all in the subdirectory makefiles. However, in correlation it is necessary to finish the target estim before calling calc and to finish the latter before calling plot_all.
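The ordering constraint in correlation can be respected by running the three targets as separate make calls, each of which may still run in parallel internally. The following is a small sketch; the function name run_correlation is hypothetical and not part of the repository, while the target names and the -j option are those mentioned above.

```shell
# run_correlation DIR [JOBS]  (hypothetical helper)
# Runs estim, calc and plot_all one after another; each call only starts
# if the previous one succeeded. JOBS defaults to 1.
run_correlation() {
    dir=$1
    njobs=${2:-1}
    make -C "$dir" -j "$njobs" estim &&
    make -C "$dir" -j "$njobs" calc &&
    make -C "$dir" -j "$njobs" plot_all
}

# Possible usage from the main directory after "make; make split":
#   run_correlation correlation 4
```

Separate calls are used because passing all three targets to a single parallel make call would only be safe if the makefiles declare the dependencies between them, which the text above does not state.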
To obtain a set of downsampled projected trajectories including trajectories with all possible starting points, set REDUCTION_FACTORS in config.mk and call:
make downsampling
Sets of trajectories with one starting point are saved in downsampling/.
Downsampled data is by default taken into account by the split target but ignored in the subdirectories histogram/ and correlation/, see DATA_LINK in the subdirectory makefiles.
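What "all possible starting points" means for a given reduction factor can be illustrated with a short awk sketch. It is not the repository's implementation: the function name downsample_sketch is hypothetical, and the sketch assumes a single trajectory with one frame per line and no follower column.

```shell
# downsample_sketch FACTOR START  (hypothetical name; reads stdin)
# Keeps every FACTOR-th frame, beginning at frame START (1-based).
downsample_sketch() {
    awk -v n="$1" -v s="$2" 'NR >= s && (NR - s) % n == 0'
}

# Demo: 7 frames, reduction factor 3, one trajectory per starting point.
frames=$(printf '%s\n' 1 2 3 4 5 6 7)
for start in 1 2 3; do
    kept=$(printf '%s\n' "$frames" | downsample_sketch 3 "$start" | paste -s -d ' ' -)
    echo "start $start: $kept"
done
```

For a reduction factor n there are n such starting points, so every original frame ends up in exactly one of the downsampled trajectories.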
Go to directory langevin/ and usually make a copy of template/:
cd langevin/
cp -r template/ new_data/
cd new_data/
Create links to projected data and optionally create files with few columns:
make
make file.3cols # example to extract 3 columns from file
When extracting columns, the last column is kept as well if IF_FUTURE=1.
Provide derived data files and update localconf.mk, for example:
SPLIT_LIST = *.lang
SPLIT_FUTURE = 1
for filenames with the suffix .lang and if the last column is 1 or 0 to denote ends of consecutive trajectories (else set SPLIT_FUTURE=0).
Filenames must start with exact names of the projected data files and may contain additional information before the suffix.
Split trajectories by calling make or make split:
make split
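The idea of splitting on the last column can be sketched with awk as follows. This is an illustration only, not the split target's actual code: the function name split_sketch and the output naming (input name plus -01, -02, ...) are hypothetical, and the sketch assumes that a last-column value of 0 marks the final frame of a trajectory while 1 means the next line continues it.

```shell
# split_sketch FILE  (hypothetical name and output naming)
# Assumption: last column 1 = next frame follows, 0 = end of trajectory.
split_sketch() {
    awk -v base="$1" '
        BEGIN { n = 1 }
        {
            out = sprintf("%s-%02d", base, n)
            print > out                          # all columns are kept as they are
            if ($NF == 0) { close(out); n++ }    # start a new file after an end marker
        }' "$1"
}

# Demo in a scratch directory: two trajectories with 3 and 2 frames.
workdir=$(mktemp -d)
printf '0.1 1\n0.2 1\n0.3 0\n0.7 1\n0.8 0\n' > "$workdir/demo.lang"
split_sketch "$workdir/demo.lang"
for f in "$workdir"/demo.lang-*; do
    echo "$(basename "$f"): $(wc -l < "$f" | tr -d ' ') frames"
done
```

If the real data uses the opposite convention for the follower column, the comparison in the sketch would have to be adapted accordingly.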
To generate histograms, you may first calculate and then plot them:
cd histogram/
make calc
make plot
If this does not work, you probably have to call the split target in the parent directory or minmax in the main directory (the minmax file is used to define comparable bins).
If a similar histogram file exists in the histogram/ subdirectory of the main directory, it is used as a reference file to set plot ranges. If no exactly matching reference file is found, filenames with different time steps are also tried as a reference, which is useful when working on downsampled data.
To generate correlations, you may first calculate and then plot them:
cd ../correlation/
make estim
make calc
#alternatively make plot
make plot_all
If this does not work, you probably have to call the split target in the parent directory. Note that the target estim must be finished before calling calc and the latter before calling plot_all.
To recreate plots after changes in config.mk or when new reference data is provided in the main directory, call:
make del_plots; make plot_all
For convenience, the plot_all target should always exist even if it is equivalent to the plot target.
- Subdirectories besides histogram and correlation may be used likewise. Use make info and make show to see what will happen.
You can split data into single trajectories and calculate results in the subdirectories histogram and correlation with a one-liner:
make split; make correlation histogram
where it may be convenient to use -j [number] for parallelization. The default make target is called in each subdirectory. If plots and maybe other targets shall be created with the same call, add the desired targets to the variable all in the subdirectory makefiles. However, in correlation it is necessary to finish the target estim before calling calc and to finish the latter before calling plot_all.
You should generally first read the section Workflow for the main directory although this may not be necessary to work with the subdirectories of the standalone directory.
Subdirectories of standalone/ are supposed to be used to apply one analysis to given data files without using the main directory infrastructure. Most of these directories are described in this section, while some standalone-specific topics are described in Documentation for specific topics: Standalone directories. Here only the general configuration of the subdirectory makefiles is described.
- langevin
- langevin/template
- downsampling
- correlation
- histogram
- clustering
- drift
- information
- fields (langevin only)
- neighbors (langevin only)
Here only topic directories that are specific to the standalone/ directory are described. See Documentation for specific topics: Topic directories for further topic directory descriptions.
- Data selection and projection for the main directory is described in Configuration. The projections described below can also be applied by using the respective subdirectories of the standalone/ directory. See standalone for the general configuration when using the standalone makefiles.
- Data splitting and concatenation: The files split.mk and cat.mk are typically included in other makefiles to split multi-trajectory data into single trajectories or to concatenate single trajectories into multi-trajectory data.
- pca
- tica
- dpca
- dtica
- split
- cat