branchrates

Branchrates calculates branch-specific evolutionary rates given a set of taxa, a phylogenetic tree for those taxa, and a set of homologous traits across all taxa. The project has not been actively developed for some time.

Input data are loaded by calling one function per input: the taxa (taxon.matrix.read_file()), a phylogenetic tree for those taxa with branch lengths (tree.read_internal_tree()), a set of homologous traits across all given taxa (trait_matrix.read_file()), and a mapping of branch-level parameters to true parameters for each branch of the tree (ratep_map.read_file()).

Please read the PDF documentation for an overview of mapping concepts, as my use of mapping introduces what are probably the most confusing aspects of the implementation. Each branch of the tree has two parameters for character evolution: a forward rate (character gain) and a backward rate (character loss). The pair of rates is managed as a unit, with each branch holding an id for the pair and a pointer to the single instantiation of BranchRateManager, which is queried for the rates. BranchRateManager manages the mapping between these branch-level parameters and the "true" parameters. The "true" parameters are kept in a private instantiation of RatePVector, which is initialized from the mapping described in a RatePMap instantiation by calling BranchRateManager.allocate_ratep_from_map().

When computing the likelihood of a tree, the branch-level rates are retrieved from BranchRateManager using the id for the pair. In contrast, parameter adjustments made while maximizing the likelihood are applied directly to the RatePVector through the BranchRateManager. Either way, the value is ultimately held in the RatePVector; the branch-level rates are simply an abstraction over it, provided by the mapping.
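The two access paths described above can be illustrated with a minimal sketch. This is not the project's code: the names RateManagerSketch, RatePairIndex, RatePair, rates_for, and set_true_param are hypothetical, and the real BranchRateManager, RatePMap, and RatePVector classes may be organized quite differently. It only shows the idea that several branches can share the same "true" parameters through a table of indices.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of the mapping idea; not the project's actual classes.

// For one branch-level rate pair: which "true" parameters it refers to.
struct RatePairIndex {
    std::size_t forward;   // index of the gain rate in the true-parameter vector
    std::size_t backward;  // index of the loss rate in the true-parameter vector
};

// The rate values a branch sees when it asks for its pair.
struct RatePair {
    double forward;
    double backward;
};

class RateManagerSketch {
public:
    // n_true: number of "true" parameters; map: one entry per pair id.
    RateManagerSketch(std::size_t n_true, std::vector<RatePairIndex> map)
        : true_params_(n_true, 0.0), map_(std::move(map)) {
        for (const RatePairIndex& m : map_) {
            assert(m.forward < n_true && m.backward < n_true);
        }
    }

    // Likelihood path: a branch looks up its rates by pair id.
    RatePair rates_for(std::size_t pair_id) const {
        const RatePairIndex& m = map_.at(pair_id);
        return RatePair{true_params_[m.forward], true_params_[m.backward]};
    }

    // Optimizer path: adjust a "true" parameter directly.
    void set_true_param(std::size_t i, double value) { true_params_.at(i) = value; }

    std::size_t n_true_params() const { return true_params_.size(); }

private:
    std::vector<double> true_params_;  // stands in for the private RatePVector
    std::vector<RatePairIndex> map_;   // stands in for what a RatePMap describes
};
```

With a map such as {{0, 1}, {0, 1}, {1, 0}}, three branch pairs are served by only two true parameters, so the optimizer works in two dimensions while every branch still receives its own forward and backward rate.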

The only current maximization implementation that I have confidence in for a large number of parameters is the Nelder-Mead downhill simplex method, as implemented in the ML_multi_DownhillSimplex class. The current implementation is not licensed for public distribution and will be replaced by the one from the GNU Scientific Library.

Current Setup

The current branchrates uses a dataset from the Koonin lab to calculate a number of evolutionary parameters for intron birth and death. The phylogenetic tree is one I determined using PAUP, with branch lengths multiplied by 100. This tree is kept in an internal array because I don't yet have a method for reading an external tree; one is partially present but not completed. The list of taxa, the trait matrix, and the parameter mapping are all kept in external files, as is clear from main().

The output includes a summary of the TraitMatrix, the RatePMap, and the PhyloTree, followed by the output of the maximization process. Each "amoeba()" line includes the low and high likelihoods computed among the vertices of the simplex, which can be used to follow the progress of the search. I've allowed for 30 restarts; a restart is initiated when the difference between the likelihoods within the simplex falls below a minimum. At the end of each restart, the current estimates of the parameter values are printed. Given the 30 restarts, a run takes quite a while to finish.

The method ML_multi::profile_likelihoods_print prints out parameter profiles surrounding each parameter estimate based on chi-square likelihood ratio criteria.
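As an illustration of a chi-square likelihood ratio criterion, the sketch below finds the point at which a one-parameter log-likelihood has dropped by half the 95% critical value of a chi-square distribution with one degree of freedom (3.841 / 2, about 1.9207). It is an assumption-laden sketch, not what ML_multi::profile_likelihoods_print does: the function profile_bound and the constant kDropFor95 are hypothetical, the search uses plain bisection, and the other parameters are assumed to be handled inside the supplied lnL callable (held fixed or re-maximized, as the caller chooses).

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Half the 95% critical value of chi-square with 1 degree of freedom.
const double kDropFor95 = 1.9207;

// Hypothetical helper: locate, by bisection, the parameter value between
// theta_hat (the estimate) and theta_far at which lnL has fallen by `drop`
// below lnL(theta_hat).
// Assumes lnL(theta_far) is already below that threshold and that lnL
// crosses the threshold once between the two points.
double profile_bound(const std::function<double(double)>& lnL,
                     double theta_hat, double theta_far,
                     double drop = kDropFor95, double tol = 1e-9) {
    const double target = lnL(theta_hat) - drop;
    assert(lnL(theta_far) < target);

    double inside = theta_hat;   // still within the interval
    double outside = theta_far;  // already outside the interval
    while (std::fabs(outside - inside) > tol) {
        const double mid = 0.5 * (inside + outside);
        if (lnL(mid) >= target) {
            inside = mid;
        } else {
            outside = mid;
        }
    }
    return 0.5 * (inside + outside);
}
```

Calling profile_bound once with theta_far above the estimate and once with theta_far below it gives the two ends of an approximate 95% interval for that parameter.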

