=========================================================================
Tangram 0.3.0        Release Distribution Documentation        2013-06-17
Author: Jiantao Wu (jiantaowu.xining@gmail.com)
        Wan-Ping Lee (wanping.lee@bc.edu)
Marth Lab [1], Boston College Biology Department
=========================================================================


Introduction
=========================

Tangram is a C/C++ command line toolbox for structural variation (SV) 
detection. It takes advantage of both read-pair and split-read algorithms 
and is extremely fast and memory-efficient. Powered by the Bamtools API 
[3], Tangram can call SV events on multiple BAM files (a population) 
simultaneously to increase sensitivity on low-coverage datasets. 
Currently it reports mobile element insertions (MEI). Other SV event 
types will be introduced soon. For SNP calling and short INDEL calling, 
please check another toolbox from our lab: FreeBayes [4].


Obtaining and Compiling
=========================

> git clone git://github.com/jiantao/Tangram.git
> cd Tangram/src
> make


Detection pipeline
=========================

Currently, Tangram contains six sub-programs:

0. tangram_bam    : If the input BAM files were not generated by MOSAIK [2],
                    tangram_bam will add the ZA tags that are necessary for
                    the following steps.

1. tangram_scan   : Scan through the BAM file and calculate the fragment 
                    length distribution for each library in that BAM file. 
                    It will output the fragment length distribution files 
                    for each input BAM file.

2. tangram_merge  : If more than one BAM file needs to be scanned, this 
                    program will combine all the fragment length distribution 
                    files together. It will output the merged fragment length 
                    distribution file that enables detection on multiple 
                    BAM files simultaneously. This step is optional if only 
                    one BAM file (pooled BAM file) is used.

3. tangram_index  : Index the normal and special (MEI sequences) reference 
                    files. It will output the indexed reference file. This 
                    step is required for the split-read algorithm.

4. tangram_detect : Detect and genotype the SV events from the MOSAIK aligned 
                    BAM files. It will output the unfiltered VCF files.

5. tangram_filter : Filter the raw VCF file generated by the detector.
                    NOTE: this program requires the windowBed 
                    (from bedtools) [5], Unix sort and grep to be in the 
                    default path.
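
The prerequisite noted for tangram_filter can be checked before running the
pipeline. The following is a minimal sketch, not part of the Tangram
distribution: the helper name missing_tools is hypothetical, and it only
assumes a POSIX-style shell that provides "command -v".

```shell
# Hypothetical helper (not shipped with Tangram): print the name of every
# listed tool that cannot be found on the PATH, one per line.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# tangram_filter needs windowBed (from bedtools), sort and grep.
if missing="$(missing_tools windowBed sort grep)"; then
    echo "all tangram_filter prerequisites found"
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing from PATH:" $missing
fi
```

If windowBed is reported as missing, installing bedtools [5] and adding its
bin directory to the PATH should resolve it.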

The overall detection pipeline for Tangram looks like the following:

tangram_bam
(BAM Input)
      \
       \
   tangram_scan  \
   (BAM Input)    \
                   -----> tangram_detect --> tangram_filter --> VCF file(s)
                  /       (BAM input)
   tangram_index /
   (Ref Fasta)

For the detailed usage of each program, please run "$PROGRAM -help".
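
To browse the usage of all six sub-programs in one pass, a small loop can
call each one with -help. This is only a convenience sketch: the function
name show_tangram_help is hypothetical, and it assumes the compiled binaries
are on the PATH (programs that are not found are reported and skipped).

```shell
# Hypothetical wrapper (not shipped with Tangram): print the -help text of
# every Tangram sub-program named in this document.
show_tangram_help() {
    for prog in tangram_bam tangram_scan tangram_merge \
                tangram_index tangram_detect tangram_filter; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "=== $prog ==="
            # Ignore the exit status; some tools exit non-zero after
            # printing their help text.
            "$prog" -help || true
        else
            echo "=== $prog: not found in PATH ==="
        fi
    done
}

show_tangram_help
```

Piping the output through a pager (for example "show_tangram_help | less")
makes the combined help text easier to read.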


Bug Report
=========================

Please report bugs using the built-in issue tracker on GitHub or 
by sending the authors an email.


References
=========================

[1] http://bioinformatics.bc.edu/marthlab/Main_Page 
[2] https://github.com/wanpinglee/MOSAIK 
[3] https://github.com/pezmaster31/bamtools
[4] https://github.com/ekg/freebayes
[5] http://code.google.com/p/bedtools
