- Remove duplicates
rm_dup\remove_dup_PE.py
- Split libraries based on barcode.txt
split_library_pairend.py
- Recover fragment for each library
recoverFragment
- Split partners and classify different types of fragments.
split_partner.py
- Align to the genome for both parts of "paired" fragments.
Stitch-seq_Aligner.py
- Determine the RNA types of different parts within fragment.
RNA_composition.py
- (Alternative) Find linker sequences within the library.
find_linker_new.py
- Determine strong interactions from output of step 5.
Select_strongInteraction_pp.py (for parallel computing); Select_strongInteraction.py (regular).
We can also do BED annotation on different cis-features: bed_annotation.py
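As an illustration of the first step above, here is a minimal sketch of duplicate removal for paired-end reads. It assumes that two read pairs are duplicates when both mate sequences are identical; this is an assumption for illustration and not necessarily the rule used by remove_dup_PE.py. The function name `remove_duplicate_pairs` is hypothetical.

```python
def remove_duplicate_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) sequence pair.

    `pairs` is an iterable of (read1_sequence, read2_sequence) tuples.
    Assumption: a duplicate is an exact match on both mates.
    """
    seen = set()
    unique = []
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    reads = [("ACGT", "TTGA"), ("ACGT", "TTGA"), ("ACGT", "CCCC")]
    for read1, read2 in remove_duplicate_pairs(reads):
        print(read1, read2)
```

Matching on both mates keeps pairs that share only one read, which matters when the same forward read is paired with different partners.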
- NNNXXXXNN - miRNA - AUC UGG UAA UCC GUA UAA AGU AUG UUG AUG UUC CAA UAA GCA GAU CAU GUU UUU UAA GCC GUC A - mRNA
- NNNXXXXNN - miRNA - UAA GCA GAU CAU GUU UUU UAA GCC GUC A - mRNA
- NNNXXXXNN - miRNA - AUC UGG UAA UCC GUA UAA AGU AUG UUG AUG UUC CAA - mRNA
- NNNXXXXNN - miRNA - AUC UGG UAA UCC GUA UAA AGU AUG UUG AUG UUC CAA
- NNNXXXXNN - UAA GCA GAU CAU GUU UUU UAA GCC GUC A - mRNA
- NNNXXXXNN - mRNA
- NNNXXXXNN - miRNA (less likely)
linkers:
- UAA GCA GAU CAU GUU UUU UAA GCC GUC A
- AUC UGG UAA UCC GUA UAA AGU AUG UUG AUG UUC CAA
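The fragment types above differ in which linker(s) are present and in what flanks them. A minimal sketch of that classification is shown below. It assumes the barcode has already been removed, that reads use the DNA alphabet (T instead of U), and that linkers match exactly; the names `LINKER_A`, `LINKER_B`, and `classify_fragment` are hypothetical and this is not the logic of split_partner.py.

```python
# Linker sequences as listed above (RNA alphabet; spaces are visual grouping only).
LINKER_A = "UAA GCA GAU CAU GUU UUU UAA GCC GUC A"
LINKER_B = "AUC UGG UAA UCC GUA UAA AGU AUG UUG AUG UUC CAA"


def to_dna(rna):
    """Drop spaces and convert U to T so the linker can be searched in reads."""
    return rna.replace(" ", "").replace("U", "T")


def classify_fragment(seq):
    """Return (label, left_part, right_part) for a barcode-free fragment.

    left_part is everything before the first linker found and right_part is
    everything after the last one. Assumption: exact matches only.
    """
    hits = {}
    for name, linker in (("A", to_dna(LINKER_A)), ("B", to_dna(LINKER_B))):
        pos = seq.find(linker)
        if pos >= 0:
            hits[name] = (pos, pos + len(linker))
    if not hits:
        return ("no_linker", seq, "")
    start = min(s for s, _ in hits.values())
    end = max(e for _, e in hits.values())
    label = "linker_" + "+".join(sorted(hits))
    return (label, seq[:start], seq[end:])


if __name__ == "__main__":
    fragment = "ACGTACGT" + to_dna(LINKER_B) + to_dna(LINKER_A) + "GGGCCC"
    print(classify_fragment(fragment))
```

An empty left or right part corresponds to the types above where only one partner is attached to the linker. Real reads contain sequencing errors, so a production version would need mismatch-tolerant matching.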
* type1:
(reverse primer)
forward reads: XXXX...XXXXNAGATCGGAAGAGCGGTTCAG
||||...||||
reverse reads: TGTGCTGCGAGAAGGCTAGANXXXX...XXXX
(forward primer)
* type2:
forward reads: XXXXX...XXXXXXXXXXX...XXXX
||||...||||
reverse reads: XXXX...XXXXXXXXXXX...XXXX
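For the type2 case above, where the two reads overlap only partially, the fragment can be rebuilt by joining the forward read with the reverse complement of the reverse read at their overlap. The sketch below assumes exact matching in the overlap and a hypothetical `min_overlap` parameter; it is not the algorithm used by recoverFragment, and type1 pairs would additionally need primer trimming.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def merge_overlapping_pair(forward, reverse, min_overlap=10):
    """Merge a read pair whose ends overlap; return None if no overlap is found.

    Tries the longest possible overlap first, so the shortest consistent
    fragment is reported. Assumption: the overlap matches exactly.
    """
    rc = reverse_complement(reverse)
    max_len = min(len(forward), len(rc))
    for k in range(max_len, min_overlap - 1, -1):
        # Suffix of the forward read must equal prefix of the flipped reverse read.
        if forward[-k:] == rc[:k]:
            return forward + rc[k:]
    return None


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGTTAGCATCGA"
    forward_read = fragment[:20]
    reverse_read = reverse_complement(fragment[10:])
    print(merge_overlapping_pair(forward_read, reverse_read))
```

A small `min_overlap` raises the chance of joining unrelated reads by coincidence, so the threshold is a trade-off between recovery rate and false merges.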
Python libraries [python 2.x]:
* Biopython
* Pysam
* BAM2X
* numpy, scipy
* Parallel Python (only for Select_strongInteraction_pp.py)
* PyCogent (for annotation of RNA types) [see Notes]