weallen/Seqwill
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
TODO - Write stuff for reading/writing metadata to tracks in hdf5 file - Write wrapper around TrackIO to make TrackFile - Write SparseTrack w/ Block and MemoryManager to allocate Blocks efficiently w/ anonymous mmap'd memory arena - - Refactor Track inferface into == DESIGN NOTES == === BAM LOADING === Done using the BamTools library. === PARALLELIZATION == Will be done using Intel's Thread Building Block library. === HDF5 === There are two likely kinds of data access patterns for the data stored in the HDF5 files: 1. Iteration: Loop over each chromosome, and load some subset of the tracks on each chromosome. E.g. use in an HMM. 2. Random access: Load some subset of the tracks on some slice of some chromosome (less than the entire chromosome). May also be looping over the chormosomes, and looping over subsets of the chromosome; in which case, may make sense to load the whole chromosome? E.g. use when comparing specific subsets. Same holds for seq data. Structure of the HDF5 files: / Attr: TrackNames, ChrNames /chr1 data -- 2D array -- cols are base positions, rows are tracks /seq chr1 -- 1D array, cols are base positions ... chrX == COPYRIGHT NOTES == Most files in the HDF5 directory are from the Pacific Biosciences SMRTanalysis software, and so are (C) Pacific Biosciences 2010.
About
High-throughput sequencing storage and analysis framework
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published