forked from COMBINE-lab/staden-io_lib
GitHub clone of SVN repo svn://svn.code.sf.net/p/staden/code/io_lib (cloned by http://svn2github.com/)
hzhang1234/staden-io_lib
IO_LIB VERSION 1.13.7
=====================

Io_lib is a library of file reading and writing code to provide a
general purpose trace file (and Experiment File) reading interface.
The programmer simply calls the (eg) read_reading to create a "Read" C
structure with the data loaded into memory. It has been compiled and
tested on a variety of unix systems, MacOS X and MS Windows.

The directories below here contain the io_lib code. These support the
following file formats:

    SCF trace files
    ABI trace files
    ALF trace files
    ZTR trace files
    SFF trace archives
    SRF trace archives
    Experiment files
    Plain text files
    SAM/BAM sequence files
    CRAM sequence files

These link together to form a single "libstaden-read" library
supporting all the file formats via a single read_reading (or
fread_reading or mfread_reading) function call and analogous
write_reading functions too. See the file include/Read.h for the
generic 'Read' structure.


What's new in 1.13.7
====================

Bug fixes for CRAM, in particular scram_flagstat when running on
Cramtools output. Also made some speed ups to the multi-threading
support when running under I/O bound conditions.


What's new in 1.13.6
====================

Several more CRAM and BAM bug fixes.

CRAM has also had a major restructuring to use more external blocks
for data series, potentially improving compression ratios. This will
become more important once v3.0 is finalised, but it still helps v2.1
CRAM format a bit too. This change also allows for decoding to be
faster when only needing a few columns, such as with scram_flagstat.


What's new in 1.13.5
====================

Two bug fixes to CRAM involving computation of MD5sums for both the
@SQ line and also the slice headers. See the CHANGES or ChangeLog file
for details.


What's new in 1.13.4
====================

The CRAM specification has updated to version 2.1 and now comes with
EOF blocks. This is now the default output version.
Scramble now also has a -B option to perform the Illumina lossy 8 way
quality-binning.

CRAM version 3.0 is under discussion and scramble contains some highly
experimental options to deal with this (-J for rANS / arithmetic
coding, block level CRC32s), but these are disabled by default and
should not be used except for research purposes.

Also fixed a few bugs elsewhere, most notably in BAM decoding and
index_tar.


What's new in 1.13.3
====================

Another bug fix release, primarily focused around CRAM support. The
most significant fixes here are to multi-threading (do not use
threading in 1.13.2) and handling of fetching reference sequences.

Improved robustness of code too, in particular when facing broken
data.


What's new in 1.13.2
====================

This release has various improvements and bug fixes to CRAM support,
in addition to experimental multi-threading code for reading and
writing BAM/CRAM (but not SAM). By default this is not used, but use
the -t option in scramble to enable it. Multi-threading scales well
with BAM reading and writing, but CRAM currently has diminishing
returns after 4 or 5 threads.


What's new in 1.13.1
====================

This is primarily a bug fix release over 1.13.0 with all changes being
in the SAM/BAM/CRAM support.

The main new feature is the ability to store unsorted data and to
permit non-reference based encoding (albeit not very efficiently).
There is also now support for finding references by a colon separated
search path (REF_PATH environment variable), which may contain URLs in
the same manner that TRACE_PATH does. We can also locally cache files
accessed by the MD5 sum to the REF_CACHE environment variable.

See the CHANGES file for full details.


What's new in 1.13.0
====================

The library has acquired functions for reading and writing SAM, BAM
and CRAM formats along with a command line tool for converting between
them named "scramble". See the scramble man page for more information.
At the time of writing the CRAM-2.0 specification is in draft form
with the official release two months away. This code is subject to any
changes that may occur between the current and official CRAM release.
Note that it should also be considered beta quality, given a relative
lack of testing and real-world experience with CRAM.

The command line tools (scramble) are unlikely to change
substantially, but the sam, bam, cram and "scram" API may undergo
considerable changes. So please consider the sam/bam/cram code as beta
with the rest of io_lib as a stable release.


Older comments
==============

1.12.x saw various improvements to building and linking, specifically
on Fedora and MacOS X, plus the use of libtool to create dynamic
libraries. The library name is now libstaden-read.so too, as this was
already renamed within Debian. We removed illumina2srf and
srf2illumina in this release too (they have their own package on
SourceForge now).

In 1.11.x the SRF support was added. The SRF v1.3 format specification
can be found here:

http://www.bcgsc.ca/pipermail/ssrformat/attachments/20071209/b0f865a0/ShortSequenceFormatDec9th_v_1_3-0001.doc

The ZTR specification changes involve adding some new compression
types (the general purpose XRLE2 plus some more solexa specific TSHIFT
and QSHIFT methods), a region chunk (REGN) to indicate the location of
paired-end data stored in a single trace, improved meta-data support
for SMP4/SAMP chunks including specifying the baseline (OFFS meta-data
tag) and various minor tweaks. There are still a few open questions
about the ZTR format itself (pending feedback), but what is currently
implemented matches what is described in the docs/ZTR_format file.

Finally the directory layout has been greatly simplified with the
merging of all the format directories into a single "io_lib" directory
and the programs utilising it remaining in the "progs" subdirectory.


Building
========

Linux
-----

We use the GNU autoconf build mechanism.

To build:

1. ./configure

   "./configure --help" will give a list of the options for GNU
   autoconf. For modifying the compiler options or flags you may wish
   to redefine the CC or CFLAGS variable.

   Eg (in sh or bash):
       CC=cc CFLAGS=-g ./configure

2. make (or gmake)

   This will build the sources.

   CFLAGS may also be changed at build time using (eg):
       make 'CFLAGS=-g ...'

3. make install

   The default installation location is /usr/local/bin and
   /usr/local/lib. These can be changed with the --prefix option to
   "configure".


Windows
-------

Under Microsoft Windows we recommend the use of MSYS and MINGW as a
build environment. These contain enough tools to build using the
configure script as per Linux. Visit
http://sourceforge.net/projects/mingw/files/ and download/install
Automated MinGW Installer (eg MinGW-5.1.4.exe), MSYS Base System (eg
MSYS-1.0.11.exe) and MSYS Supplementary Tools (eg msysDTK-1.0.1.exe).


MacOS X
-------

The configure script should work by default, but if you are attempting
to build FAT binaries to work on both i386 and ppc targets you'll need
to disable dependency tracking. Ie:

    CFLAGS="-arch i386 -arch ppc" LDFLAGS="-arch i386 -arch ppc" \
      ../configure --disable-dependency-tracking
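
As an illustration of the lossy 8 way quality binning that the 1.13.4
notes mention for scramble's -B option, here is a minimal sketch in C.
It is not taken from the io_lib sources: the function names
bin_quality and bin_qualities are hypothetical, and the bin boundaries
are an assumption based on Illumina's published 8-level scheme, so the
table scramble actually uses may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Map one Phred quality value to a representative value for its bin.
 * ASSUMPTION: boundaries follow Illumina's published 8-level scheme;
 * they do not come from this README or from the io_lib sources.
 * Values 0 and 1 are passed through unchanged (also an assumption).
 */
unsigned char bin_quality(unsigned char q) {
    if (q < 2)  return q;
    if (q < 10) return 6;    /*  2-9  */
    if (q < 20) return 15;   /* 10-19 */
    if (q < 25) return 22;   /* 20-24 */
    if (q < 30) return 27;   /* 25-29 */
    if (q < 35) return 33;   /* 30-34 */
    if (q < 40) return 37;   /* 35-39 */
    return 40;               /* 40 and above */
}

/* Bin a buffer of raw (not ASCII-offset) quality values in place. */
void bin_qualities(unsigned char *qual, size_t len) {
    size_t i;

    assert(qual != NULL || len == 0);
    for (i = 0; i < len; i++)
        qual[i] = bin_quality(qual[i]);
}
```

Binning is lossy: once values are collapsed to a bin representative
the original qualities cannot be recovered, which is the trade-off for
the smaller alphabet that compresses better.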