Skip to content

BergerLab/c_cablast_2_0

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This is an implementation of CaBLAST in ANSI C built from the C implementation of CaBLASTP found on https://github.com/BurntSushi/c_cablastp. It is made to work like Po-Ru Loh's version of the code, which can be found in a prototype at http://cast.csail.mit.edu/.

The compressed databases created by this code are not in any way compatible with the compressed databases produced by the official implementation. Namely, the data itself should be the same (or similar), but its representation on disk is completely different. There may be other, subtler differences as well.

This code is currently under development.

There are no benchmarks.

The official Go implementation of CaBLASTP can be found here: https://github.com/BurntSushi/cablastp

Installation

c_cablastp depends on three libraries: opt, ds and pthread. pthread should be installed via your system's package manager. opt and ds can be found in Andrew Gallant's clibs respository.

Briefly, the following commands should get c_cablastp into a working state:

mkdir c_cablastp
git clone git://github.com/BurntSushi/clibs
git clone git://github.com/BurntSushi/c_cablastp
cd clibs
make
export C_INCLUDE_PATH=$(pwd)/include
export LIBRARY_PATH=$(pwd)/lib
cd ../c_cablastp
make
./cablast-compress --help

The usage of cablast-compress is:

cablast-compress [flags] database-directory fasta-file [fasta-file ...]

Current progress: Can compress and decompress two same-direction matches, two reverse-complement matches, one same-direction match and one reverse-complement match, and two matches that are at least one chunk apart. On real test data, can currently compress and decompress all ten Brucella sequences from the test file, producing as output coarse FASTA files, links tables, and seeds tables that perfectly match these data in Po-Ru Loh's C++ version of the code, and the order of sequences does not affect whether or not compression and decompression work correctly.

Currently working on implementing search. Expansion of coarse BLAST hits produces almost identical output to Po-Ru's version on a large query data set with 9997 entries. Now working on handling overlapping decompressed sequences.

About

An early implementation of CaBLASTP in C. DO NOT USE.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 96.8%
  • Ruby 3.2%