Finding non-biallelic SNPs

SNPs are used as markers for various population genetic purposes. Non-biallelic SNPs are of particular interest for identification: to reach the same discriminative power, more biallelic SNPs are required than non-biallelic ones.
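The claim about discriminative power can be illustrated with a small calculation. The sketch below is not part of this project; it assumes an idealised setting (Hardy-Weinberg equilibrium, independent loci, equally frequent alleles), and the function names `match_probability` and `loci_needed` are hypothetical. It computes the probability that two unrelated individuals share a genotype at one locus, and how many such loci would be needed to push the combined match probability below a chosen target.

```python
import math
from itertools import combinations_with_replacement


def match_probability(freqs):
    """Probability that two unrelated individuals share a genotype at one
    locus, assuming Hardy-Weinberg equilibrium (an idealisation)."""
    total = 0.0
    for i, j in combinations_with_replacement(range(len(freqs)), 2):
        # Genotype frequency: p^2 for homozygotes, 2pq for heterozygotes.
        genotype = freqs[i] ** 2 if i == j else 2 * freqs[i] * freqs[j]
        # Both individuals must carry this genotype.
        total += genotype ** 2
    return total


def loci_needed(freqs, target):
    """Smallest number of independent, identical loci whose combined
    match probability is at most `target`."""
    return math.ceil(math.log(target) / math.log(match_probability(freqs)))


biallelic = [0.5, 0.5]            # two equally frequent alleles
triallelic = [1 / 3, 1 / 3, 1 / 3]  # three equally frequent alleles

print(match_probability(biallelic))
print(match_probability(triallelic))
print(loci_needed(biallelic, 1e-10), loci_needed(triallelic, 1e-10))
```

Under these assumptions a triallelic locus has a lower per-locus match probability than a biallelic one, so fewer loci are needed for the same target. Real allele frequencies are rarely equal, so the numbers are indicative only.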

In this project, we present a tool that uses the NCBI public database dbSNP to identify non-biallelic SNPs.

This work was published in FSI Genetics in 2009.

Installation

Install the expat development files:

apt-get install libexpat1-dev

Retrieve the source code and compile the program:

git clone https://github.com/jfjlaros/snp.git
cd snp/src
make

Usage

The program requires a dump of the database in XML format. These files are typically found in the subfolder named genotype of any of the builds hosted on the download site of the NCBI.

For a file named gt_chrXX.xml.gz, use the following command to find the SNP candidates:

  zcat gt_chrXX.xml.gz | ./snp <threshold> > output.txt

The threshold parameter specifies the minimum allele frequency, as a percentage. If it is omitted, the threshold defaults to 0. Increasing the threshold can greatly reduce the amount of output; setting it to 1 or higher is recommended.
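To make the effect of the threshold concrete, here is a minimal sketch of one plausible reading of the description above. It is not taken from the program's source: the function name `passes_threshold`, the dictionary input and the rule "at least three observed alleles at or above the threshold" are all assumptions made for illustration.

```python
def passes_threshold(allele_freqs, threshold=0.0):
    """Return True if at least three alleles are observed with a frequency
    (in percent) at or above `threshold`.

    Assumption: this mirrors the idea of a minimum allele frequency filter;
    the actual program may count or compare differently.
    """
    kept = [
        allele
        for allele, freq in allele_freqs.items()
        if freq > 0 and freq >= threshold
    ]
    return len(kept) >= 3


# Hypothetical SNP with one very rare third allele (frequencies in percent).
example = {"A": 62.0, "C": 37.5, "T": 0.5}

print(passes_threshold(example))       # default threshold 0: rare allele counts
print(passes_threshold(example, 1.0))  # threshold 1: rare allele is ignored
```

With a threshold of 0, any SNP with a third allele observed even once would be reported, which is why a higher threshold shrinks the output.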

Related work

Some notes on allele frequencies on the X and Y chromosomes.

Inspired by this research, we looked into the degradation of methylated nucleotides for similar purposes.
