Can I filter based on chr_n in VCFtools?
8.1 years ago

Hello,

I’m working in VCFtools on a .vcf file and am wondering if there is a command to filter based on chr_n <integer>. I would like to filter out chromosomes that have a chr_n less than 560.

Thank you,

Cherie

SNP VCFtools • 2.2k views

By chr_n I mean the total number of chromosomes counted across all individuals at that site, not the length of the read. For example, in the stats output (--freq), 'CHROM' 78860_1 has an N_CHR of 560 at position 152. I take that to mean the site is represented in every individual (280 individuals x 2 chromosomes). I want to filter the .vcf file so that the resulting file only includes sites that are completely represented in all individuals (such as 78860_1, pos 152). Do you know which command does this? Part of my problem is that some of these 'CHROM' entries (e.g. 78860_1) have SNPs at multiple positions, and only some of those positions have an N_CHR of 560. I would lose good information if I removed the entire CHROM.
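To make the per-site criterion above concrete, here is a minimal Python sketch that keeps only the VCF lines where every individual has a called genotype. It assumes that N_CHR corresponds to the number of non-missing alleles in the GT field, that the file is tab-delimited with a FORMAT column containing GT, and that the threshold is 2 x the number of diploid individuals. The function names `called_allele_count` and `filter_complete_sites` are hypothetical, not part of VCFtools. VCFtools itself documents a `--max-missing` site filter (where 1 means no missing data allowed) that may do the same job directly, so check the manual for your version before relying on a script like this.

```python
def called_allele_count(vcf_line):
    """Count non-missing alleles across all samples on one VCF data line.

    Assumption: this mirrors what the N_CHR column of --freq reports.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")  # raises ValueError if GT is absent
    count = 0
    for sample in fields[9:]:
        parts = sample.split(":")
        genotype = parts[gt_index] if gt_index < len(parts) else "."
        # Treat phased (|) and unphased (/) genotypes the same way.
        alleles = genotype.replace("|", "/").split("/")
        count += sum(1 for allele in alleles if allele != ".")
    return count


def filter_complete_sites(lines, min_n_chr):
    """Yield header lines plus data lines with at least min_n_chr called alleles."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#") or called_allele_count(line) >= min_n_chr:
            yield line
```

For the data described above, the threshold would be 560 (280 individuals x 2), for example `filter_complete_sites(open("input.vcf"), 560)`, where `input.vcf` is a placeholder file name. Because the test is applied per site rather than per CHROM, positions on 78860_1 that are fully genotyped are kept even if other positions on the same CHROM are dropped.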

8.1 years ago

Not VCFtools, but you can use VCFFilterJS: https://github.com/lindenb/jvarkit/wiki/VCFFilterJS

If 'n' is the contig length:

gunzip -c input.vcf.gz | \
java -jar dist/vcffilterjs.jar -e 'header.getSequenceDictionary().getSequence(variant.getContig()).getSequenceLength() > 560'

If 'n' is a chromosome number (chr1, chr2, ...):

gunzip -c input.vcf.gz | \
java -jar dist/vcffilterjs.jar -e 'var n; try { n=parseInt(variant.getContig()); } catch(err) {n=9999;} n<560; ' 