Extract SNPs from a VCF file that lie in genes (or +/- 5 kb flanks) defined in a GFF file
4.2 years ago
User000 ▴ 690

Hello,

I have a VCF file with SNPs and a GFF file with genes. I would like to extract only the SNPs that lie within the genes or their +/- 5 kb flanking regions. I could artificially add/subtract 5 kb in the GFF file and then follow the answer at this link: https://www.biostars.org/p/333734/, but is there a built-in option to do so? Also, is it possible to append the gene name and positions to the VCF file?

vcf gff bedtools • 1.1k views
4.2 years ago

I'd probably just add the 5kb to the gff via bedtools slop, then intersect with bedtools intersect -wa -wb, and you'd basically have everything you want.
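
For anyone who wants to see what the slop-then-intersect idea amounts to, here is a minimal pure-Python sketch of the same logic (it is not bedtools itself). With bedtools, the padding step would look something like bedtools slop -i genes.gff -g genome.txt -b 5000, where genes.gff and genome.txt (a chromosome-sizes file) are placeholder names. In the sketch, the function names, the sample records, and the choice to keep only rows whose feature type is "gene" are assumptions for illustration. Unlike slop, it does not clip the padded end at the chromosome length.

```python
FLANK = 5000  # flank size from the question (5 kb)

def padded_genes(gff_lines, flank=FLANK):
    """Yield (chrom, start, end, attributes) for GFF 'gene' rows, widened by flank."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # assumption: only 'gene' features are of interest
        start = max(1, int(fields[3]) - flank)  # GFF is 1-based, so clamp at 1
        end = int(fields[4]) + flank            # not clipped at chromosome end
        yield fields[0], start, end, fields[8]

def snps_in_genes(vcf_lines, genes):
    """Yield (vcf_record, gene_attributes) for each SNP inside a padded gene."""
    genes = list(genes)
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # header lines are not records
        record = line.rstrip("\n")
        chrom, pos = record.split("\t")[:2]
        for g_chrom, start, end, attrs in genes:
            if chrom == g_chrom and start <= int(pos) <= end:
                yield record, attrs

# Made-up sample data: one gene at 10000-12000, two SNPs.
gff = "chr1\tsrc\tgene\t10000\t12000\t.\t+\t.\tID=geneA\n"
vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t5500\t.\tA\tG\t.\t.\t.\n"
    "chr1\t4000\t.\tC\tT\t.\t.\t.\n"
)

hits = list(snps_in_genes(vcf.splitlines(), padded_genes(gff.splitlines())))
for record, attrs in hits:
    print(record, attrs, sep="\t")
```

Each hit pairs the VCF record with the GFF attribute column, which is where the gene name usually lives, so this roughly mirrors what -wa -wb reports. The nested loop is fine for a sketch but would be slow on large files.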


Yeah, the only downside is that the output will have the start and stop shifted by +/- 5 kb. Anyway, I decided to add only the gene information to the VCF file and not the positions. Thank you :)


Why is the header missing from the VCF file that intersect outputs?


Bedtools doesn't preserve headers by default. You can try the -header option of bedtools intersect (the header has to be in a format bedtools recognizes), or you can write the header to your output file first and then append the bedtools results to that file. Or just open the file and tack the header on afterwards, which is often easiest if you want to keep the headers from both files after an intersect.
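
The "write the header first, then append the results" approach can be sketched in Python as below. This is only an illustration: the function name and the sample lines are made up, and it assumes the header lines are exactly those starting with "#".

```python
def with_vcf_header(vcf_lines, kept_records):
    """Return the original VCF header lines followed by the kept records."""
    header = [line.rstrip("\n") for line in vcf_lines if line.startswith("#")]
    return header + [rec.rstrip("\n") for rec in kept_records]

# Made-up sample VCF content.
vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t5500\t.\tA\tG\t.\t.\t.\n",
    "chr1\t4000\t.\tC\tT\t.\t.\t.\n",
]
kept = [vcf[2]]  # pretend only the first SNP survived the intersect

out_lines = with_vcf_header(vcf, kept)
print("\n".join(out_lines))
```

In practice the kept records would come from the intersect output rather than from a hand-picked list.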
