Question: Getting The VCF Consensus For Only A Segment Of The Genome
6.5 years ago
United States
wrote:

I have numerous VCF files for an HIV genomic region, and I want to retrieve the consensus sequence for just a ~50 nt region of the alignment so I can feed it to a TF binding prediction program. Normally I'd just use vcf-consensus to output the whole region and then slice out the part I want. However, this region contains numerous important indels of varying sizes, and because they shift the coordinates of the output, I don't want to use that sort of method.

I can't seem to find any flag in the vcf-consensus tool that lets me limit the output to a region.

Any suggestions?

modified 6.5 years ago by swbarnes2 (8.6k) • written 6.5 years ago by Will (4.5k)
6.5 years ago
United States
wrote:

Aside from awk? You can use BEDTools to intersect the VCF with a .bed file that contains your region.
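For illustration, here is a minimal Python sketch of that kind of region filter. It is not what BEDTools does internally: the function name `filter_vcf_lines` and the contig name `HXB2` are hypothetical, and the sketch assumes a plain-text, tab-separated VCF and matches on POS only, ignoring how far the REF allele extends.

```python
def filter_vcf_lines(lines, chrom, start, end):
    """Keep header lines plus records whose POS lies in [start, end].

    Coordinates are 1-based and inclusive, as in the VCF POS column.
    Assumption: a record is selected by POS alone, not by its REF span.
    """
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)  # keep meta-information and column header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            kept.append(line)
    return kept


# Hypothetical example records; "HXB2" is only a placeholder contig name.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "HXB2\t100\t.\tA\tG\n",
    "HXB2\t460\t.\tC\tCTT\n",
    "HXB2\t900\t.\tG\tA\n",
]
region_records = filter_vcf_lines(example, "HXB2", 450, 500)
```

With these example records, only the two header lines and the insertion at position 460 are kept.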

written 6.5 years ago by swbarnes2 (8.6k)

I can easily filter the VCF to get my desired region. But if there are insertions, wouldn't bedtools intersect just give me the 50 nt region and not 50 nt + insertions?
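To make the concern concrete, here is a rough Python sketch of building a consensus for a window defined in reference coordinates, so that inserted bases are kept and the output can be longer or shorter than the window. It is not how vcf-consensus works: the function name `region_consensus` is hypothetical, variants are assumed to be already parsed into `(pos, ref, alt)` tuples, every variant is applied as if homozygous, and any variant that overlaps an earlier one is skipped.

```python
def region_consensus(ref_seq, variants, start, end):
    """Return the consensus for reference positions start..end (1-based, inclusive).

    ref_seq  : full reference sequence as a string
    variants : iterable of (pos, ref_allele, alt_allele), pos 1-based as in VCF
    Assumption: every variant is applied; genotypes and filters are ignored.
    """
    out = []
    cursor = start  # next reference position still to be emitted
    for pos, ref, alt in sorted(variants):
        if pos < cursor or pos > end:
            continue  # outside the window, or overlapping an earlier variant
        if ref_seq[pos - 1:pos - 1 + len(ref)] != ref:
            raise ValueError(f"REF allele does not match reference at {pos}")
        out.append(ref_seq[cursor - 1:pos - 1])  # unchanged bases before the variant
        out.append(alt)                          # ALT allele, including inserted bases
        cursor = pos + len(ref)                  # skip the reference bases REF covered
    out.append(ref_seq[cursor - 1:end])          # unchanged tail of the window
    return "".join(out)


# Toy example: a 7 nt window (positions 2..8) with one insertion and one deletion.
toy_ref = "ACGTACGTAC"
toy_variants = [(3, "G", "GTT"), (6, "CG", "C"), (9, "A", "T")]
print(region_consensus(toy_ref, toy_variants, 2, 8))  # prints CGTTTACT
```

In the toy example the window spans 7 reference bases, but the result is 8 bases long: the 2 nt insertion and the 1 nt deletion are both reflected, and the SNP at position 9 is ignored because it lies outside the window.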

modified 9 months ago by RamRS (30k) • written 6.5 years ago by Will (4.5k)

Will, how did you solve this? How did you filter the desired region in a .vcf? Thanks!

written 2.4 years ago by boludopublico (0)
