bcftools view -r issue
1
0
Entering edit mode
7 months ago
ayh • 0

Hi,

I'm trying to extract a region from a VCF file using bcftools view, however no matter what I do it only extracts the whole chromosome, not the region. The chromosome name is farily complex (i.e. not just "chr1"). The command line I'm using is:

bcftools view -r 'gi|417531910|ref|NC_019471.1|,length=62722625':51709128-51711551 input.vcf.gz > output.vcf

Genome used for the VCF is the Oar_3.1 sheep reference.

I've put the chromosome ID in quotes as otherwise the pipes confuse the command line. Including the desired coordinates of the region (51709128-51711551) in the quotes doesn't help.

Any help appreciated!

Matt

bcftools • 561 views
ADD COMMENT
1
Entering edit mode

try to create a BED file with the coordinates and use the option -R

    -R, --regions-file FILE           Restrict to regions listed in FILE

otherwise, try to rename the contigs: with bcftools annotate --rename-chrs

ADD REPLY
0
Entering edit mode
7 months ago

You don't say what your chromosomes are actually called.

As noted in the bcftools documentation, -r takes a comma-separated list of regions. So your command line is asking for

  • All of gi|417531910|ref|NC_019471.1|
  • Bases 51709128 to 51711551 of length=62722625 (which probably is not the name of any of your contigs!)

If this extracts the whole chromosome as you say, then what you need to do is remove the extraneous guff from the comma up to (but not including) the colon.

If this returned no output and your chromosome really is named gi|417531910|ref|NC_019471.1|,length=62722625 then you would have a problem because comma is not a valid character in contig names in VCF files.

ADD COMMENT

Login before adding your answer.

Traffic: 1723 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6