Question: Getting chromosome lengths from a VCF file
0
3.4 years ago by outlier95 • 10
outlier95 wrote:

I have a VCF file that includes variant and invariant sites for every locus. Is there an easy way to obtain the length of each of these?

chromosomes vcf • 1.5k views

Every invariant site? If that's the case, just use this logic: SELECT chr,(MAX(pos)-MIN(pos)) GROUP BY chr
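The SELECT logic above can be sketched in Python without loading the VCF into a database. This is only a sketch: `chrom_spans` and `open_vcf` are illustrative names, not part of any library, and the input is assumed to be a tab-delimited VCF. Note that MAX(pos) - MIN(pos) undercounts by one, so the sketch adds 1, and the result is the span covered by sites, which equals the true contig length only if sites are reported from position 1 through the last base.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def chrom_spans(lines):
    """Return {chrom: (min_pos, max_pos, span)} from VCF lines.

    Equivalent in spirit to: SELECT chr, MAX(pos) - MIN(pos) + 1 GROUP BY chr
    """
    bounds = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        lo, hi = bounds.get(chrom, (pos, pos))
        bounds[chrom] = (min(lo, pos), max(hi, pos))
    # +1 because both end positions are part of the span
    return {c: (lo, hi, hi - lo + 1) for c, (lo, hi) in bounds.items()}
```

Usage would look like `with open_vcf("calls.vcf.gz") as fh: spans = chrom_spans(fh)`, where the file name is a placeholder.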

modified 3.4 years ago • written 3.4 years ago by RamRS • 24k
2
3.4 years ago by swbarnes2 • 6.9k, United States
swbarnes2 wrote:

Check the VCF header. I have lots of .vcf files that list the length of each contig there, in the ##contig lines.
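As a rough sketch of reading those header entries, the function below pulls `ID` and `length` out of `##contig=<...>` lines. The function name is made up for illustration, and it assumes the header actually carries a `length` field; contig lines without one are skipped.

```python
import re

# Match ID=... and length=... only at key positions inside <...>
ID_RE = re.compile(r"[<,]ID=([^,>]+)")
LEN_RE = re.compile(r"[<,]length=(\d+)")


def header_contig_lengths(lines):
    """Return {contig_id: length} from the ##contig lines of a VCF header."""
    lengths = {}
    for line in lines:
        if not line.startswith("##"):
            break  # meta lines end at the #CHROM column header
        if not line.startswith("##contig="):
            continue
        m_id, m_len = ID_RE.search(line), LEN_RE.search(line)
        if m_id and m_len:  # skip contigs that declare no length
            lengths[m_id.group(1)] = int(m_len.group(1))
    return lengths
```

Because the loop stops at the first non-`##` line, it only ever reads the header, even on a very large file.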


Is there a way to get the lengths only for the contigs with reads mapped? I can edit my initial question if necessary.

modified 3.4 years ago • written 3.4 years ago by outlier95 • 10

A VCF doesn't really care about your read mapping; it's well past that point. If you want information about the reference genome, look there, or maybe at the SAM header.

written 3.4 years ago by karl.stamm • 3.5k

Well, the contig lengths in the VCF header that swbarnes2 is referring to cover all the reference contigs, regardless of whether reads supported a given contig (at least in my VCF file). I just want the lengths of the contigs that appear in the CHROM column.

modified 3.4 years ago • written 3.4 years ago by outlier95 • 10
2

You're saying there are more contigs in the VCF header than in your CHROM column? Then you can trust the VCF header and just use the subset you're interested in. Either way, going to an earlier stage is less error-prone: look at the reference genome itself, or at the SAM header, for that kind of information. The VCF probably copied the data from there.
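Taking that subset could look roughly like the sketch below: one pass over the file that collects header lengths and the contigs actually present in the CHROM column, then reports only the latter. The function name is hypothetical, and a contig with no header length comes back as `None` rather than a guess.

```python
import re

ID_RE = re.compile(r"[<,]ID=([^,>]+)")
LEN_RE = re.compile(r"[<,]length=(\d+)")


def lengths_of_used_contigs(lines):
    """Return {chrom: header_length_or_None} for contigs seen in CHROM."""
    header = {}   # every contig declared in the header
    used = []     # contigs seen in records, in first-seen order
    seen = set()
    for line in lines:
        if line.startswith("##contig="):
            m_id, m_len = ID_RE.search(line), LEN_RE.search(line)
            if m_id and m_len:
                header[m_id.group(1)] = int(m_len.group(1))
        elif line.startswith("#") or not line.strip():
            continue  # other header lines and blank lines
        else:
            chrom = line.split("\t", 1)[0]  # CHROM is the first column
            if chrom not in seen:
                seen.add(chrom)
                used.append(chrom)
    return {c: header.get(c) for c in used}
```

The same subset idea applies if the lengths are taken from the reference or the SAM header instead; only the source of the `header` dictionary changes.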

written 3.4 years ago by karl.stamm • 3.5k