convert vcf contigs
9.0 years ago
alex ▴ 250

I have a VCF file whose contigs are named 1, 2, 3, 4, etc. I would like to rename them to chr1, chr2, chr3, chr4, etc. for some downstream analysis. Is there a recommended way of doing this? Thanks!

vcf • 2.8k views
9.0 years ago
Ram 43k

Please take a look at this post: Changing Chromosome Notation On Vcf

A simple regex should suffice for the variant lines, but the header needs more care: you will need to change the location of the reference and the IDs on the contig lines.
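For the contig lines in the header, something along these lines might work. This is only a minimal sketch in Python, assuming the header uses the usual `##contig=<ID=...,...>` layout; the function name `prefix_contig_header` and the `prefix` parameter are made up for illustration. It does not touch the `##reference` line, and it does not handle names like MT, which chr-prefixed references often call chrM.

```python
import re

# Captures the ID value of a header contig line,
# e.g. ##contig=<ID=1,length=249250621>
CONTIG_ID = re.compile(r"^(##contig=<ID=)([^,>]+)")


def prefix_contig_header(line, prefix="chr"):
    """Add `prefix` to the ID of a ##contig line; return other lines unchanged.

    Hypothetical helper, not part of any VCF library.
    """
    def add_prefix(match):
        name = match.group(2)
        if name.startswith(prefix):
            # Already prefixed, leave as-is so the function is safe to re-run.
            return match.group(0)
        return match.group(1) + prefix + name

    return CONTIG_ID.sub(add_prefix, line)
```

Lines that are not `##contig` lines pass through untouched, so it can be applied to every header line in turn.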

EDIT: Or, use this to get around the problem that the reference with contigs 1, 2, 3... might differ from the reference with contigs chr1, chr2, chr3... This might prove safer in the long run.


That change should only involve changing the values from "n" to "chrn" in the header, no? We have about 30 contigs in the header, so it can easily be done manually (it's just the dbSNP file we need to edit).


The sed command can be tailored to change all non-header lines by matching only lines that do not begin with a #. And yes, manually editing the contig and reference lines is a good idea.
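If you would rather not use sed, the same idea (skip anything starting with #, prefix everything else) could be sketched in Python roughly as follows. The names `prefix_variant_line` and `convert` are hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF where CHROM is the first column.

```python
def prefix_variant_line(line, prefix="chr"):
    """Prefix the CHROM column of a variant line; leave header lines alone."""
    # Header and meta lines start with '#'; blank lines are passed through.
    if line.startswith("#") or not line.strip():
        return line
    # CHROM is the first column, so an already-prefixed line starts with prefix.
    if line.startswith(prefix):
        return line
    return prefix + line


def convert(src, dst, prefix="chr"):
    """Copy lines from src to dst, prefixing only the non-header lines."""
    for line in src:
        dst.write(prefix_variant_line(line, prefix))
```

It could be called with two open file handles, for example `convert(open("in.vcf"), open("out.vcf", "w"))`, where the file names are placeholders. The header is copied unchanged, so the contig and reference lines still need to be edited separately.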
