Question: merging .vcf files for mitochondrial SNPs
2.7 years ago by
jmw10 wrote:

I would like to run a PCA on SNPs identified within mtDNA for 7 fish (non-model). The SNPs were generated via the Illumina platform, resulting in separate .vcf files for each fish. I believe that I need to merge the .vcf files, but have run into a few problems, and have read that this can create bias. I am hoping someone can help me out by either recommending a way to run a PCA with separate files or a way to merge them.

Notes on .vcf files: Each .vcf file covers a single fish and identifies SNPs in the mtDNA when that fish was aligned to the mitochondrial genomes of two different species. So, for 7 fish, I have 14 files in total, with the number of SNPs per file ranging from 0 to over 1k. These files do not have chromosome numbers, but instead list the reference sequence code.

Can someone please recommend an approach? I have browsed the forums looking for an answer, but quite often people doing this are not working with mtDNA and are working with humans. Any help would be appreciated.


2.7 years ago by
Philadelphia, PA
Jeremy Leipzig19k wrote:

Just convert the GT calls from your VCFs into one big table and run your PCA on that. You just need a 0 or a 1 for every sample at every position where any sample has a variant.
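A minimal sketch of how that table could be built, assuming single-sample VCFs with a GT field and one table per reference species (positions from the two references are not comparable, so the two sets of 7 files would be handled separately). The function names, the sample names, and the reference code "refA" are made up for illustration. Note the assumption that a site missing from a fish's VCF is scored 0; if it is actually missing because of low coverage, that is one place bias can creep in.

```python
def parse_vcf_variants(lines):
    """Collect (chrom, pos, ref, alt) keys carrying a non-reference GT call.

    `lines` is any iterable of VCF lines, e.g. an open file handle.
    """
    variants = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample column, so no GT call to read
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue
        gt = fields[9].split(":")[fmt_keys.index("GT")]
        alt_alleles = alt.split(",")
        # GT may be haploid ("1") or diploid-style ("0/1", "1|1").
        for allele in gt.replace("|", "/").split("/"):
            if allele.isdigit() and int(allele) > 0:
                variants.add((chrom, pos, ref, alt_alleles[int(allele) - 1]))
    return variants


def build_matrix(sample_to_variants):
    """Return (samples, sites, matrix) with matrix[i][j] = 1 if sample i has site j."""
    samples = sorted(sample_to_variants)
    sites = sorted(set().union(*sample_to_variants.values()))
    matrix = [
        [1 if site in sample_to_variants[s] else 0 for site in sites]
        for s in samples
    ]
    return samples, sites, matrix


# Toy input standing in for three per-fish VCFs against one reference.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample"
toy_vcfs = {
    "fish_a": [header, "refA\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1"],
    "fish_b": [
        header,
        "refA\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t1/1:30",
        "refA\t250\t.\tC\tT\t50\tPASS\t.\tGT\t1",
    ],
    "fish_c": [header],  # a file with 0 SNPs still becomes a row of zeros
}

calls = {name: parse_vcf_variants(lines) for name, lines in toy_vcfs.items()}
samples, sites, matrix = build_matrix(calls)
for name, row in zip(samples, matrix):
    print(name, row)
```

With real files, each entry of the dictionary would come from something like `parse_vcf_variants(open(path))`. The resulting rows (one per fish, one column per variable site) are what would go into whatever PCA routine you prefer.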
