I would like to run a PCA on SNPs identified in the mtDNA of 7 fish (a non-model species). The SNPs were called from Illumina sequencing data, which produced a separate .vcf file for each fish. I believe I need to merge the .vcf files, but I have run into a few problems doing so, and I have read that merging can introduce bias. I am hoping someone can recommend either a way to run a PCA on the separate files or a way to merge them.
Notes on the .vcf files: each file holds the mtDNA SNPs for one fish, called after aligning its reads to the mitochondrial genome of one of two different reference species. So, for 7 fish, I have 14 files in total, with the number of SNPs per file ranging from 0 to over 1,000. The files do not have chromosome numbers; the first column lists the reference sequence code instead.
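To make the question concrete, here is a minimal sketch of one possible merge: a fish-by-site 0/1 matrix that a PCA routine could take as input. It is not a recommended pipeline. The sample names, the reference code `REF_A`, and the helper names `read_snps` and `build_matrix` are all hypothetical. It also rests on a strong assumption, which is exactly the bias concern mentioned above: a site missing from a fish's file is scored as 0 (reference allele), even though it may simply have had no coverage. Files aligned to different references would need separate matrices, since positions are not comparable across reference genomes.

```python
def read_snps(vcf_lines):
    """Collect (CHROM, POS, ALT) keys from the lines of one single-sample VCF."""
    snps = set()
    for line in vcf_lines:
        # Skip blank lines and VCF header lines ("##..." and "#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        snps.add((chrom, pos, alt))
    return snps


def build_matrix(per_sample_snps):
    """Turn {sample: set of SNP keys} into (samples, sites, matrix).

    matrix[i][j] is 1 if sample i has SNP j in its file, else 0.
    ASSUMPTION: an absent SNP is treated as the reference allele,
    which is wrong wherever the site was simply not covered.
    """
    samples = sorted(per_sample_snps)
    sites = sorted(set().union(*per_sample_snps.values()))
    matrix = [
        [1 if site in per_sample_snps[name] else 0 for site in sites]
        for name in samples
    ]
    return samples, sites, matrix


# Hypothetical example: three fish aligned to the same reference.
header = ["##fileformat=VCFv4.2\n", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"]
fish1 = header + ["REF_A\t100\t.\tA\tG\t50\tPASS\t.\n", "REF_A\t250\t.\tC\tT\t60\tPASS\t.\n"]
fish2 = header + ["REF_A\t250\t.\tC\tT\t55\tPASS\t.\n"]
fish3 = header  # a file with zero SNPs

samples, sites, matrix = build_matrix(
    {"fish1": read_snps(fish1), "fish2": read_snps(fish2), "fish3": read_snps(fish3)}
)
for name, row in zip(samples, matrix):
    print(name, row)
```

Each row of `matrix` is one fish and each column one SNP, so the matrix could be passed to any standard PCA function once it has been built separately for each reference.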
Can someone please recommend an approach? I have browsed the forums for an answer, but most of the threads I found deal with human data rather than mtDNA. Any help would be appreciated.