Question: Genetic PCA from poolseq genotype file
AP90 wrote (20 months ago):

Hello,

I have a sync file extracted with the PoPoolation2 software that looks like this:

Contig    Position  Ref    Pool1           Pool2           Pool3           Pool4
SCAFOLD1    11722   A   330:0:0:0:0:0   315:0:0:0:0:0   334:0:0:0:0:0   111:0:0:0:0:0
SCAFOLD1    11723   T   0:330:0:0:0:0   0:316:0:0:0:0   0:334:0:0:0:0   0:111:0:0:0:0
SCAFOLD1    11725   T   0:327:0:0:0:0   0:314:0:0:0:0   0:329:0:0:0:0   0:111:0:0:0:0
SCAFOLD1    11726   A   330:0:0:0:0:0   314:0:0:0:0:0   332:0:0:0:0:0   111:0:0:0:0:0

Each cell contains the allele counts for each base in one pool, in the order A:T:C:G:N:del (e.g. 330:0:0:0:0:0 means 330 reads supporting A and none for anything else).

I would like to perform a genetic PCA on this dataset, just as one would on a 012 file extracted with VCFtools. I guess one could convert the sync file to a single value per cell by summing the counts of all non-reference alleles, and work from that.
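To make the conversion idea concrete, here is a minimal Python sketch of it, not a tested pipeline. It assumes whitespace-separated columns and the A:T:C:G:N:del field order of the sync format; the function names `parse_cell` and `non_ref_values` are made up for illustration. N and deletion counts are ignored when computing coverage, which is a choice of this sketch rather than a rule.

```python
# Field order in a PoPoolation2 sync cell: A:T:C:G:N:del
BASES = ("A", "T", "C", "G")


def parse_cell(cell):
    """Return a dict of A/T/C/G counts from one sync cell, ignoring N and del."""
    counts = [int(x) for x in cell.split(":")]
    return dict(zip(BASES, counts[:4]))


def non_ref_values(line):
    """Return (contig, position, counts, freqs) for one sync line.

    counts[i] is the number of non-reference reads in pool i; freqs[i] is
    that count divided by the A+T+C+G coverage (None if coverage is zero).
    """
    fields = line.split()
    contig, position, ref = fields[0], int(fields[1]), fields[2].upper()
    counts, freqs = [], []
    for cell in fields[3:]:
        by_base = parse_cell(cell)
        coverage = sum(by_base.values())
        non_ref = coverage - by_base.get(ref, 0)
        counts.append(non_ref)
        freqs.append(non_ref / coverage if coverage else None)
    return contig, position, counts, freqs


# First data row of the example above (monomorphic, so everything is zero).
line = "SCAFOLD1 11722 A 330:0:0:0:0:0 315:0:0:0:0:0 334:0:0:0:0:0 111:0:0:0:0:0"
print(non_ref_values(line))
```

Since the pools differ in coverage (about 330 reads versus 111 here), the frequency is probably the safer value to carry forward than the raw count.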

Does anybody have experience with that? Any opinion/comment would be very helpful.

Thanks!


Hi, did you find out how to perform the PCA? I also obtained a sync file using PoPoolation2 and a VCF using GATK, and I have been trying to run a PCA on either file, but with no success yet. Thank you,

Natalia

Comment written 4 months ago by ndiaz0

I managed it by following your method. Thanks a million!

Comment written 11 weeks ago by ndiaz0
AP90 wrote (4 months ago):

Hi Natalia,

Yes, I did manage to run a PCA using the sync file. I first calculated the frequency of the minor (or major) allele at every SNP, then ran a PCA in R using prcomp. Instead of the frequency, you can also just use the total count of the minor or major allele. The same approach works on a 012 file.
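For anyone not working in R, a rough Python/NumPy equivalent of the PCA step might look like the sketch below. It is a stand-in for prcomp, not the code I ran: it centres each SNP without scaling (matching prcomp's defaults of center = TRUE, scale. = FALSE) and takes the SVD. The function name `pca` and the frequency values are invented for illustration, with pools as rows and SNPs as columns.

```python
import numpy as np


def pca(freqs, n_components=2):
    """PCA of a pools-by-SNPs matrix via SVD of the column-centred data."""
    X = np.asarray(freqs, dtype=float)
    X = X[:, X.std(axis=0) > 0]     # drop SNPs that do not vary among pools
    X = X - X.mean(axis=0)          # centre each SNP, no scaling
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # pool coordinates on the PCs
    explained = S**2 / np.sum(S**2)                   # proportion of variance per PC
    return scores, explained[:n_components]


# Made-up minor allele frequencies: 4 pools (rows) x 5 SNPs (columns).
freqs = [
    [0.10, 0.50, 0.00, 0.30, 0.20],
    [0.12, 0.48, 0.00, 0.28, 0.25],
    [0.60, 0.10, 0.00, 0.70, 0.20],
    [0.58, 0.15, 0.00, 0.75, 0.22],
]
scores, explained = pca(freqs)
print(scores)
print(explained)
```

Monomorphic sites like the ones in the example rows of the question carry no information for the PCA, which is why the sketch drops them before centring. Note that raw allele counts also reflect coverage differences between pools, so frequencies are usually easier to interpret.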

Hope that helps! AP
