ADMIXTURE and R: meaning of the colors on a barplot when studying population ancestry, K value
9.0 years ago
dirranrak ▴ 20

Hi all, I moved on to ADMIXTURE after using PLINK and got K=5, so I plotted the data in R using col=5. I get a barplot in which each bar represents an individual, but how can I tell what each color stands for (the red color, for example) and which population it corresponds to? The *.5.Q file only has rows for the individuals and columns for the proportions. Thank you.

R gene • 19k views

The key to generating the canonical admixture plots in R is to use a stacked barplot with the coancestry coefficients. Is this what you're doing? It's not clear.


This is the plot from myfile.5.Q made with R. I am trying to find out what each color means (an ancestral population, probably? And if so, which population?).

Here is the way I did it:

> tbl=read.table("trialmergedexcludedrs3926405,rs365066AfricaKhoisanexcludedsnpwith0phenotypegeno0.05hwe0.001Batwa_Kiga.913651pos.230samples.PNAS2014_trial1_flip.5.Q")
> barplot(t(as.matrix(tbl)), col=rainbow(5),
+ xlab="Individual", ylab="Ancestry", border=NA)
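In a call like the one above, the colors follow the column order of the .Q file: because the matrix is transposed, the first color of `rainbow(5)` is column V1, the second is V2, and so on. ADMIXTURE does not name these clusters, so a legend can only say "Cluster 1" to "Cluster K". A minimal sketch of adding such a legend, using a toy table with invented values in place of the real .5.Q file:

```r
# Toy stand-in for read.table("myfile.5.Q"): 6 individuals, K = 5 clusters.
# The numbers are invented for illustration; each row sums to 1 like a real .Q file.
set.seed(1)
raw <- matrix(runif(6 * 5), nrow = 6)
tbl <- as.data.frame(raw / rowSums(raw))   # columns are named V1..V5

K <- ncol(tbl)
cols <- rainbow(K)

# barplot() stacks the rows of the transposed matrix, so color i is column Vi.
barplot(t(as.matrix(tbl)), col = cols, border = NA,
        xlab = "Individual", ylab = "Ancestry",
        legend.text = paste("Cluster", seq_len(K)),
        args.legend = list(x = "topright", bty = "n", cex = 0.7))

# Color-to-column lookup table.
key <- data.frame(column = colnames(tbl), color = cols)
print(key)
```

The lookup table only ties a color to a column of the .Q file. Deciding which population a column represents still needs population labels for the individuals.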


Hi, what I am trying to do with PLINK, ADMIXTURE, and R is to find the ancestry (or ancestries) of some populations in my genomic data.

I just got the proportions for each individual with ADMIXTURE and plotted these files in R for K=2 to K=5. The plots have a different number of colors depending on the K value: the first one has 2 colors and the last one has 5. My question now is, how can I know the meaning of each color in each individual's proportions?

Thank you so much.
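ADMIXTURE is unsupervised, so a color is an inferred cluster, not a named population. The rows of a .Q file are in the same order as the individuals in the .fam file given to ADMIXTURE, so one common way to interpret the colors is to attach known population labels to the rows, sort the bars by population, and look at the mean proportions per population. The sketch below assumes a vector `pop` of labels from your own sample sheet (hypothetical here) and uses invented toy values with K = 3 for brevity:

```r
# Toy stand-ins (invented values). In practice:
#   q   <- read.table("myfile.5.Q")
#   fam <- read.table("myfile.fam")   # same row order as the .Q file
# 'pop' is a hypothetical vector of population labels, one per individual,
# taken from your own sample sheet.
set.seed(42)
pop <- rep(c("PopA", "PopB", "PopC"), each = 4)
K <- 3
raw <- matrix(runif(length(pop) * K, 0, 0.2), ncol = K)
# Make each toy population dominated by one cluster.
raw[cbind(seq_along(pop), as.integer(factor(pop)))] <- 1
q <- as.data.frame(raw / rowSums(raw))

# Mean ancestry per population: shows which cluster (color) dominates where.
pop_means <- aggregate(q, by = list(pop = pop), FUN = mean)
print(pop_means)

# Order individuals by population and label the groups under the bars.
ord <- order(pop)
mids <- barplot(t(as.matrix(q[ord, ])), col = rainbow(K), border = NA,
                space = 0, ylab = "Ancestry", axisnames = FALSE)
group_centers <- tapply(mids, pop[ord], mean)
axis(1, at = group_centers, labels = names(group_centers), tick = FALSE)
```

Cluster numbering is arbitrary and can change between K values and between runs, so the same color at K=2 and K=5 need not be the same component.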


Dear Dirrank,

Did you find the answer to your question? I'm in the same situation.

Thank you.


Dear edison.vazquez, you can try a new R package called BITE. We have implemented two different functions to plot ADMIXTURE results.


Hello, did you find something?

Thank you.

9.0 years ago

I just finished writing a set of functions for admixture plotting.

After sourcing the file, point the R function at the directory containing the *.Q and *.fam files:

plots<-plot.admixture("/Users/zev/Documents/projects/human_diversity/admixture/")


To access subplot 5:

plots$`5`

@Zev.Kronenberg I tried using the function but ran into the following error (the last line). Is it something that I'm doing wrong?

plots<-plot.admixture("/my/directory/admixture_linux-1.3.0/")
Loading required package: reshape2
Loading required package: grid
Loading required package: ggplot2
Find out what's changed in ggplot2 at http://github.com/hadley/ggplot2/releases.
Loading required package: plyr
Loading required package: RColorBrewer
Error in file(file, "rt") : invalid 'description' argument

Looks like the function is having trouble finding the files. What are the file extensions?

Standard output files of ADMIXTURE plus a .fam file (the folder contains File.1.Q to File.10.Q and File.fam). The folder also has the results of an ADMIXTURE analysis of a different dataset. Could that be a problem? Is there a way to specify a particular input file?

Right now, no. It should work if you separate the runs into different folders. Feel free to change the R code to specify an input.

Thanks Zev!

I moved the results to a different directory and now I get this error:

In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, : duplicated levels in factors are deprecated

How can I resolve this?

Can you send me a test dataset?

@Zev, here is the test data.

@Zev.Kronenberg The code worked once but gives an error when I run it a second time, and I do not understand why. I plotted the same files, and it ran fine the first time.

Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 0, 190
Error: with piece 1:

I am experiencing the same error. Could somebody please explain? I tried to combine the .P and .Q files using the cbind function.

Hi Zev, Thank you for the code. I was busy learning to code other things. But now when I try to run your code in R, I get the error below right at the beginning. And if I remove the quotes when specifying the directory, the rest of the code becomes a comment.

plot.admixture<-function("~/Documents/admixture_macosx-1.23")
Error: unexpected string constant in "plot.admixture<-function('~/Documents/admixture_macosx-1.23'"

Do you have an idea why this happens?

Hello, you need to add a trailing slash to your directory, like this: "~/Documents/admixture_macosx-1.23/"

Dear Zev, Thank you. I managed to get the admixture plot using your script. However, the x-axis shows the individual sample names. Would it be possible to show the population group as well? The .fam file looks like the lines below, where the last column gives the group.

AC065 AC065 0 0 0 -9 G1
AM236B1 AM236B1 0 0 0 -9 G1
BB011 BB011 0 0 0 -9 G2
BC1021 BC1021 0 0 0 -9 G3
BC1026 BC1026 0 0 0 -9 G3

The current script takes the sample names from the second column for the x-axis. Could it also include the group names, so that both the sample name and the group it belongs to are visible?

Hi Zev, Thank you for your code. The function plot.admixture does not get saved as a function in the R environment after running the loop.

plot.admixture<-function("/Users/grubent/Documents/Admixture/"){

Therefore, when I attempt to plot or save the results as you describe here:

results <- plot.admixture("directory/")

the function plot.admixture is not found in my environment. This may partly be due to the fact that I modified the code here:

factor(datframe$Name, levels = datframe$Name, ordered = TRUE) --> factor(datframe$Name, levels = unique(datframe$Name), ordered = TRUE)

because I was getting this error in the loop:

Error in `levels<-`(`*tmp*`, value = as.character(levels)) : factor level [2] is duplicated | Called from: factor(datframe$Name, levels = datframe$Name, ordered = TRUE)


Would you be able to help or provide guidance as to what the issue is?
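The `unexpected string constant` error reported in this thread is what R gives when a path is written where a parameter name is expected: `function(...)` defines a function, so the parentheses must hold argument names, and the path is passed later, when the function is called. A minimal sketch of the difference, using a hypothetical helper `list_q_files` (not the plot.admixture code) and a temporary demo directory, plus the `unique()` fix for duplicated factor levels mentioned above:

```r
# Hypothetical helper, only to show definition vs. call; this is not the
# plot.admixture code from the answer.
list_q_files <- function(dir) {          # 'dir' is a parameter name, not a path
  list.files(dir, pattern = "\\.Q$", full.names = TRUE)
}

# Demo directory with two empty .Q files so the call can run anywhere.
demo_dir <- file.path(tempdir(), "admixture_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("File.2.Q", "File.3.Q")))

q_files <- list_q_files(demo_dir)        # the path is supplied at call time
print(basename(q_files))

# Duplicated sample names: factor levels must be unique, hence unique().
sample_names <- c("AC065", "BB011", "AC065")
f <- factor(sample_names, levels = unique(sample_names), ordered = TRUE)
print(levels(f))
```

With `unique()`, individuals that share a name end up on the same factor level, so it is probably worth checking the .fam file for repeated sample IDs as well.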


Hello, I am trying to follow the GitHub link you provided, but it is complicated from a beginner's perspective. I have run the ADMIXTURE analysis with different K values. Now I have the original .bed, .bim, and .fam files along with the .P and .Q files for each analysis. reshape2 is not compatible with R 4.2.2, but tidyr can do the same job. Can you please explain in a simple way how to make the stacked admixture plot? Thank you in advance.
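One possible way to do this without reshape2 is to let `tidyr::pivot_longer` do the wide-to-long step and draw the stacked bars with ggplot2. This is a sketch under assumptions, not the code from the answer above: it needs the tidyr and ggplot2 packages installed, and the table and sample IDs are invented toy values standing in for a real .Q file.

```r
library(tidyr)
library(ggplot2)

# Toy stand-in for a .Q file (invented values); in practice:
#   q <- read.table("myfile.3.Q")
set.seed(7)
raw <- matrix(runif(8 * 3), ncol = 3)
q <- as.data.frame(raw / rowSums(raw))
q$sample <- paste0("ind", seq_len(nrow(q)))   # hypothetical sample IDs

# Wide -> long: one row per (sample, cluster) pair.
long <- pivot_longer(q, cols = -sample,
                     names_to = "cluster", values_to = "proportion")

# Keep the samples in file order on the x-axis.
long$sample <- factor(long$sample, levels = unique(q$sample))

p <- ggplot(long, aes(x = sample, y = proportion, fill = cluster)) +
  geom_col(width = 1) +
  labs(x = "Individual", y = "Ancestry") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
print(p)
```

For real data, the sample IDs could come from the second column of the .fam file, since its rows are in the same order as the rows of the .Q file.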