ADMIXTURE and R: meaning of the colors on the barplot when studying population ancestry (K value)
5.6 years ago
dirranrak ▴ 20

Hi all, I moved on to ADMIXTURE after using PLINK and ran it with K=5, then plotted the data in R using five colors. I get a barplot in which each bar represents an individual, but how can I tell what each color stands for? For example, which population does the red color correspond to? The *.5.Q file only holds the proportions, with one row per individual and one column per cluster, and no labels. Thank you.


The key to generating the canonical admixture plots in R is to use a stacked barplot of the coancestry coefficients. Is this what you're doing? It's not clear.


This is the plot from myfile.5.Q using R. I am trying to find out what each color means (an ancestral population, probably?) and, if that is the case, which population.

Here is the way I did it:

tbl <- read.table("trialmergedexcludedrs3926405,rs365066AfricaKhoisanexcludedsnpwith0phenotypegeno0.05hwe0.001Batwa_Kiga.913651pos.230samples.PNAS2014_trial1_flip.5.Q")
barplot(t(as.matrix(tbl)), col=rainbow(5),
        xlab="Individual", ylab="Ancestry", border=NA)

Hi, what I am trying to do with PLINK, ADMIXTURE, and R is to find the ancestry (or ancestries) of some populations in my genomic data.

I got the proportions for each individual with ADMIXTURE and plotted these files in R for K=2 to K=5. The plots have a different number of colors depending on the K value, so the first one has 2 colors and the last one has 5. The question now is: how can I know the meaning of each color in each individual's proportions?

Thank you so much.
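
For anyone landing here with the same question: the columns of a .Q file are unlabeled clusters, so a color has no built-in population name, and the column order can differ from one run or K value to the next. One common way to interpret a color is to attach your own population labels to the individuals and see where each cluster is most frequent. Below is a minimal sketch with made-up numbers. The `q` and `pop` objects are assumptions standing in for your own .Q file and labels, and it assumes the rows of the .Q file are in the same order as the .fam file.

```r
# Toy stand-ins for a K=3 .Q file and one population label per individual.
# Assumption: `pop` is supplied by you; ADMIXTURE itself knows no labels.
q <- data.frame(
  V1 = c(0.90, 0.80, 0.10, 0.05, 0.10, 0.20),
  V2 = c(0.05, 0.10, 0.85, 0.90, 0.10, 0.10),
  V3 = c(0.05, 0.10, 0.05, 0.05, 0.80, 0.70)
)
pop <- c("PopA", "PopA", "PopB", "PopB", "PopC", "PopC")

# Mean proportion of each cluster within each labeled population
pop_means <- aggregate(q, by = list(Population = pop), FUN = mean)
print(pop_means)

# For each cluster (column), the population in which it is most frequent
dominant <- as.character(pop_means$Population)[
  apply(pop_means[, -1], 2, which.max)
]
names(dominant) <- names(q)
print(dominant)

# Stacked barplot with one fixed color per cluster, plus a legend
cols <- rainbow(ncol(q))
barplot(t(as.matrix(q)), col = cols, border = NA, names.arg = pop, las = 2,
        ylab = "Ancestry")
legend("topright", fill = cols, cex = 0.7,
       legend = paste0(names(q), " (highest in ", dominant, ")"))
```

The legend only says where a cluster is most common in your sample. Whether it corresponds to a real ancestral population is an interpretation you have to make from the labels and from what you know about the samples.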


Dear Dirrank,

Did you find the answer to your question? I'm in the same situation.

Thank you.


Dear edison.vazquez, you can try a new R package called BITE. We have implemented 2 different functions to plot ADMIXTURE results.


Hello, did you find something?

Thank you.

5.6 years ago

I just finished writing a set of functions for admixture plotting:

https://github.com/jewmanchue/ZevRTricks/blob/master/Addmixture2.plots.R

After sourcing the file, point the R function at the directory containing the *.Q and *.fam files:

plots<-plot.admixture("/Users/zev/Documents/projects/human_diversity/admixture/")

To access subplot 5:

plots$`5`

@Zev.Kronenberg

I tried using the function but ran into the following error (the last line). Am I doing something wrong?

plots<-plot.admixture("/my/directory/admixture_linux-1.3.0/")
Loading required package: reshape2
Loading required package: grid
Loading required package: ggplot2
Find out what's changed in ggplot2 at
http://github.com/hadley/ggplot2/releases.
Loading required package: plyr
Loading required package: RColorBrewer
Error in file(file, "rt") : invalid 'description' argument

Looks like the function is having trouble finding the files. What are the file extensions?


The standard ADMIXTURE output files plus a .fam file (the folder contains File.1.Q to File.10.Q and File.fam). The folder also holds the results of an ADMIXTURE analysis of a different dataset. Could that be a problem? Is there a way to specify a particular input file?


Right now, no. It should work if you separate the runs into different folders. Feel free to change the R code to specify an input.
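
If you do want to adapt the code, one possible approach is to filter the directory listing by a file prefix before reading anything. This is only a sketch: `find_q_files` is a hypothetical helper that is not part of Addmixture2.plots.R, and it assumes the usual `prefix.K.Q` naming.

```r
# Hypothetical helper (not in Addmixture2.plots.R): list only the .Q files
# whose names start with a given prefix, ordered numerically by K.
find_q_files <- function(dir, prefix) {
  files <- list.files(dir, pattern = "\\.[0-9]+\\.Q$", full.names = TRUE)
  files <- files[startsWith(basename(files), paste0(prefix, "."))]
  k <- as.integer(sub(".*\\.([0-9]+)\\.Q$", "\\1", files))
  files[order(k)]
}

# Demo in a temporary directory that mixes two unrelated runs
demo_dir <- file.path(tempdir(), "admixture_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir, c("File.2.Q", "File.10.Q", "File.3.Q", "Other.2.Q", "File.fam")
)))

selected <- basename(find_q_files(demo_dir, "File"))
print(selected)
```

In the demo, only the three `File.*.Q` files are returned, with K=10 sorted after K=3 rather than alphabetically. The returned paths could then be passed to `read.table()` one by one.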


Thanks Zev! 


I moved the results to a different directory and now I get this error:

In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, : duplicated levels in factors are deprecated

How can I resolve this?


Can you send me a test dataset?


@Zev, here is the test data.


@Zev.Kronenberg The code worked once but gave an error when I ran it a second time, and I do not understand why. I was plotting the same files, which ran fine the first time.

Error in data.frame(..., check.names = FALSE) : 
arguments imply differing number of rows: 0, 190
Error: with piece 1:

Hi Zev,

Thank you for the code. I was busy learning to code other things. Now, when I try to run your code in R, I get the error below right at the beginning. And if I remove the quotes when specifying the directory, the rest of the code becomes a comment.

plot.admixture<-function("~/Documents/admixture_macosx-1.23")
Error: unexpected string constant in "plot.admixture<-function('~/Documents/admixture_macosx-1.23'"

Do you have any idea why this happens?


Hello, you need to add a trailing slash to your directory, like this: "~/Documents/admixture_macosx-1.23/"
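
Separately from the slash, the "unexpected string constant" message is what R reports when a quoted string is written where a parameter name belongs, that is, inside `function(...)` on the definition line. The path belongs in the call instead, after the script has been sourced. A small sketch of the difference, using a toy function (`describe_dir` is a hypothetical stand-in, not the real plot.admixture):

```r
# Toy stand-in for plot.admixture; the real function comes from sourcing
# Addmixture2.plots.R and should not be retyped by hand.
describe_dir <- function(dir) {
  # `dir` is a parameter NAME; the actual path is supplied by the caller
  paste("Would read *.Q and *.fam files from", dir)
}

# Calling the function: this is where the quoted path goes
msg <- describe_dir("~/Documents/admixture_macosx-1.23/")
print(msg)

# Putting a string in the definition instead is a syntax error
bad <- try(parse(text = 'f <- function("~/Documents") {}'), silent = TRUE)
print(inherits(bad, "try-error"))
```

So the definition line in the script stays untouched, and only the call carries your directory.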


Dear Zev, thank you. I managed to get the admixture plot using your script. However, the x-axis shows the individual sample names. Would it be possible to show the population group as well? The .fam file looks like this, where the last column gives the group:

`AC065 AC065 0 0 0 -9 G1
 AM236B1 AM236B1 0 0 0 -9 G1
 BB011 BB011 0 0 0 -9 G2
 BC1021 BC1021 0 0 0 -9 G3
 BC1026 BC1026 0 0 0 -9 G3`

The current script puts the sample names from the second column on the x-axis. Could it also include the group names, so that both the sample name and the group it belongs to are visible?
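
One way this could be done, sketched here with base R graphics rather than the ggplot2-based script, is to sort the individuals by the group column and build the bar labels from both columns. The Q values below are made up for illustration, and the seventh column is assumed to be the custom group label shown above.

```r
# The five .fam lines from the post; column 7 is the custom group label
fam <- read.table(text = "
AC065 AC065 0 0 0 -9 G1
AM236B1 AM236B1 0 0 0 -9 G1
BB011 BB011 0 0 0 -9 G2
BC1021 BC1021 0 0 0 -9 G3
BC1026 BC1026 0 0 0 -9 G3", stringsAsFactors = FALSE)

# Made-up K=2 proportions, one row per individual in .fam order
q <- data.frame(V1 = c(0.9, 0.7, 0.5, 0.2, 0.1))
q$V2 <- 1 - q$V1

# Sort individuals by group, then label each bar "sample (group)"
ord <- order(fam$V7, fam$V2)
bar_labels <- paste0(fam$V2[ord], " (", fam$V7[ord], ")")

mids <- barplot(t(as.matrix(q[ord, ])), col = c("steelblue", "orange"),
                border = NA, names.arg = bar_labels, las = 2,
                cex.names = 0.7, ylab = "Ancestry")

# Group name centered above each block of bars
centres <- tapply(mids, fam$V7[ord], mean)
mtext(names(centres), side = 3, at = as.numeric(centres))
```

`barplot()` returns the bar midpoints, which is what lets the group names be placed over the middle of each block.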


Hi Zev,

Thank you for your code. The function plot.admixture is not saved as a function in the R environment after running the loop.

plot.admixture<-function("/Users/grubent/Documents/Admixture/"){

Therefore, when I attempt to plot or save the results as you describe here:

results <- plot.admixture("directory/")

the function plot.admixture is not found in my environment. This may be partly due to the fact that I modified the code here:

factor(datframe$Name, levels = datframe$Name, ordered = TRUE) --> factor(datframe$Name, levels = unique(datframe$Name), ordered = TRUE)

because I was getting this error in the loop:

Error in `levels<-`(`*tmp*`, value = as.character(levels)) : factor level [2] is duplicated | Called from: factor(datframe$Name, levels = datframe$Name, ordered = TRUE)

Would you be able to help or provide guidance as to what the issue is?
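
On the factor error itself: `factor()` requires its `levels` to be unique, so passing a name vector that contains repeats fails in current R versions (older versions only warned). The sketch below uses made-up sample names to show the `unique()` change described above, plus `make.unique()` as an alternative when the repeated names really are different individuals. Which of the two is appropriate depends on why the names repeat in your .fam file.

```r
sample_names <- c("AC065", "BB011", "AC065")  # "AC065" appears twice

# Passing the raw vector as `levels` fails in current R because levels
# must be unique (older R versions issued a deprecation warning instead)
res <- try(factor(sample_names, levels = sample_names), silent = TRUE)
print(inherits(res, "try-error"))

# unique() keeps first-appearance order and drops the repeats, so the
# two "AC065" rows end up sharing one level
f <- factor(sample_names, levels = unique(sample_names), ordered = TRUE)
print(levels(f))

# make.unique() renames repeats so every row keeps its own level
ids <- make.unique(sample_names)
print(ids)
```

With `unique()`, rows that share a name also share an x-axis position, which may merge bars in a plot. With `make.unique()`, each row stays separate.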
