Question: heatmap with plink genome file
13 days ago by
sonia.olaechea90 wrote:

Hi all!

I'm trying to make a heatmap from plink's output for the IBD calculation (the .genome file). This file has several columns, but only three are important to me, so my parsed input format looks like this:

IID1    IID2    PI_HAT
ID1     ID2     0.0163
ID1     ID3     0
ID1     ID4     0.0155
ID2     ID1     0.0096
ID2     ID3     0.0125
ID2     ID4     0.475
...

I would like to make a heatmap of the PI_HAT values for all my IDs (I have hundreds). To my understanding, R asks for a matrix as input for the plot, but I'm not able to parse this input into the correct format (I actually get the heatmap, but all the values are wrong). Could someone please give me some advice? Or, if there's another way to do the plot, I'd be glad to try it as well. Thank you very much in advance!

ibd plink R • 63 views
13 days ago by
zx87547.3k wrote:

The input for heatmap should be a numeric matrix. Your data is in long format, so we need to reshape it to wide and then convert it to a matrix for plotting. See the example:

library(reshape2)

# example data
df1 <- read.table(text = "IID1    IID2    PI_HAT
ID1     ID2     0.0163
ID1     ID3     0
ID1     ID4     0.0155
ID2     ID1     0.0096
ID2     ID3     0.0125
ID2     ID4     0.475", header = TRUE, stringsAsFactors = FALSE)


# convert long to wide
x <- dcast(df1, IID1 ~ IID2, value.var = "PI_HAT")

# convert to a matrix with column and row names
myM <- as.matrix(x[ , -1 ])
row.names(myM) <- x$IID1

# pairs missing from the input become NA after reshaping; here they are set to 0,
# reconsider if this is suitable in your case
myM[ is.na(myM) ] <- 0

# then plot
heatmap(myM)

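If each pair of samples appears only once in the input (as is typically the case in a .genome file), the reshaped matrix is only half filled. A minimal sketch of one way to build a square, symmetric matrix in base R follows. The toy data, the variable names `ids` and `m`, and the choice to put 1 on the diagonal are assumptions made for illustration, not part of the original answer.

```r
# Hypothetical toy data: every pair listed exactly once
df1 <- data.frame(
  IID1   = c("ID1", "ID1", "ID1", "ID2", "ID2", "ID3"),
  IID2   = c("ID2", "ID3", "ID4", "ID3", "ID4", "ID4"),
  PI_HAT = c(0.0163, 0, 0.0155, 0.0125, 0.475, 0.02),
  stringsAsFactors = FALSE
)

# all sample IDs, used for both rows and columns
ids <- sort(unique(c(df1$IID1, df1$IID2)))
m <- matrix(NA_real_, nrow = length(ids), ncol = length(ids),
            dimnames = list(ids, ids))

# fill both triangles: a two-column character matrix indexes by dimnames
m[cbind(df1$IID1, df1$IID2)] <- df1$PI_HAT
m[cbind(df1$IID2, df1$IID1)] <- df1$PI_HAT

# assumption: a sample compared with itself gets PI_HAT = 1
diag(m) <- 1

# symm = TRUE treats the matrix symmetrically; scale = "none" plots raw values
heatmap(m, symm = TRUE, scale = "none")
```

With `scale = "none"` the colours reflect the raw PI_HAT values; the default row scaling of `heatmap` may be preferable in other settings.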

Woooow, thank you very much, it's working! Just so I understand: what is the difference between long and wide format? I hadn't heard about this before.

written 13 days ago by sonia.olaechea90

Reshaping data from wide to long, or from long to wide, is a common problem. Which format you want depends on the task: e.g., ggplot prefers data in long format, while heatmap needs wide (matrix) input. See the StackOverflow posts below for more info:

written 13 days ago by zx87547.3k
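To make the long versus wide distinction concrete, here is a small base R sketch of a round trip between the two formats. The toy values and the names `long`, `wide` and `back` are made up for illustration. It uses `tapply` and `as.data.frame(as.table(...))` rather than the reshape2 functions from the answer, only to stay free of package dependencies.

```r
# Long format: one row per (IID1, IID2) pair, one value column
long <- data.frame(
  IID1   = c("ID1", "ID1", "ID2"),
  IID2   = c("ID2", "ID3", "ID3"),
  PI_HAT = c(0.0163, 0, 0.0125),
  stringsAsFactors = FALSE
)

# Long -> wide: rows are IID1, columns are IID2, cells hold PI_HAT.
# Pairs that are absent from the long table become NA.
wide <- tapply(long$PI_HAT, list(IID1 = long$IID1, IID2 = long$IID2), FUN = sum)
print(wide)

# Wide -> long: one row per cell, then drop the cells that were empty
back <- as.data.frame(as.table(wide), responseName = "PI_HAT",
                      stringsAsFactors = FALSE)
back <- back[!is.na(back$PI_HAT), ]
print(back)
```

The wide table is what `heatmap` expects (after `as.matrix`, if needed), while the long table is the shape that plink writes and that ggplot-style plotting usually takes.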