Question: Exporting data from Genevestigator
bweil2 wrote:

Hello,

I want to export data from Genevestigator, but export is not available unless you pay for it. I am, however, able to export an image as a JPG, GIF, PDF or PNG file. It uses a red-scale (as opposed to grey-scale) to show expression of genes in both anatomical structures and sample data. I was wondering if there is an easy way to read this image and convert each square to an RGB value. Doing this by hand would take too long because there are over a million squares across 4 images! If you have any ideas I'd like to know.

Thank you.

[Image: a piece of one exported image]

modified 9 days ago by RamRS • written 7 weeks ago by bweil2

Can you not use an alternate tool? Messing with the bitmaps of a million squares is bound to cause headaches, if it is feasible at all.

written 7 weeks ago by genomax

Genevestigator is the only tool that can produce this image/data.

written 7 weeks ago by bweil2

"Genevestigator is the only tool that can produce this image/data."

No, it is not the only tool... not by any means. Look at my own function, here: CorLevelPlot - Visualise correlation results, e.g., clinical parameter correlations

modified 10 days ago • written 10 days ago by Kevin Blighe

That looks like some kind of heatmap. If you are able to export a matrix of values from Genevestigator, then you should be able to plot an image like that in R (see "How to plot a heatmap with two different distance matrices for X and Y" or "Heatmap based with FPKM values"). The legends at the top and left can be made to look as you have them, along with the colours you need.

modified 7 weeks ago • written 7 weeks ago by genomax

I really just need a way to decipher this heat map and put it in number form to show correlation between genes. I can do it by hand in an Excel document, but I was wondering if there is a faster way.

written 7 weeks ago by bweil2

"I really just need a way to decipher this heat map and put it in number form to show correlation between genes."

If by that you mean you don't actually have/can't export the matrix that was used for the generation of this figure, then we can't do much.

written 7 weeks ago by genomax

h.mon wrote:

First of all, you should consider either paying for full access or looking for an unrestricted tool that does what you want. Either would be a much better option than processing low-quality images.

As this seemed like a fun little project for a Friday evening, I tried to extract the information from the image you linked. First I had to find an R package to manipulate and extract information from the image. Some googling took me to "The magick package: Advanced Image-Processing in R". It has several examples, so it was perfect to get started.

So on to reading the image, which involved:

  • reading the image into R
  • cropping the image to keep only the heatmap

There was some trial and error involved in getting only the heatmap portion of the figure, but it was pretty quick. This is the code:

library(magick)

# Read the screenshot and display it
Genevestigator <- image_read( "Screen_Shot_2018-07-31_at_6.46.11_PM.png" )
print( Genevestigator )

# Crop a 500x500 region at offset +360+220, then trim the uniform border
gv.heatmap <- image_crop(Genevestigator, geometry_area(500, 500, 360, 220), repage = FALSE)
gv.heatmap <- image_trim(gv.heatmap)
print( gv.heatmap )
image_info( gv.heatmap )
#  format width height colorspace matte filesize density
#1    PNG   242    326       sRGB  TRUE        0   72x72

# Convert the image to an integer array: height x width x channels
gv.heatmap.buf <- as.integer( gv.heatmap[[1]] )
dim( gv.heatmap.buf )
#[1] 326 242   4

The resulting object is an integer array with dimensions height x width x channels. The first three channels hold the red, green and blue values, and the fourth holds the alpha (transparency) channel. It can be viewed as a matrix of RGB values, each cell of the matrix corresponding to one pixel. Examining the top of the array, it is possible to see a "square" of identical RGB values spanning rows 7-15 and columns 5-14, all with the values R=208, G=144, B=144.
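
To make the layout of that array concrete, here is a minimal sketch on a small synthetic array rather than the real screenshot. The 6 x 4 size and the `toy` name are made up for illustration; only the R=208, G=144, B=144 colour comes from the example above.

```r
# Synthetic stand-in for gv.heatmap.buf: height x width x channels (R, G, B, alpha)
toy <- array( 255L, dim = c(6, 4, 4) )   # start with an opaque white image

# Paint a 3 x 2 "square" (rows 2-4, columns 2-3) with R=208, G=144, B=144
toy[2:4, 2:3, 1] <- 208L
toy[2:4, 2:3, 2] <- 144L
toy[2:4, 2:3, 3] <- 144L

# One pixel: the RGB channels at row 3, column 2
pixel <- toy[3, 2, 1:3]

# One image column as a rows x 3 matrix of RGB values, as done for fc.rows
col2 <- toy[, 2, 1:3]

# Pack each row into a hex colour string so identical rows are easy to spot
col2.hex <- rgb( col2[, 1], col2[, 2], col2[, 3], maxColorValue = 255 )
print( col2.hex )
```

Runs of identical strings in `col2.hex` correspond to the blocks of identical rows described above.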

I extracted just one column of pixel values, choosing column 9 because it falls inside the block of columns 5-14 and so should be a good place to read off the first column of colours from the heatmap.

fc.rows <- gv.heatmap.buf[ 1:326, 9, 1:3 ]
fc.rows <- data.frame( fc.rows )

Now, with a small block of code so ugly I am embarrassed to post it here (but I will, nonetheless, otherwise there is no answer), I inferred the RGB values of the first vertical column of squares. The logic is simple and naive, but worked fine for the image you linked: find blocks of at least three identical consecutive lines, as these should correspond to the colour of the heatmap squares.

The code is simple (and ugly): initialize the variable holding the current RGB value to a value outside RGB specs, loop over each row of the RGB values, testing for three equal lines in a row. As the count() function from the plyr package returns the frequencies of equal lines, I use this value to infer when there are three identical lines - when this happens for the first time for each block of identical values, the RGB values are entered into the filteredRGB variable. The first if / else block guarantees just one RGB value per block is filtered, and changes currentRGB when it is different from the current RGB value.

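library(plyr)  # provides count()

filteredRGB <- NULL
currentRGB <- c(256,256,256)  # sentinel outside the valid 0-255 range
# Stop two rows early so that rows i..(i+2) always exist
for ( i in 1:( nrow( fc.rows ) - 2 ) ){
  # Skip rows that belong to the block already being tracked
  if ( sum( currentRGB == fc.rows[i,] ) == 3 ){
    next
  } else { 
    currentRGB <- fc.rows[i,]
  }
  # A single group with frequency 3 means rows i..(i+2) are identical
  freq <- count( data.frame( fc.rows[i:(i+2),] ), vars = c("X1", "X2", "X3") )$freq
  if ( length( freq ) == 1 && freq == 3 ){
    filteredRGB <- rbind( filteredRGB, currentRGB )
  }
}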
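
The same idea can also be sketched with base R's `rle()`, which computes run lengths directly. This is an alternative to the loop above, not the method used in this answer. `find_blocks` and `min.run` are hypothetical names, and the input is assumed to be a rows x 3 matrix or data frame of 0-255 RGB values such as `fc.rows`.

```r
# Hypothetical helper: return one RGB row per run of at least min.run identical rows
find_blocks <- function( rgb.rows, min.run = 3 ) {
  rgb.rows <- as.matrix( rgb.rows )
  # Collapse each row into a single string so rows can be compared with rle()
  key  <- paste( rgb.rows[, 1], rgb.rows[, 2], rgb.rows[, 3], sep = "," )
  runs <- rle( key )
  # Index of the first row of every run
  starts <- cumsum( c( 1, head( runs$lengths, -1 ) ) )
  keep   <- runs$lengths >= min.run
  rgb.rows[ starts[keep], , drop = FALSE ]
}

# Synthetic column: two 3-row blocks separated by a single anti-aliased row
demo <- rbind( c(208, 144, 144), c(208, 144, 144), c(208, 144, 144),
               c(230, 200, 200),
               c(255,   0,   0), c(255,   0,   0), c(255,   0,   0) )
blocks <- find_blocks( demo )
print( blocks )
```

As with the loop, any sufficiently long run is kept, so background or margin pixels in a real image would also show up as blocks and may need to be filtered out.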

Now plot the values to check if extraction worked:

colours <- rgb( filteredRGB[,1:3] / 255 )
n <- nrow( filteredRGB )  # number of extracted squares (23 for this image)
plot( seq_len( n ), rep( 1, n ), col = colours, pch = 16, cex = 3 )

[Image: RGB values]

Yay, it seems it worked! Ok, I extracted just the first column of values, but it shouldn't be hard to extend the code to extract all columns. One idea would be to modify the code above to get "good" index columns for RGB value extraction for each column of the heatmap (midpoint position for each block of identical values in a row), then loop over these indices to extract the values for each row.
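
One way that extension might look, as a sketch on a synthetic array: take the midpoint of every long run of identical pixels along one image row and down one image column, then index the array at those midpoints. `run_midpoints`, `toy` and the 2 x 2 grid are made up for illustration. The sketch assumes the scanned row and column each cross every square, that neighbouring squares are separated by at least one differently coloured pixel, and that there are no long background runs.

```r
# Hypothetical helper: midpoints of runs of at least min.run identical values
run_midpoints <- function( key, min.run = 3 ) {
  runs   <- rle( key )
  ends   <- cumsum( runs$lengths )
  starts <- ends - runs$lengths + 1
  keep   <- runs$lengths >= min.run
  floor( ( starts[keep] + ends[keep] ) / 2 )
}

# Synthetic 2 x 2 heatmap: 4 x 4 pixel squares separated by 1-pixel white lines
toy  <- array( 255L, dim = c(9, 9, 3) )
reds <- matrix( c(208L, 255L, 120L, 60L), nrow = 2 )  # made-up red channel values
for ( r in 1:2 ) for ( k in 1:2 ) {
  rows <- ( (r - 1) * 5 + 1 ):( (r - 1) * 5 + 4 )
  cols <- ( (k - 1) * 5 + 1 ):( (k - 1) * 5 + 4 )
  toy[rows, cols, 1]   <- reds[r, k]
  toy[rows, cols, 2:3] <- 0L
}

# Collapse the three channels into one comparable string per pixel
key <- matrix( paste( toy[, , 1], toy[, , 2], toy[, , 3] ), nrow = dim( toy )[1] )

row.mid <- run_midpoints( key[, 2] )  # scan down pixel column 2
col.mid <- run_midpoints( key[2, ] )  # scan along pixel row 2

# One RGB triplet per heatmap square: squares x squares x 3 array
squares <- toy[row.mid, col.mid, , drop = FALSE]
print( squares[, , 1] )  # red channel of each square
```

On a real screenshot, the margins, legends and any neighbouring squares with identical colours would break these assumptions, so the midpoints would need checking by eye before trusting the extracted matrix.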

written 10 days ago by h.mon