5.2 years ago
Muha0216 • 0
I recently got back my RNA-seq data, and the company that did the sequencing has provided me with a heatmap that can be opened in Java TreeView. In this program, I can use 'find genes' to view individual genes from the full set of DEGs. However, I would like to pick about 8 genes from the list of 500 and compile them into a separate heatmap for presentation purposes.
I have no idea how to do this! Attached is the screenshot. I can find a gene by entering it into the 'find genes' function, but I cannot do this for the other 7 genes and have them all compiled into a separate heatmap.
You can highlight different branching structures in the heatmap dendrogram and then export the gene list (i.e. those genes encompassing the highlighted/selected branch) to a list. From there, you can just subset your data matrix and re-cluster using the smaller number of genes, no?
Thanks again for always replying.
I wish that were the case. I tried the first step of your suggestion by clicking on the branches in the left-most panel, but this only generates a zoomed-in heatmap of the genes under that branch. My intention is to select about 8 genes of interest that are scattered throughout the main heatmap. I tried clicking on the heatmap itself while holding 'ctrl' to enable multiple selection, but even this is not possible.
Screenshot link below, taken after I clicked on some random branches in the left-most panel.
PS: is there any way I can send the TreeView files to you? :/
This appears to be a 'bug' in the program: https://bugs.openjdk.java.net/browse/JDK-8189228
Does that bug report match your findings?
I tried my best to understand the link you sent. I guess it is relevant, as I cannot even get the ctrl multiple-gene selection to work. Is it an unresolvable issue?
I happen to have TreeView-1.1.6 on my computer. If you wanted to make the files available, then I could take a look.
That will help so much. How do I send it to you?
I will PM you.
I can't PM you. Lol. Please let me know.
You can get my email by going to my GitHub page (I think).
Thanks! I have just sent you an email.
Okay - I will take a look when I get time. Today is Sunday!
I can load the files into TreeView, but the problem with 'multi-selecting' different nodes appears to be a general one that has never been resolved. If you search the web, you'll see that some people have written their own Java code that can do it. You may want to try there.
Alternatively, I would just recommend regenerating the TreeView files using the subset of genes, and then loading these back into TreeView. Also, you may want to consider using the R programming language, which offers much greater flexibility than TreeView.
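The subsetting step suggested above can be sketched in a few lines. This is only an illustration in Python (the same idea carries over directly to R), and it assumes a plain tab-delimited matrix with a header row and gene IDs in the first column. The function name subset_matrix, the gene names, and the values are all made up; real TreeView input files may carry extra rows and columns that this sketch does not handle.

```python
import csv
import io


def subset_matrix(tsv_text, genes_of_interest):
    """Return (TSV text with the header plus only the requested genes,
    sorted list of requested genes that were not found).

    Assumption: the first row is a header and the first column holds gene IDs.
    """
    wanted = set(genes_of_interest)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")

    header = next(reader)
    writer.writerow(header)

    found = set()
    for row in reader:
        # Keep a row only if its gene ID is in the list of interest.
        if row and row[0] in wanted:
            writer.writerow(row)
            found.add(row[0])

    missing = sorted(wanted - found)
    return out.getvalue(), missing


# Made-up example data: 4 genes x 3 samples.
example = (
    "gene\tsample1\tsample2\tsample3\n"
    "GeneA\t1.0\t2.0\t3.0\n"
    "GeneB\t4.0\t5.0\t6.0\n"
    "GeneC\t7.0\t8.0\t9.0\n"
    "GeneD\t0.5\t0.6\t0.7\n"
)

subset_text, missing_genes = subset_matrix(example, ["GeneB", "GeneD", "GeneX"])
print(subset_text)
print("Not found:", missing_genes)
```

Reporting the IDs that were not found is a small safeguard: a typo or a different naming scheme in the gene list would otherwise silently shrink the heatmap.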
Hi, thanks Kevin.
That's a bummer. May I know which particular file I need to edit to regenerate the TreeView files with the subset of genes? And is this a complicated process? I guess I may need to do a lot of learning.
How did you generate the original files?
The sequencing company ran the samples and provided these data analyses. :/
If you have the original expression matrix, then you can re-generate everything using R Programming Language. Do you have that?
I tried to look up the meaning of 'expression matrix'. Is it in any of the zip files I sent you?
Otherwise, from my understanding, I do have an Excel file for each pairwise comparison, with each row representing a gene, one column giving the log2 fold change, and another column showing the p-value.
Is this it?
Let me check when I get home later today.
The log2 fold-change and p-value data are not the expression matrix. An expression matrix usually has genes as rows and samples as columns. It contains the normalised expression values for each gene in each sample. If the company has not made that available to you, then that is not very good.
That's bad. Yes, please do check for me. I believe that if it's there, it should be in the cluster zip file I sent you.
May I ask what the format of this expression matrix is? Excel? When you mention columns being the samples, does 'sample' refer to each biological replicate?
The expression values appear to be in the *.data files; however, there is a different .data file for each comparison. Can you see those?
An expression matrix is usually saved as tab- or comma-separated values (TSV or CSV), in plain-text format.
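To make the layout concrete, here is a minimal sketch that parses such a file in Python. It assumes a header row, gene IDs in the first column, and numeric values everywhere else. The function name read_expression_matrix, the column names, and the numbers are invented for illustration; the columns are shown as one per sequenced sample (replicates included), which is the usual convention but should be checked against the actual files.

```python
import csv
import io


def read_expression_matrix(text, delimiter="\t"):
    """Parse delimited text into (sample_names, {gene: [values as floats]}).

    Assumption: first row is a header, first column holds gene IDs.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    samples = header[1:]  # the first header cell labels the gene column
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        matrix[row[0]] = [float(v) for v in row[1:]]
    return samples, matrix


# Made-up CSV: genes as rows, samples as columns.
example_csv = (
    "gene,ctrl_rep1,ctrl_rep2,treated_rep1,treated_rep2\n"
    "GeneA,10.2,9.8,12.5,12.9\n"
    "GeneB,5.1,5.3,2.2,2.0\n"
)

samples, matrix = read_expression_matrix(example_csv, delimiter=",")
print(samples)
print(matrix["GeneB"])
```

A table of log2 fold changes and p-values would not fit this shape: it has one row per gene but its columns are statistics for a comparison, not per-sample expression values.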
I am also not sure why a company would even use TreeView. It is an 'old' program and, as I've implied, there are more modern and flexible programs for clustering.
Hi Kevin, the text box is shrinking, so I will start a new chat below.
Thanks, Kevin, for your latest comment about having found the expression matrix files, although they are in a strange format. So I assume these files cannot be used to generate or edit the cluster?
I have emailed the company for advice on this and am awaiting their reply. Should they not be able to do anything about it, I need to think about the worst-case scenario. Is it easy to generate the expression matrix from scratch? Is generating a cluster an easy thing to do for beginners?
If you want to get a quick introduction to R, I would recommend taking a look at some notes that I compiled with some colleagues in the past: https://github.com/kevinblighe
I do not cover clustering, but you will learn the basics.
For the actual clustering, there is a lot of code on Biostars, again posted by others and me. For example: