Question: Creating a Venn diagram from RNA seq data
0
9 days ago by
Bolesaem0
Bolesaem0 wrote:

First post here. Hello everyone!

This is the first time I have to use bioinformatics, so I apologize, but this is going to be a very basic question. I would like to create a Venn diagram to see which genes are commonly expressed between two or three treatments. I have Excel sheets with p-values and gene IDs of differentially expressed genes between treatments. I have one Excel file for each comparison.

I don't understand how it works: when I upload the file with gene IDs to BioVenn (as an example), it doesn't give me any response (this might be a technical issue). More importantly, I don't understand how I can create a Venn diagram by uploading a single file in which the comparison of two conditions was already done. I thought I should use individual gene counts, but I don't know how to start with that.

Sorry for the dumb question but I am utterly confused.

Thanks!

diagram rna-seq venn basic • 194 views
ADD COMMENTlink modified 7 days ago • written 9 days ago by Bolesaem0
2

A Venn diagram isn't really going to tell you which genes are commonly expressed between conditions, just how many. Are you sure a Venn diagram is what you want? Or are you trying to determine common sets of differentially expressed genes between the different treatment comparisons? Regardless, we need more info as to what you've tried. Did the list you uploaded for each set contain only the IDs, one per row?

ADD REPLYlink written 9 days ago by jared.andrews071.7k

Ok sorry I realized my question wasn't very clear.

I would like to know how many genes are commonly or differentially expressed in different treatments (it's about stem cell biology; I would like to show how similar two different developmental stages are). So I thought of a Venn diagram. I have several Excel tables obtained after DESeq2 analysis, where different pairwise comparisons were done. Yes, each row contains the IDs and the values for each sample (one triplicate per condition). Here is a snapshot of what the table looks like: Snapshot

ADD REPLYlink modified 9 days ago • written 9 days ago by Bolesaem0

Oh, okay. If your only real goal is to show similarity between the two stages, something as simple as mentioning the number of differentially expressed genes between the two stages should really suffice, honestly. Venn diagrams aren't really a great construct for gene expression data, in my mind. Heatmaps generally look better and are immediately interpretable. Most people don't care about the genes that are similarly expressed between two conditions/developmental stages/whatever; the differentially expressed genes are the real meat that you should focus on in most cases.

ADD REPLYlink written 8 days ago by jared.andrews071.7k

Use Venny: http://bioinfogp.cnb.csic.es/tools/venny/

ADD REPLYlink written 9 days ago by Santosh Anand4.3k

or DrawVenn if you have more than 4 lists

ADD REPLYlink written 9 days ago by lieven.sterck3.3k

I have a general question regarding the type of data to be used for generation of the Venn diagram: would it make sense to use the genes identified via Gene Ontology? I have several comparisons being done AvsB and BvsA, or AvsC and CvsA, and BvsC and CvsB.

Could the gene list from these comparisons be used to check how similar samples A, B, and C are to each other?

Thank you very much!

ADD REPLYlink written 7 days ago by Bolesaem0
1

Sounds to me like this is sufficiently different from your original question, so you may want to open a new thread for this.

ADD REPLYlink written 7 days ago by WouterDeCoster35k
1
9 days ago by
lieven.sterck3.3k
VIB, Ghent, Belgium
lieven.sterck3.3k wrote:

It's possible but it will require a few steps.

  • First, as you already did, do the differential expression analysis for each pairwise comparison you are interested in.
  • Take from each of those lists the gene IDs you are interested in, e.g. the down-regulated ones.
  • Put those in a text file (or copy-paste them).
  • Upload them to one of the online tools that draw Venn diagrams.

Of course, make sure that you use the same criteria for selecting the gene IDs from each DEG comparison.
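
As an alternative to uploading the lists to an online tool, the overlaps behind a Venn diagram can be computed directly from the gene ID lists. The following is only a minimal sketch using Python's built-in sets; the function name `venn_regions`, the comparison names, and the gene IDs are all made up for illustration and would be replaced by the IDs selected from each comparison.

```python
def venn_regions(gene_sets):
    """Map each combination of list names to the genes found in exactly those lists."""
    regions = {}
    all_genes = set().union(*gene_sets.values())
    for gene in all_genes:
        # Names of every list that contains this gene.
        members = frozenset(name for name, genes in gene_sets.items() if gene in genes)
        regions.setdefault(members, set()).add(gene)
    return regions


# Hypothetical gene IDs selected from three pairwise comparisons.
gene_sets = {
    "AvsB": {"gene1", "gene2", "gene3", "gene4"},
    "AvsC": {"gene2", "gene3", "gene5"},
    "BvsC": {"gene3", "gene4", "gene6"},
}

regions = venn_regions(gene_sets)

# Print each Venn region with its size and its gene IDs.
for members, genes in sorted(regions.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(" & ".join(sorted(members)), len(genes), sorted(genes))
```

Each printed line corresponds to one region of the diagram, so this gives both the counts a Venn diagram would show and the actual gene IDs in each region.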

ADD COMMENTlink written 9 days ago by lieven.sterck3.3k

Thank you so much! This is actually what I was struggling with (again, I have really basic questions...).

About the criteria for selecting gene IDs, I think this is key in order to have useful information. I would like to understand how similar the different samples are to each other. It's about developmental biology, so I'm trying to understand how each differentiation stage differs from the other ones.

Could a criterion simply be to take the 300 most highly expressed genes in each set? Or is that too naive? I also have some gene ontology comparisons. Is there a way to take the genes that are represented in the most relevant GO entries and use them to see whether they are commonly expressed?

ADD REPLYlink written 8 days ago by Bolesaem0

Yes, that is a criterion, but perhaps not the best one. Personally, I would use a significance threshold, something like all genes with p < 0.05, but I assume you will likely find more and better advice on this specific topic from people within this field.
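
A threshold-based selection like this could be sketched as follows. This is only an illustration with the standard `csv` module: it assumes each results table has been exported to CSV, and the column names `gene_id` and `padj`, the cutoff, and the example rows are assumptions that need to be adapted to the actual files (for example, pointing `p_col` at a raw p-value column instead).

```python
import csv
import io


def significant_ids(handle, id_col="gene_id", p_col="padj", alpha=0.05):
    """Return the set of gene IDs whose p-value column is below alpha."""
    ids = set()
    for row in csv.DictReader(handle):
        value = row[p_col].strip()
        # Skip rows where no p-value is available.
        if value in ("", "NA"):
            continue
        if float(value) < alpha:
            ids.add(row[id_col])
    return ids


# Made-up example table standing in for one exported comparison.
example = io.StringIO(
    "gene_id,log2FoldChange,padj\n"
    "gene1,2.1,0.001\n"
    "gene2,-0.3,0.40\n"
    "gene3,-1.8,0.03\n"
    "gene4,0.9,NA\n"
)

selected = significant_ids(example)
print(sorted(selected))
```

Running the same function with the same settings on every comparison keeps the selection criteria consistent across the lists that go into the diagram.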

ADD REPLYlink modified 8 days ago • written 8 days ago by lieven.sterck3.3k
1
9 days ago by
Charles Warden5.8k
Duarte, CA
Charles Warden5.8k wrote:

If you are talking about overlapping differentially expressed genes, you could try Vennerable or VennDiagram.

If you have more than 3 or 4 comparisons, finding the same method to use for all of the comparisons may be difficult (although not necessarily impossible, or at least clear differences may be OK if you pick a favorite method). You may want to see if you can set up some sort of multivariate comparison with more samples; or, occasionally, what needs to be done is to have different methods for each comparison, even within one paper.

That said, differential expression is a little different from "commonly expressed." For example, you may have a gene with high expression in all samples. In microarrays, you could have some sort of background signal. However, even with differential expression, there will probably be some false negatives among the genes that don't overlap (which is why I think you need to take time to critically assess your data, ideally trying to find some sort of new question to ask and address, and then try to determine a strategy that fairly represents your overall conclusions).

ADD COMMENTlink written 9 days ago by Charles Warden5.8k

What exactly do you mean by finding the "same method"?

Well, the idea would be to identify a footprint or a molecular signature for each condition, and then compare the signatures with each other to understand how similar the conditions are.
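
One possible way to put a number on "how similar" two signatures are is the Jaccard index (shared genes divided by all genes in either list). This measure is not mentioned elsewhere in this thread; it is only a sketch of one option, and the condition names and gene IDs below are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical signature gene lists, one per condition.
signatures = {
    "A": {"gene1", "gene2", "gene3", "gene4"},
    "B": {"gene3", "gene4", "gene5"},
    "C": {"gene4", "gene6"},
}

# Score every pair of conditions.
similarity = {
    (x, y): jaccard(signatures[x], signatures[y])
    for x, y in combinations(sorted(signatures), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 2))
```

A score of 1 means two lists are identical and 0 means they share no genes; how meaningful the scores are depends entirely on how the signature lists were selected.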

ADD REPLYlink written 8 days ago by Bolesaem0
1

In general, I would recommend testing at least edgeR, limma-voom, and DESeq2 for differential expression.

ADD REPLYlink written 8 days ago by Charles Warden5.8k

OK! Yes all samples were analyzed with DESeq2 prior to the analysis!

ADD REPLYlink written 7 days ago by Bolesaem0