Question: How to create a Venn Diagram from data frame and get the list of common Genes for combination?
16 months ago by
Wox440
HUJI
Wox440 wrote:

I have a data frame of differentially expressed genes (DEG). How can I create a Venn diagram from it and get the list of genes shared by each combination of genotypes?

Note: the data frame has NA for some genotypes for some genes. If all the genotypes are NA for a particular gene, that row should be ignored.

Gene        Genotype1     Genotype2     Genotype3     Genotype4
AT1G17400   NA            NA            NA            NA
AT1G09420   NA            0.000800188   0.000116452   0.004017191
AT1G50930   NA            NA            NA            NA
AT1G65960   NA            NA            NA            NA
AT1G09400   NA            NA            NA            NA
AT1G09415   NA            NA            NA            NA
AT1G74730   NA            NA            NA            NA
AT1G75100   0.001639398   0.001578892   6.92E-05      NA
AT1G75100   0.001639398   0.001578892   6.92E-05      NA
AT1G75240   NA            5.60E-05      0.000235329   0.000162115
AT1G14920   NA            NA            NA            NA
AT1G14920   NA            NA            NA            NA
AT1G65510   NA            NA            NA            NA
AT1G75250   NA            NA            NA            NA
AT1G54410   NA            0.000113869   1.25E-05      NA
rna-seq R • 2.4k views
modified 16 months ago by dsull1.8k • written 16 months ago by Wox440
16 months ago by
dsull1.8k
UCLA
dsull1.8k wrote:

The question is vague. What do these numbers mean? What do you consider expressed? Is anything that is not NA considered expressed?

Take a look at the venn function in the gplots package: https://www.rdocumentation.org/packages/gplots/versions/3.0.1.1/topics/venn

The documentation includes an example of how to create Venn diagrams.

I'll go off the assumption that everything non-NA is considered expressed. Say you have everything stored in a dataframe named data. You need to supply lists of non-NA genes that belong to each of your four groups: Genotype1, Genotype2, Genotype3, Genotype4:

library(gplots)  # provides venn()
# Genes with a non-NA value in each genotype column
Genotype1 <- data[!is.na(data$Genotype1),"Gene"]
Genotype2 <- data[!is.na(data$Genotype2),"Gene"]
Genotype3 <- data[!is.na(data$Genotype3),"Gene"]
Genotype4 <- data[!is.na(data$Genotype4),"Gene"]
# venn() takes a named list with one vector of gene IDs per set
input <- list(Genotype1=Genotype1, Genotype2=Genotype2, Genotype3=Genotype3, Genotype4=Genotype4)
venn(input)
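To make the approach above concrete, here is a minimal sketch on a toy data frame built from a few rows of the question's table. It assumes, as stated above, that any non-NA value counts as expressed, and that Gene is stored as character rather than factor. It also shows that a gene that is NA in every genotype never enters any set, so all-NA rows are ignored without an extra filtering step. The plotting call is guarded so the rest runs even if gplots is not installed.

```r
# Toy data: a subset of the rows shown in the question
data <- data.frame(
  Gene      = c("AT1G17400", "AT1G09420", "AT1G75100", "AT1G75240", "AT1G54410"),
  Genotype1 = c(NA, NA, 0.001639398, NA, NA),
  Genotype2 = c(NA, 0.000800188, 0.001578892, 5.60e-05, 0.000113869),
  Genotype3 = c(NA, 0.000116452, 6.92e-05, 0.000235329, 1.25e-05),
  Genotype4 = c(NA, 0.004017191, NA, 0.000162115, NA),
  stringsAsFactors = FALSE
)

# One vector of gene IDs per genotype: genes with a non-NA value
input <- list(
  Genotype1 = data[!is.na(data$Genotype1), "Gene"],
  Genotype2 = data[!is.na(data$Genotype2), "Gene"],
  Genotype3 = data[!is.na(data$Genotype3), "Gene"],
  Genotype4 = data[!is.na(data$Genotype4), "Gene"]
)

# Genes that are NA in every genotype column
all_na <- data$Gene[rowSums(!is.na(data[-1])) == 0]

# None of them ended up in any set
any(all_na %in% unlist(input))

# Number of genes per set
lengths(input)

# Draw the diagram only if gplots is available
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::venn(input)
}
```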
modified 16 months ago • written 16 months ago by dsull1.8k

We could improve data preparation for venn as:

input <- lapply(data[ -1 ], function(i) unique(data$Gene[ !is.na(i) ]))
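A small sketch of this one-liner on toy data, assuming (as in the answer above) that non-NA means expressed, so each column is tested with !is.na(). The toy rows include a repeated gene, as in the question's table, to show what unique() does. data[-1] drops the Gene column, and lapply keeps the remaining column names as the names of the list.

```r
# Toy data with a repeated gene row, as in the question's table
data <- data.frame(
  Gene      = c("AT1G75100", "AT1G75100", "AT1G09420", "AT1G14920"),
  Genotype1 = c(0.001639398, 0.001639398, NA, NA),
  Genotype2 = c(0.001578892, 0.001578892, 0.000800188, NA),
  stringsAsFactors = FALSE
)

# For every genotype column, keep the unique genes with a non-NA value
input <- lapply(data[-1], function(i) unique(data$Gene[!is.na(i)]))

names(input)      # set names come from the column names
input$Genotype1   # the repeated gene appears only once
input$Genotype2
```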
written 15 months ago by zx87549.9k

This is the answer I wanted :) Thank you, dsull. BTW, how can I find which gene IDs went into each common category? And how can I export them?

written 16 months ago by Wox440

Well, if you want to know what genes belong to a certain category, say Genotype 1, you can simply print them out via:

print(Genotype1)

If you want to quickly see something like the genes that belong to Genotype1 and Genotype3 but not to Genotype2 or Genotype4, you can again subset your dataframe as follows:

data[!is.na(data$Genotype1) & !is.na(data$Genotype3) & is.na(data$Genotype2) & is.na(data$Genotype4),"Gene"]

Basically, the exclamation point is the negation symbol so !is.na means the genes that are not-NA whereas is.na means the genes that are NA. The ampersand (&) means AND. Look into subsetting dataframes in R for more details.
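The same subsetting idea can be applied to every combination at once, which also gives something to export. The following is only a sketch in base R on a few rows from the question, assuming non-NA means expressed and that repeated gene rows are exact duplicates that can be dropped. The names expressed, combo, by_combo and the output file name are illustrative choices. Each gene is labelled with the exact set of genotypes in which it is non-NA, which corresponds to one region of the Venn diagram.

```r
# A few rows from the question, including an all-NA gene and a repeated gene
data <- data.frame(
  Gene      = c("AT1G17400", "AT1G09420", "AT1G75100", "AT1G75100",
                "AT1G75240", "AT1G54410"),
  Genotype1 = c(NA, NA, 0.001639398, 0.001639398, NA, NA),
  Genotype2 = c(NA, 0.000800188, 0.001578892, 0.001578892, 5.60e-05, 0.000113869),
  Genotype3 = c(NA, 0.000116452, 6.92e-05, 6.92e-05, 0.000235329, 1.25e-05),
  Genotype4 = c(NA, 0.004017191, NA, NA, 0.000162115, NA),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE where a gene has a non-NA value for a genotype
expressed <- !is.na(data[-1])

# Ignore genes that are NA everywhere, and repeated gene rows
keep <- rowSums(expressed) > 0 & !duplicated(data$Gene)

# Label each kept gene with the genotypes it is expressed in
combo <- apply(expressed[keep, , drop = FALSE], 1,
               function(r) paste(colnames(expressed)[r], collapse = "&"))
combo <- unname(combo)

# List of gene vectors, one per combination
by_combo <- split(data$Gene[keep], combo)
by_combo

# Export a two-column table: gene and the combination it falls in
out <- data.frame(Gene = data$Gene[keep], Combination = combo,
                  stringsAsFactors = FALSE)
out_file <- file.path(tempdir(), "venn_gene_lists.csv")
write.csv(out, out_file, row.names = FALSE)
```

Writing to tempdir() just keeps the sketch from cluttering the working directory; any path can be used instead.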

written 15 months ago by dsull1.8k

Thanks a heap, dsull :) Appreciate.

written 15 months ago by Wox440