Question: How to generate a Venn diagram
0
minni923440 wrote:

I'm new to R. I have two gene lists (one with unregulated genes, the other with processes, GO IDs and genes), and I'm wondering how to generate a Venn diagram and see which genes overlap.

venn R • 8.7k views
modified 5.1 years ago by Prakki Rama2.3k • written 5.1 years ago by minni923440
4
Alex Reynolds28k wrote:

You don't need R, just two lists of genes, one list for unregulated and one list for your other category. You can paste each set into BioVenn to make a roughly proportional circular Venn diagram. If you want a list of overlaps, just use `grep` between the two lists:

`$ grep -Fxf listA listB > answer`

And BioVenn will also give a list of the overlap. Just click on one of the links below the diagram (e.g. x-y total overlap).

1
Zhilong Jia1.4k wrote:

R package: VennDiagram
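As a rough sketch of how that package could be used here: `draw.pairwise.venn()` from VennDiagram draws a two-set diagram from region counts alone, so it does not need the full lists. The gene names and the variables `listA`, `listB`, `n_a`, `n_b` and `n_common` below are made up for illustration, and the drawing step only runs if the package is installed.

```r
# Two example gene lists (made-up identifiers, for illustration only)
listA <- c("geneA", "geneB", "geneC", "geneD")
listB <- c("geneC", "geneD", "geneE")

# Sizes that draw.pairwise.venn() expects: each full set and their overlap
n_a      <- length(unique(listA))
n_b      <- length(unique(listB))
n_common <- length(intersect(listA, listB))

# Draw only if the VennDiagram package is available
if (requireNamespace("VennDiagram", quietly = TRUE)) {
  grid::grid.newpage()
  VennDiagram::draw.pairwise.venn(
    area1      = n_a,
    area2      = n_b,
    cross.area = n_common,
    category   = c("List A", "List B")
  )
}
```

Note that `area1` and `area2` are the total set sizes, including the shared genes, not the A-only and B-only counts.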

1
seidel6.8k wrote:

Here are three simple methods using R. Thomas Girke has a lot of useful R stuff. Find it and use it.

`# generate two random lists`

`a <- sample(LETTERS,20)`

`b <- sample(LETTERS,20)`

`# grab Thomas Girke's Venn Diagram functions`

`source("http://faculty.ucr.edu/~tgirke/Documents/R_BioCond/My_R_Scripts/vennDia.R")`

`# draw a diagram`

`qlist <- venndiagram(x=a, y=b, unique=TRUE, title="2-Way Venn Diagram", labels=c("Sample1", "Sample2"),`

`                     plot=TRUE, lines=c(2,3), lcol=c(2,3), lwd=3, cex=1.3, type="2")`

You'll find that qlist is a list containing the elements of each Venn diagram region. e.g.

```
> qlist$q1
"M" "J" "F" "U" "O" "K" "S" "T" "G" "Q" "Y" "Z" "L" "W" "A"
```

However, if you just need simple counts, you can use the %in% operator:

`# count the common elements of a and b`

`sum(a %in% b)`

`# get the elements themselves`

`a[a %in% b]`

`# or use other functions for sets`

`intersect(a,b)`

`length(intersect(a,b))`
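Building on the set functions above, here is a minimal sketch of how all three regions of a two-set Venn diagram could be pulled out with base R's `setdiff()` and `intersect()`. The names `a_only`, `b_only` and `both` are made up for this example, and `set.seed()` is only there to make the random lists repeatable.

```r
# Repeatable random lists, in the same style as the example above
set.seed(1)
a <- sample(LETTERS, 20)
b <- sample(LETTERS, 20)

# The three regions of a 2-way Venn diagram (illustrative names)
a_only <- setdiff(a, b)    # in a but not in b
b_only <- setdiff(b, a)    # in b but not in a
both   <- intersect(a, b)  # in both lists

# Size of each region
counts <- c(a_only = length(a_only), b_only = length(b_only), both = length(both))
print(counts)

# Sanity check: the three regions together cover every distinct element
stopifnot(sum(counts) == length(union(a, b)))
```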

Note: if you examine the code at the URL above (just load it in your browser), the author says he has replaced that Venn diagram function with something better. It might be worth examining.

1

Note that this does not draw a proportional diagram.

1
Prakki Rama2.3k wrote:

Check the Biostar post "Tool To Generate Proportional Venn Diagrams?" for more info.

0
minni923440 wrote:

I have more than 1,000,000 genes in one gene list, and the tools can't handle that many. What should I do?

Count the elements in each set category:

1. A-only
2. B-only
3. A ∩ B

Scale these counts down to smaller values and make dummy lists from them.

For instance, say you have 1M elements unique to set A, 500k elements unique to set B, and 300k elements in A ∩ B.

Divide all the element counts by 10k. Then you have 100 new "elements" unique to set A, 50 new "elements" unique to set B, and 30 "elements" in A ∩ B.

Now make three new lists with these element counts. Because every count was divided by the same factor, a diagram drawn from the dummy lists keeps the proportions of the original sets.
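In R, the scaling step could be sketched roughly as follows. The counts are the ones from the example above, not real data, and the placeholder element names, the `dummy_a`/`dummy_b` variables and the output file names are all made up. The sketch also assumes the tool wants two input lists, so the shared dummy elements are appended to both.

```r
# Region counts from the worked example (not real data)
counts <- c(a_only = 1e6, b_only = 5e5, both = 3e5)

# Shrink every count by the same factor so the proportions are preserved
scale_factor <- 1e4
scaled <- round(counts / scale_factor)

# Build placeholder elements for each region (names are arbitrary)
a_only <- paste0("a", seq_len(scaled[["a_only"]]))
b_only <- paste0("b", seq_len(scaled[["b_only"]]))
both   <- paste0("ab", seq_len(scaled[["both"]]))

# Tools that take two lists need the shared elements in both of them
dummy_a <- c(a_only, both)
dummy_b <- c(b_only, both)

# Write one element per line, ready to paste into a Venn diagram tool
writeLines(dummy_a, file.path(tempdir(), "dummyA.txt"))
writeLines(dummy_b, file.path(tempdir(), "dummyB.txt"))
```

Rounding means very small regions can collapse to zero, so pick a scale factor that keeps the smallest count visible.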

If you want to do element counts, the simplest way is to use the command-line tool `wc` to do line counts, as in:

`$ echo "foo has $(wc -l < foo) lines in it"`
0
minni923440 wrote:

Can you clarify further? How do I do this in R?
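One possible way to get the counts in R, as a minimal sketch: read each list with one gene per line, take the overlap, and derive the region sizes from it. The file names and gene IDs below are stand-ins written to a temporary directory so the example runs on its own; real input paths would replace them.

```r
# Stand-in input files so the sketch is self-contained (hypothetical names)
file_a <- file.path(tempdir(), "listA.txt")
file_b <- file.path(tempdir(), "listB.txt")
writeLines(c("gene1", "gene2", "gene3", "gene4"), file_a)
writeLines(c("gene3", "gene4", "gene5"), file_b)

# Read one gene per line and drop duplicates
genes_a <- unique(readLines(file_a))
genes_b <- unique(readLines(file_b))

# Genes present in both lists
overlap <- intersect(genes_a, genes_b)

# Region sizes: each set's total minus the shared part, plus the overlap
counts <- c(
  a_only = length(genes_a) - length(overlap),
  b_only = length(genes_b) - length(overlap),
  both   = length(overlap)
)
print(counts)

# Save the overlapping genes, one per line
writeLines(overlap, file.path(tempdir(), "overlap.txt"))
```

These base R set operations work on character vectors of any length, so a list with over a million genes should not need a web tool just to find the overlap.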