Count, LT, PH, PT Column in Functional Annotation Chart of DAVID
3.3 years ago
nhaus ▴ 300

Hello everybody,

I am new to GSEA analysis and I use DAVID for the analysis, so please excuse me if these are basic questions. I am currently analyzing WES data, and for that I compare my gene lists to different custom backgrounds. When I looked at the results of the Functional Annotation Chart, I noticed something that I cannot explain in the Count, LT, PH, and PT columns.

It seems that a gene in my gene list is only counted in the "Count" column if it also appears in the background. Thus, the more background genes I include, the bigger the "Count" gets and the more significant the enrichment is. Is this intentional? And if so, why? I thought that the background shouldn't play a role in determining which genes of my gene list are in a particular pathway or GO term.

Furthermore, I noticed that the PT (population total) and LT (List total) are different for each Term. I also have no idea why this is the case, since the number of genes in my gene list and my background shouldn't change.

If somebody could help me understand these observations, I'd be very grateful!

Cheers!

DAVID GSEA • 975 views
3.3 years ago
Asaf 10k
  1. The background is the set of all the genes in the genome. If you entered genes that are not in the genome, DAVID will ignore them and drop them from the list.
  2. The background can differ between analyses. If the gene <-> GO term mapping was built on a set of genes different from the background, the effective background is the intersection of the "background" list and the genes relevant to that analysis. This is why PT can change, and LT too if not all the genes in the input list are mapped.
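
To make the two points above concrete, here is a minimal sketch of how the four columns could come out of plain set operations. It is not DAVID's actual implementation; the function name `chart_columns` and the toy gene sets are hypothetical, and the only assumption is the rule described above: genes that are not in the background, or not annotated in the category being tested, are dropped before counting.

```python
def chart_columns(gene_list, background, category):
    """Compute Count, LT, PH and PT for every term of one annotation category.

    category maps a term name to the set of genes annotated to it.
    Illustrative only: this mirrors the explanation above, not DAVID's code.
    """
    # Genes that carry at least one annotation in this category.
    annotated = set().union(*category.values())
    # Effective background: background genes that are annotated here.
    population = set(background) & annotated
    # Effective list: input genes that survive the same filter.
    effective_list = set(gene_list) & population
    rows = {}
    for term, genes in category.items():
        rows[term] = {
            "Count": len(effective_list & genes),  # list genes in the term
            "LT": len(effective_list),             # list total
            "PH": len(population & genes),         # population hits
            "PT": len(population),                 # population total
        }
    return rows


# Toy data: gene X is in the input list but not in the background.
background = {"A", "B", "C", "D", "E", "F"}
gene_list = {"A", "B", "X"}
go_bp = {"term1": {"A", "C", "X"}, "term2": {"B", "D", "E"}}
kegg = {"path1": {"A", "B", "C"}}

go_rows = chart_columns(gene_list, background, go_bp)
kegg_rows = chart_columns(gene_list, background, kegg)
```

In this toy case PT is 5 for the first category and 3 for the second, even though the background list itself never changed, and gene X never contributes to Count because it is missing from the background.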

Thank you for your answer!

The background is the group of all the genes in the genome. If you entered genes which are not in the genome DAVID will ignore them and they will be dropped from the list.

This makes sense. But in this case I have a follow-up question: What would be the best way to compare two gene lists? E.g. gene list 1 is treated, gene list 2 is untreated, and I want to see if there are different enrichments (both gene lists are unranked). Right now, I am using gene list 1 as the input and gene list 2 as the background, but based on your answer this would be the wrong approach, if I am not mistaken.

For each analysis the background might be different. If the mapping of gene <-> GO term was done on a set of genes different than the background then the background collection of genes will be the intersect of the "background" list and the list of genes relevant for the analysis , this is why the PT can change and also LT, if not all the genes in the input list are mapped.

I'm afraid I can't follow you here. Why would the background be different for the same analysis? The only difference is the GO category.


The easy way to compare two gene lists is to test list1 against union(list1, list2), i.e. the background should be both lists together. With all = union(list1, list2), the 2x2 table for Fisher's exact test is ((list1 & GO, list1 - GO), ((all - list1) & GO, (all - list1) - GO)). Just watch out for overlap between the lists and think about how to interpret the results in that case.
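
As a rough illustration of that table, the sketch below builds the four cells from Python sets and computes a one-sided (enrichment) Fisher's exact p-value with the standard library only. The function names and the toy gene sets are made up for the example, and this is plain Fisher's exact test rather than whatever variant DAVID reports. Genes present in both lists are counted only in the list1 row here, which is one possible way to handle the overlap.

```python
from math import comb


def fisher_table(list1, list2, go_genes):
    """Return (a, b, c, d) for list1 vs. the rest of union(list1, list2)."""
    l1 = set(list1)
    universe = l1 | set(list2)          # "all" in the formula above
    rest = universe - l1                # all - list1
    go = set(go_genes) & universe       # ignore GO genes outside the universe
    return len(l1 & go), len(l1 - go), len(rest & go), len(rest - go)


def fisher_right_tail(a, b, c, d):
    """One-sided p-value: P(at least a GO genes in list1) under the
    hypergeometric null with all margins of the 2x2 table fixed."""
    n = a + b + c + d
    row1 = a + b                        # size of list1
    col1 = a + c                        # GO genes in the universe
    total = comb(n, row1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / total


# Toy example with hypothetical gene names.
list1 = {"A", "B", "C", "D"}
list2 = {"E", "F", "G", "H"}
go_term = {"A", "B", "C", "E"}

table = fisher_table(list1, list2, go_term)
p_value = fisher_right_tail(*table)
```

With these toy sets the table is (3, 1, 1, 3) and the p-value is 17/70, about 0.24.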

As for the second point, the GO mapping was done on all the genes in the genome, but another mapping may have been done for only a subset of the genes. In that case, the background genes that weren't mapped are removed from the background for that specific test.
