Question: Choosing background list in protein set overlap analysis
4.7 years ago by
United States
nand0 wrote:

I am new to proteomics and have a question about testing for overlap between protein lists. I have two protein lists I would like to compare: a subset (n) of the proteins identified in my experiment (N), and a list of proteins from the literature that belong to a specific category (R).

I would like to know whether my subset (n) is enriched for proteins from the literature list (R) relative to the other subsets in my experiment. I plan to use the hypergeometric test to assess the significance of the overlap between the two sets (n and R).

I am not sure what to use as the background list. From my reading, I planned to use all proteins identified in the experiment (N) as the background. However, I realized that about 50% of the proteins in the literature list (R) were not identified in my experiment, so they cannot appear in my subset (n), which is the list I want to test for overlap with R.

Under these conditions, would it be acceptable to filter the literature list (R) down to only those proteins identified in my experiment (N), compare my subset (n) against this filtered literature list (r), and use all identified proteins (N) as the background?

If not, what should my background list be?
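
For concreteness, here is a minimal sketch of the filtered setup I am asking about, not a claim that it is the right design. The function name `overlap_pvalue` and the toy protein IDs are made up for illustration. It assumes a one-sided test (probability of seeing at least the observed overlap) with the identified proteins (N) as the background.

```python
from math import comb

def overlap_pvalue(background, subset, literature):
    """One-sided hypergeometric test: P(overlap >= observed).

    background : all proteins identified in the experiment (N)
    subset     : proteins in the subset of interest (n)
    literature : category proteins from the literature (R); entries that
                 were never identified are dropped, giving the filtered list (r)
    """
    background = set(background)
    subset = set(subset) & background
    filtered = set(literature) & background  # r: literature proteins that were detected

    N, n, r = len(background), len(subset), len(filtered)
    k = len(subset & filtered)  # observed overlap

    # Sum the hypergeometric probabilities for overlaps of size k..min(n, r)
    tail = sum(comb(r, i) * comb(N - r, n - i) for i in range(k, min(n, r) + 1))
    return k, tail / comb(N, n)

# Toy data (hypothetical IDs): 20 identified proteins, a subset of 5,
# and a literature list where half the entries were never identified.
background = [f"P{i}" for i in range(1, 21)]
subset = ["P1", "P2", "P3", "P4", "P5"]
literature = ["P1", "P2", "P3", "P10", "X1", "X2", "X3", "X4"]

k, p = overlap_pvalue(background, subset, literature)
print(k, p)
```

If SciPy is available, I believe the same tail probability is given by `scipy.stats.hypergeom.sf(k - 1, N, r, n)`.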
