Question: Differentially expressed genes
4.9 years ago by
bioinfo60 wrote:


While analyzing GEO datasets for differentially expressed genes with the limma package, I am getting duplicate genes with different probe IDs and p-values. Which one should I consider for further analysis?

Thanks in advance.

4.9 years ago by
United States
seidel7.1k wrote:

> Which one to consider to go for further analysis?

Answer: the important ones. If you have different probe IDs for the same gene, then you have different probe sequences, and the only way to figure out what's happening is to look at the data more carefully. You'll have to figure out why you get different signals at each probe, and the best way to do that is to understand how well each probe represents the gene.

If you are using Affymetrix array data, you can examine the probe set names (the Affy IDs) to see what kinds of probes they are. There is a hierarchy of things to pay attention to, which may provide an easy filtering step. The probe set suffix tells you what class of probe set it is. Most probe set IDs end in plain _at (e.g. 1769336_at), which means they detect the antisense sequence of the transcript. These probe sets detect their design sequence (aka the exemplar sequence) uniquely and do not cross-hybridize to other design sequences.

However, there are also probe sets with a letter code before the final _at: _a_at, _s_at, _x_at (e.g. 1769349_s_at). The letter denotes the kind of cross-hybridization the probe set exhibits. The "a" designation indicates that the probe set detects a gene family, the "s" designation denotes cross-hybridization to different design sequences that are not part of the same gene family, and the "x" designation is the messiest, with various probe sequences cross-hybridizing to many different design sequences. (Remember that an Affy probe set is a "set" of at least 11 small probes, and this "set" has a single ID that appears in your results.) Here's a diagram from Affymetrix that might help you.

So if you want to know why different probe IDs give different p-values, you need to understand how well the underlying sequence represents the gene you think it does. The unique probe sets are the easiest to work with.
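The suffix hierarchy described above could be turned into a simple filtering step. Here is a minimal Python sketch, assuming your results have been exported as (probe ID, gene, p-value) rows. The function names, the class labels, the numeric ranks, and the sample gene name and p-values are all made up for illustration; they are not part of limma or of any Affymetrix tool.

```python
# Suffix classes as described in this answer. The numeric ranks are an
# illustrative assumption: lower means a more specific probe set.
# Longer suffixes are listed first because every ID also ends in "_at".
SUFFIX_CLASSES = [
    ("_a_at", "gene_family", 1),
    ("_s_at", "cross_hyb_other_sequences", 2),
    ("_x_at", "mixed_cross_hyb", 3),
    ("_at", "unique", 0),
]


def classify_probe_set(probe_id):
    """Return (label, rank) for an Affymetrix-style probe set ID."""
    for suffix, label, rank in SUFFIX_CLASSES:
        if probe_id.endswith(suffix):
            return label, rank
    return "unknown", 4


def group_by_gene(rows):
    """Group (probe_id, gene, p_value) rows by gene, most specific class first."""
    by_gene = {}
    for probe_id, gene, p_value in rows:
        label, rank = classify_probe_set(probe_id)
        by_gene.setdefault(gene, []).append((rank, probe_id, label, p_value))
    return {gene: sorted(entries) for gene, entries in by_gene.items()}


# Hypothetical rows: the gene name and p-values are invented for the example.
rows = [
    ("1769349_s_at", "geneA", 0.001),
    ("1769336_at", "geneA", 0.04),
]
grouped = group_by_gene(rows)
for rank, probe_id, label, p_value in grouped["geneA"]:
    print(probe_id, label, p_value)
```

The sort only orders the probe sets for inspection. It does not decide which one is right; you would still look at how each probe set maps to the gene before discarding anything.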


4.9 years ago by
EVR570 wrote:


Use the p-values as the standard criterion for finding the significant genes. In the case of duplicate genes, you can decide between the probes based on their p-values.
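One possible reading of this suggestion is to keep, for each gene, the probe with the smallest p-value. A minimal Python sketch under that assumption (the function name and the sample rows are hypothetical, not limma output):

```python
def best_probe_per_gene(rows):
    """For each gene, keep the (probe_id, p_value) pair with the smallest p-value.

    rows: iterable of (probe_id, gene, p_value) tuples.
    """
    best = {}
    for probe_id, gene, p_value in rows:
        if gene not in best or p_value < best[gene][1]:
            best[gene] = (probe_id, p_value)
    return best


# Hypothetical rows: gene names and p-values are invented for the example.
rows = [
    ("1769336_at", "geneA", 0.04),
    ("1769349_s_at", "geneA", 0.001),
    ("1769350_at", "geneB", 0.2),
]
selected = best_probe_per_gene(rows)
print(selected)
```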
