Cuffdiff - Transcripts Have Expression Values Of Zero, "Nan" Or "Inf"
10.6 years ago
lhusselmann ▴ 20

I'm using the pre-compiled binary packages of Tophat-2.0.9 and Cufflinks-2.1.1, and Cuffdiff is reporting many genes and transcripts with expression values of zero, or with "nan" or "inf" in the fold-change and test-statistic columns (see the gene_exp.diff excerpt below). Any suggestions as to what I should do?

test_id    gene_id    gene    locus    sample_1    sample_2    status    value_1    value_2    log2(fold_change)    test_stat    p_value    q_value    significant
XLOC_008530    XLOC_008530    -    MDC006635.211:38544-38620    C0    E0    OK    30221.4    1.11591e+07    8.52844    903.902    5e-05    0.0176643    yes
XLOC_011890    XLOC_011890    -    MDC009345.248:7633-8811    C0    E0    OK    0    273.11    inf    nan    5e-05    0.0176643    yes
XLOC_014073    XLOC_014073    -    MDC010817.271:25183-25478    C0    E0    OK    0    140.21    inf    nan    0.00015    0.0417085    yes
XLOC_019038    XLOC_019038    -    MDC015012.71:1213-1481    C0    E0    OK    0    178.129    inf    nan    0.00015    0.0417085    yes
XLOC_020039    XLOC_020039    -    MDC015910.528:10447-10548    C0    E0    OK    12325.1    2.3107e+06    7.55059    378.977    5e-05    0.0176643    yes
XLOC_023582    XLOC_023582    -    MDC019007.400:2447-2753    C0    E0    OK    0    644.473    inf    nan    5e-05    0.0176643    yes
XLOC_024891    XLOC_024891    -    MDC020182.198:2666-2981    C0    E0    OK    0    171.57    inf    nan    5e-05    0.0176643    yes
XLOC_025035    XLOC_025035    -    MDC020310.146:5206-5628    C0    E0    OK    0    53.5601    inf    nan    0.0001    0.0322248    yes
XLOC_025460    XLOC_025460    -    MDC020722.137:4959-5145    C0    E0    OK    0    706.613    inf    nan    0.00015    0.0417085    yes
XLOC_026573    XLOC_026573    -    MDC021889.237:5595-5690    C0    E0    OK    9753.29    3.7682e+06    8.59377    482.263    5e-05    0.0176643    yes
XLOC_027081    XLOC_027081    -    MDC022358.373:27042-27378    C0    E0    OK    0    210.996    inf    nan    5e-05    0.0176643    yes
XLOC_000015    XLOC_000015    -    MDC000017.398:2019-2460    C0    C2    OK    0    77.279    inf    nan    5e-05    0.0176643    yes
XLOC_000080    XLOC_000080    -    MDC000071.210:1371-1990    C0    C2    OK    0    782.288    inf    nan    5e-05    0.0176643    yes
XLOC_000924    XLOC_000924    -    MDC000691.163:12249-13292    C0    C2    OK    0    26.0045    inf    nan    5e-05    0.0176643    yes
XLOC_001030    XLOC_001030    -    MDC000760.256:22135-23927    C0    C2    OK    0    47.105    inf    nan    5e-05    0.0176643    yes
XLOC_001274    XLOC_001274    -    MDC000953.460:2713-3264    C0    C2    OK    0    422.783    inf    nan    5e-05    0.0176643    yes
XLOC_001440    XLOC_001440    -    MDC001075.160:2046-6158    C0    C2    OK    0    21.3232    inf    nan    5e-05    0.0176643    yes
XLOC_001506    XLOC_001506    -    MDC001128.127:17389-19697    C0    C2    OK    0    29.9881    inf    nan    5e-05    0.0176643    yes
XLOC_001700    XLOC_001700    -    MDC001307.434:8029-10455    C0    C2    OK    0    29.6674    inf    nan    5e-05    0.0176643    yes
XLOC_002080    XLOC_002080    -    MDC001577.2963:9743-10308    C0    C2    OK    0    55.0199    inf    nan    0.0002    0.0495222    yes
XLOC_002159    XLOC_002159    -    MDC001635.618:38277-40655    C0    C2    OK    0    548.554    inf    nan    5e-05    0.0176643    yes
XLOC_002576    XLOC_002576    -    MDC001927.172:1937-4253    C0    C2    OK    0    9.30383    inf    nan    0.0001    0.0322248    yes
XLOC_002715    XLOC_002715    -    MDC002007.156:3230-4772    C0    C2    OK    0    20.7091    inf    nan    0.0001    0.0322248    yes
XLOC_002856    XLOC_002856    -    MDC002113.246:200-1028    C0    C2    OK    0    39.9545    inf    nan    5e-05    0.0176643    yes
XLOC_002872    XLOC_002872    -    MDC002121.323:4512-5988    C0    C2    OK    0    22.4906    inf    nan    0.0001    0.0322248    yes
XLOC_003047    XLOC_003047    -    MDC002235.543:23412-24616    C0    C2    OK    0    23.2862    inf    nan    5e-05    0.0176643    yes
XLOC_003153    XLOC_003153    -    MDC002325.383:3175-4516    C0    C2    OK    0    22.8465    inf    nan    5e-05    0.0176643    yes
XLOC_003453    XLOC_003453    -    MDC002536.231:5046-9852    C0    C2    OK    0    195.692    inf    nan    5e-05    0.0176643    yes
XLOC_004230    XLOC_004230    -    MDC003196.304:8022-10369    C0    C2    OK    0    41.8157    inf    nan    5e-05    0.0176643    yes
cufflinks cuffdiff • 4.7k views

Could you (or one of the editors) fix this so it has the right format?

BTW, there will be plenty of genes and transcripts with no expression. Do you expect it to be otherwise? In general, if you have 0 expression in one group of samples and any expression at all in the other, the fold change will be infinite and you'll get results like these.
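To make the point about infinite fold changes concrete, here is a minimal Python sketch. The function name `log2_fold_change` is hypothetical, and this is only the plain arithmetic of log2(value_2 / value_1) with zeros handled explicitly; it is not Cuffdiff's actual code. The input values are taken from the table in the question.

```python
import math

def log2_fold_change(value_1, value_2):
    """Return log2(value_2 / value_1), handling zero expression explicitly.

    Hypothetical helper for illustration; not Cuffdiff's implementation.
    """
    if value_1 == 0 and value_2 == 0:
        return math.nan       # 0/0 is undefined
    if value_1 == 0:
        return math.inf       # expressed only in sample 2
    if value_2 == 0:
        return -math.inf      # expressed only in sample 1
    return math.log2(value_2 / value_1)

# XLOC_008530: expressed in both samples, so the fold change is finite.
print(log2_fold_change(30221.4, 1.11591e+07))

# XLOC_011890: zero in C0 and 273.11 in E0, so the fold change is infinite.
print(log2_fold_change(0, 273.11))
```

The first call reproduces, to rounding, the 8.52844 reported for XLOC_008530; the second shows why every row with value_1 equal to 0 has "inf" in the log2(fold_change) column.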


Hi dpryan, thanks for the reply. The output is from cuffdiff, the gene_exp.diff file. I posted the same question on seqanswers: http://seqanswers.com/forums/showthread.php?t=33733&highlight=Lizex . The Cufflinks editors did not respond at all to the question I sent them. From the entire experiment, i.e. five time points (control and treatment), I got 3645 significant transcripts ("yes" in the gene_exp.diff file). Of these, only 34, i.e. 0.9%, have an expression value. I agree with you that there will be genes and transcripts with no expression, but is it normal to see only 0.9% of transcripts with an expression value across an entire experiment? How do I address this situation where I have 0 expression in one group and expression in the other?
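A tally like the one described here (significant rows versus significant rows with expression in both samples) could be reproduced with a short Python sketch. It assumes gene_exp.diff is tab-delimited with the header row shown in the question; the function name `summarize` is hypothetical, and the inline sample consists of three rows copied from the question.

```python
import csv
import io

def summarize(handle):
    """Count significant rows, and those with nonzero expression in both samples.

    Assumes a tab-delimited file whose header includes the columns
    value_1, value_2 and significant.
    """
    significant = 0
    both_expressed = 0
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["significant"] != "yes":
            continue
        significant += 1
        if float(row["value_1"]) > 0 and float(row["value_2"]) > 0:
            both_expressed += 1
    return significant, both_expressed

HEADER = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2", "status",
          "value_1", "value_2", "log2(fold_change)", "test_stat", "p_value",
          "q_value", "significant"]
ROWS = [
    ["XLOC_008530", "XLOC_008530", "-", "MDC006635.211:38544-38620", "C0", "E0",
     "OK", "30221.4", "1.11591e+07", "8.52844", "903.902", "5e-05", "0.0176643", "yes"],
    ["XLOC_011890", "XLOC_011890", "-", "MDC009345.248:7633-8811", "C0", "E0",
     "OK", "0", "273.11", "inf", "nan", "5e-05", "0.0176643", "yes"],
    ["XLOC_014073", "XLOC_014073", "-", "MDC010817.271:25183-25478", "C0", "E0",
     "OK", "0", "140.21", "inf", "nan", "0.00015", "0.0417085", "yes"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"

n_significant, n_both = summarize(io.StringIO(SAMPLE))
print(n_significant, n_both)
```

For a real run, the `io.StringIO(SAMPLE)` handle would be replaced by `open("gene_exp.diff")`, with the path adjusted to wherever the Cuffdiff output directory is.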


That does sound a bit off. I actually wonder if some of your libraries are just crap (it happens). It'd be helpful to know the actual nature of the experiment, as 3645 DE genes could either be a reasonable number or way too many. Also, you might open your BAM files in IGV or another browser and just see if these calls seem reasonable given the data you have.


Thanks I'll look into these.


I can't help with why you are getting those values, but I can explain what "nan" and "inf" mean. "inf" means infinite, and you will only see it in the fold-change column: if the denominator sample has zero expression for the gene, the fold change will be inf. "nan" means "not a number"; it appears in the test-statistic column because the test statistic was either infinity, -infinity, or something else that was not a number. I don't know how they calculate the test statistic or what its possible values are.

EDITED - to make it clearer
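As a general illustration of where "nan" comes from, the Python sketch below shows standard IEEE 754 floating-point behaviour: some operations on infinities have no defined result and yield nan. How Cuffdiff actually computes its test statistic is not covered in this thread, so this is not a description of its internals.

```python
import math

inf = math.inf

# Operations on infinities with no defined result give nan.
difference = inf - inf
ratio = inf / inf
product = inf * 0.0

print(difference, ratio, product)

# nan never compares equal to anything, including itself,
# so test for it with math.isnan rather than ==.
print(difference == difference)
print(math.isnan(difference))
```

When filtering a results table, this means rows with nan in test_stat have to be detected with an explicit nan check, not with an equality comparison.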


Thanks for the reply.


There were a few typos, so I edited it. Nothing substantial.
