I am relatively new to RNA-seq and hope to get some help from more experienced people here.
I used cuffdiff to look at differential gene expression. I have 2 conditions and 3 replicates for each condition. When I looked at the genes.read_group_tracking file, I found that for some genes, one of the replicates has a very large FPKM value that doesn't match its raw fragment count.
For example, this is what I see:
tracking_id  condition  replicate  raw_frags  internal_scaled_frags  external_scaled_frags  FPKM      effective_length  status
XLOC_009487  WT         0          4          4.21548                4.21548                1150.43   -                 OK
XLOC_009487  WT         1          5          5.39083                5.39083                1.14629   -                 OK
XLOC_009487  WT         2          8          7.56804                7.56804                1.60234   -                 OK
XLOC_009487  OE         1          2          2.29124                2.29124                0.476229  -                 OK
XLOC_009487  OE         0          5          4.84785                4.84785                0.995213  -                 OK
XLOC_009487  OE         2          5          4.07315                4.07315                0.77989   -                 OK
You can see that the FPKM for WT replicate 0 is about 1150, whereas its raw_frags count is only 4. The other samples look fine. I observed this in multiple genes, and it doesn't always happen in the same sample. I also checked the FPKMs from cufflinks, and they look normal, so it seems to be a cuffdiff problem.
Does anyone know why this happens and how to solve it? I appreciate the help!
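To list every affected gene and replicate, something like the rough Python sketch below might help. It is not part of cuffdiff: the function name `flag_outlier_fpkm` and the 10-fold cutoff are my own assumptions. For each gene it compares FPKM per external_scaled_frags across replicates and flags any row whose ratio is far above the gene's median. That ratio also depends on library size and effective length, so it should only be roughly constant within a gene, not identical.

```python
from collections import defaultdict
from statistics import median

# The rows shown above, used here as sample input.
EXAMPLE = """\
tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
XLOC_009487 WT 0 4 4.21548 4.21548 1150.43 - OK
XLOC_009487 WT 1 5 5.39083 5.39083 1.14629 - OK
XLOC_009487 WT 2 8 7.56804 7.56804 1.60234 - OK
XLOC_009487 OE 1 2 2.29124 2.29124 0.476229 - OK
XLOC_009487 OE 0 5 4.84785 4.84785 0.995213 - OK
XLOC_009487 OE 2 5 4.07315 4.07315 0.77989 - OK
"""


def flag_outlier_fpkm(lines, fold=10.0):
    """Flag rows whose FPKM-per-fragment ratio exceeds `fold` times the gene median.

    `lines` is an iterable of whitespace- or tab-delimited text lines, header first.
    `fold` is an arbitrary cutoff (an assumption, tune as needed).
    Returns (tracking_id, condition, replicate, raw_frags, FPKM) tuples.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, records = rows[0], rows[1:]

    # Group (record, FPKM / scaled fragments) pairs by gene.
    by_gene = defaultdict(list)
    for fields in records:
        rec = dict(zip(header, fields))
        frags = float(rec["external_scaled_frags"])
        if frags > 0:  # skip rows with no fragments to avoid dividing by zero
            by_gene[rec["tracking_id"]].append((rec, float(rec["FPKM"]) / frags))

    flagged = []
    for gene, entries in by_gene.items():
        typical = median(ratio for _, ratio in entries)
        for rec, ratio in entries:
            if typical > 0 and ratio > fold * typical:
                flagged.append((gene, rec["condition"], int(rec["replicate"]),
                                float(rec["raw_frags"]), float(rec["FPKM"])))
    return flagged


print(flag_outlier_fpkm(EXAMPLE.splitlines()))
# prints [('XLOC_009487', 'WT', 0, 4.0, 1150.43)]
```

On the real output, the open file handle for genes.read_group_tracking could be passed in place of `EXAMPLE.splitlines()`.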