Question

How can I edit the output from Cufflinks to do my own normalization?

8.7 years ago

rodolpho.gheleri ▴ 60

Hey everyone,

I am running an experiment with 4 paired-end samples across 2 conditions (Control vs Mutation), with 2 replicates each (C1, C2, MUT1, MUT2).

After mapping with segemehl, I built the transcripts with Cufflinks, so I now have transcripts.gtf, genes.fpkm_tracking and isoforms.fpkm_tracking. Now I have to take the count (FPKM) of each gene, divide it by a value corresponding to the count of plasmid that was inserted in each sample, and then proceed with the pipeline (cuffmerge and cuffdiff).

These values can be found in the table below.

Sample   Value
C1       445.188/0.296
C2       137.217/0.196
MUT1     340.072/0.143
MUT2     643.493/0.271

But how can I do that? I already tried editing the Cufflinks output and dividing the counts in the 3 files, but when I merge the transcripts, all the values disappear. I can't do it after running cuffmerge, because by then the samples are merged and I can no longer tell them apart.

Is there a way to do it?

Cufflinks RNA-Seq Normalization • 2.6k views

ADD COMMENT • link updated 18 months ago by Ram 43k • written 8.7 years ago by rodolpho.gheleri ▴ 60

The Cufflinks package also outputs estimated raw counts. You could use them to normalise again.
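
As a rough illustration of what "normalise again" could mean arithmetically, the sketch below divides each sample's value by a per-sample factor and keeps the result in a separate table instead of editing the Cufflinks files. The table layout, the gene row and the factors are hypothetical placeholders, not values from this thread, and `scale_counts` is a made-up helper name.

```python
import csv
import io


def scale_counts(rows, factors):
    """Divide each sample's value by that sample's factor.

    rows: iterable of dicts with a 'gene' key plus one key per sample.
    factors: dict mapping sample name -> positive scaling factor.
    """
    scaled = []
    for row in rows:
        out = {"gene": row["gene"]}
        for sample, factor in factors.items():
            out[sample] = float(row[sample]) / factor
        scaled.append(out)
    return scaled


# Hypothetical input: one gene, four samples (tab-delimited).
table = "gene\tC1\tC2\tMUT1\tMUT2\nHOXA1\t100\t50\t30\t80\n"
# Hypothetical per-sample factors; real ones would come from the plasmid measurements.
factors = {"C1": 2.0, "C2": 1.0, "MUT1": 0.5, "MUT2": 4.0}

rows = list(csv.DictReader(io.StringIO(table), delimiter="\t"))
scaled = scale_counts(rows, factors)
```

This only shows the division itself. Whether such a rescaling is statistically sound for a given downstream tool is a separate question.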

ADD REPLY • link 8.7 years ago by GouthamAtla 12k

Yes, but how can I feed this information back into cufflinks/cuffdiff? My goal is to find differentially expressed HOX genes.

ADD REPLY • link 8.7 years ago by rodolpho.gheleri ▴ 60

You don't. Cuffdiff is only designed to be used in a few predefined ways, and what you're trying to do isn't one of them.

ADD REPLY • link 8.7 years ago by Devon Ryan 104k

OK, I will try to use DESeq2 with the raw counts from cuffdiff, but the values are not integers. Can DESeq2 accept this kind of value?

ADD REPLY • link 8.7 years ago by rodolpho.gheleri ▴ 60

No, you'll either need to round them (not ideal) or instead use either edgeR or limma/voom.
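
A minimal sketch of getting the estimated counts into a per-sample table that a count-based tool could read, with optional rounding. It assumes a tab-delimited file with `tracking_id`, `condition`, `replicate` and `raw_frags` columns, which is how I understand cuffdiff's `genes.read_group_tracking` to be laid out; check the header of your own file. The function name `count_matrix` and the example rows are hypothetical.

```python
import csv
import io
import math
from collections import defaultdict


def count_matrix(handle, value_column="raw_frags", as_integers=False):
    """Pivot long-format per-replicate rows into {gene: {sample: value}}.

    Assumed columns: tracking_id, condition, replicate, and value_column.
    """
    matrix = defaultdict(dict)
    for row in csv.DictReader(handle, delimiter="\t"):
        sample = f"{row['condition']}_{row['replicate']}"
        value = float(row[value_column])
        if as_integers:
            # Round half up; Python's built-in round() rounds halves to even.
            value = math.floor(value + 0.5)
        matrix[row["tracking_id"]][sample] = value
    return dict(matrix)


# Hypothetical rows standing in for a real tracking file.
example = (
    "tracking_id\tcondition\treplicate\traw_frags\n"
    "HOXA1\tControl\t0\t10.4\n"
    "HOXA1\tControl\t1\t12.6\n"
    "HOXA1\tMutation\t0\t3.5\n"
    "HOXA1\tMutation\t1\t7.0\n"
)
counts = count_matrix(io.StringIO(example), as_integers=True)
```

Leaving `as_integers=False` keeps the fractional values, which is the form you would hand to edgeR or limma/voom; rounding is only there for tools that insist on integers.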

ADD REPLY • link 8.7 years ago by Devon Ryan 104k

Either that or use something like htseq-count.

ADD REPLY • link updated 19 months ago by Ram 43k • written 8.6 years ago by andrew.j.skelton73 6.5k

Ram · Answer 1 · 2015-09-12

Mucking around with data produced by one suite and putting it into another, unrelated one is a favorite pastime of those who, as they say, "just want to use the tool everyone is using". A hair-raising example was someone telling me how they took FPKM values produced by Cuffdiff and wanted to use DESeq with them, but because these values were too small and non-integer they just ended up multiplying everything by 1000 and then "DESeq worked"... (bioinformatics, man: everything is possible, probably published as well)

My advice: if you can't use the Cuffdiff pipeline, use something else that takes your specifics into account, and don't try to make it work by rescaling after the fact. Your rescaling will very likely be all wrong.