Question

Strange pattern in Cufflinks output

10.0 years ago

Prasad ★ 1.6k

Hi,

I recently started working on reference-based transcriptome analysis. I have human data (5 samples, ~10M PE reads). I ran TopHat and Cufflinks and got the transcripts for each sample. What I observed is that all the samples have exactly the same number of transcripts (~40k). Just to check, I took another set of human samples (4 samples) and got the same result, but in this case ~43k transcripts. The same thing happened when I tried mouse samples. My concern is that when I worked with plant samples, I got different transcript numbers. In all cases, none of the samples are replicates.

Q: Is this kind of consistency normal? If so, what went wrong in the plant case? (All reference data were downloaded from Ensembl.)

Any suggestions are appreciated.

tophat cufflinks RNA-Seq • 1.7k views

updated 2.6 years ago by Ram 45k • written 10.0 years ago by Prasad ★ 1.6k

When you say the same number of transcripts, are you talking about counts/RPKMs or about actual transcripts (i.e. transcript 1, transcript 2, ...)? Depending on your Cufflinks protocol, I believe this is not something you should be concerned about. If you are using cuffmerge to merge the transcriptomes of each sample before cuffdiff, then I believe you should get a constant number of transcripts.

If all the samples had the same expression values for all transcripts, that would be something to worry about, but I don't think that is what you are saying.

One final note: if you are working with an organism that has a reference annotation, like human, you might consider skipping cufflinks/cuffmerge. To my knowledge, those steps are focused on novel isoform discovery and isoform-level differential expression. If you are just looking at differential expression at the gene level, you should be able to align with tophat (passing the hg19.gtf file) and then run cuffdiff on the accepted_hits.bam files (again including the hg19.gtf file).

I could be wrong on this, though; this is just personal experience.
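The shortcut described above (tophat with the annotation, then cuffdiff directly on the alignments) could be sketched roughly as follows. This is only an illustration: the index name `hg19`, the FASTQ file names, the output directories, and the condition labels are all made-up placeholders, and the helper functions are hypothetical. The script only builds and prints the command lines, it does not run anything.

```python
import shlex


def tophat_cmd(index_base, reads_1, reads_2, gtf, out_dir):
    """Build a tophat command for one paired-end sample.

    -G supplies the reference annotation, -o sets the output directory
    (tophat writes accepted_hits.bam there).
    """
    return ["tophat", "-G", gtf, "-o", out_dir, index_base, reads_1, reads_2]


def cuffdiff_cmd(gtf, bams_by_condition, out_dir):
    """Build a cuffdiff command comparing conditions.

    bams_by_condition maps a condition label to its list of BAM files;
    BAMs belonging to the same condition are joined with commas.
    """
    labels = ",".join(bams_by_condition)
    groups = [",".join(bams) for bams in bams_by_condition.values()]
    return ["cuffdiff", "-o", out_dir, "-L", labels, gtf] + groups


def render(cmd):
    """Render an argument list as a shell-safe string for inspection."""
    return " ".join(shlex.quote(arg) for arg in cmd)


# Hypothetical sample names and paths, for illustration only.
align = tophat_cmd("hg19", "s1_R1.fastq", "s1_R2.fastq", "hg19.gtf", "s1_tophat")
diff = cuffdiff_cmd(
    "hg19.gtf",
    {
        "control": ["s1_tophat/accepted_hits.bam"],
        "treated": ["s2_tophat/accepted_hits.bam"],
    },
    "cuffdiff_out",
)

print(render(align))
print(render(diff))
```

Once the printed commands look right for your data, each list can be passed to `subprocess.run(cmd, check=True)`.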

updated 2.6 years ago by Ram 45k • written 10.0 years ago by mbio.kyle ▴ 380

Thank you. To clarify, I was not talking about RPKM; I mean the actual number of transcripts. I get these identical numbers after cufflinks, not after cuffmerge. I was just curious why this constant pattern appears only with human and mouse (the model organisms I have tested) but not with the other organism. What could be the reason?
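One way to look at the per-sample numbers more closely is to count the distinct `transcript_id` values in each `transcripts.gtf` and, separately, how many of them have a non-zero FPKM. The sketch below assumes the usual Cufflinks GTF layout (a `transcript` feature line carrying `transcript_id` and `FPKM` attributes); the example lines are invented for illustration. If Cufflinks was run with `-G/--GTF`, it quantifies the annotated transcripts rather than assembling new ones, so identical totals with differing expressed counts would be consistent with that. Whether `-G` was used here is an assumption, not something stated in the thread.

```python
import re

# Matches GTF attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def count_transcripts(gtf_lines):
    """Return (total, expressed) counts of distinct transcript IDs.

    Only lines whose feature column is "transcript" are considered.
    "expressed" means the FPKM attribute parses to a value above zero.
    """
    fpkm_by_id = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        try:
            fpkm = float(attrs.get("FPKM", "nan"))
        except ValueError:
            fpkm = float("nan")
        fpkm_by_id[tid] = fpkm
    total = len(fpkm_by_id)
    expressed = sum(1 for value in fpkm_by_id.values() if value > 0)
    return total, expressed


# Invented example lines in Cufflinks-style GTF, for illustration only.
example = [
    'chr1\tCufflinks\ttranscript\t100\t500\t1000\t+\t.\tgene_id "G1"; transcript_id "T1"; FPKM "12.5";\n',
    'chr1\tCufflinks\texon\t100\t500\t1000\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "1";\n',
    'chr1\tCufflinks\ttranscript\t800\t900\t1\t+\t.\tgene_id "G2"; transcript_id "T2"; FPKM "0.0000000000";\n',
]

print(count_transcripts(example))  # (2, 1)
```

For a real file, pass an open file handle instead of the list, e.g. `with open(path) as fh: count_transcripts(fh)`, where `path` points at one sample's `transcripts.gtf`.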

updated 2.6 years ago by Ram 45k • written 10.0 years ago by Prasad ★ 1.6k