Question: cuffcompare: tracking file lists multiple cufflinks transcripts per transfrag
acorella (United States) wrote, 2.6 years ago:

Hi,

I am trying ab initio transcriptome assembly on TopHat-aligned RNA-seq data with Cufflinks. I ran Cufflinks and Cuffcompare without a reference genome to obtain the union of all transcripts.

In the tracking file, I often see many different "CUFF" gene IDs listed for a single transfrag ("TCONS"). In addition, even when the "CUFF" transcript IDs are the same, they have different lengths in different samples. This is a single line of the output:

TCONS_00000015  XLOC_000034 -   .   q1:CUFF.33|CUFF.33.1|100|4.126633|3.879204|4.374062|14.800053|7595  q2:CUFF.39|CUFF.39.1|100|5.006652|4.725912|5.287393|18.373256|7021  q3:CUFF.38|CUFF.38.1|100|17.555915|16.559479|18.552350|58.809951|2233,CUFF.37|CUFF.37.2|100|0.695648|0.507680|0.883617|2.299228|3360    q4:CUFF.48|CUFF.48.1|100|6.568168|5.949411|7.186924|21.635958|2236,CUFF.46|CUFF.46.1|100|1.057953|0.745443|1.370464|3.535247|1455,CUFF.47|CUFF.47.1|100|0.875080|0.701251|1.048909|2.838155|3642    q5:CUFF.46|CUFF.46.1|100|19.644389|18.552479|20.736298|61.979307|2244,CUFF.45|CUFF.45.1|100|1.450014|1.201338|1.698689|4.501860|3101,CUFF.44|CUFF.44.1|100|0.725197|0.474165|0.976229|2.209039|1629 q6:CUFF.45|CUFF.45.1|100|6.836163|6.522581|7.149745|25.820700|7446
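
For readers who want to inspect such a line programmatically, here is a minimal Python sketch that splits it into per-sample entries. It assumes the column layout described in the Cuffcompare manual (transfrag ID, locus ID, reference ID, class code, then one column per sample of the form `qN:gene_id|transcript_id|FMI|FPKM|conf_lo|conf_hi|cov|len`, with multiple entries separated by commas). The function name and the dictionary keys are illustrative choices, not part of any Cufflinks tool.

```python
# Field names are assumed from the Cuffcompare manual's description of the
# per-sample entries in a .tracking file; the key names are illustrative.
FIELDS = ("gene_id", "transcript_id", "fmi", "fpkm",
          "conf_lo", "conf_hi", "cov", "length")


def parse_tracking_line(line):
    """Split one .tracking line into its fixed columns and per-sample entries."""
    cols = line.split()  # tolerate tabs or runs of spaces
    record = {
        "transfrag_id": cols[0],
        "locus_id": cols[1],
        "ref": cols[2],
        "class_code": cols[3],
        "samples": {},
    }
    for col in cols[4:]:
        if col == "-":  # sample has no matching transfrag
            continue
        sample, _, payload = col.partition(":")
        record["samples"][sample] = [
            dict(zip(FIELDS, entry.split("|")))
            for entry in payload.split(",")
        ]
    return record


# Abbreviated from the line above: only the q3 column is kept.
example = (
    "TCONS_00000015\tXLOC_000034\t-\t.\t"
    "q3:CUFF.38|CUFF.38.1|100|17.555915|16.559479|18.552350|58.809951|2233,"
    "CUFF.37|CUFF.37.2|100|0.695648|0.507680|0.883617|2.299228|3360"
)
rec = parse_tracking_line(example)
for sample, entries in rec["samples"].items():
    for e in entries:
        print(sample, e["transcript_id"], e["fpkm"], e["length"])
```

Values are kept as strings here; converting `fpkm` or `length` to numbers is left out so that a "-" in any field does not raise an error.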

My questions are:

Why are CUFF.33, CUFF.39, CUFF.38, CUFF.48, etc. all listed under the same transfrag (TCONS_00000015)?

Why does CUFF.46.1 have length 1455 in q4 but 2244 in q5? Are they different isoforms? Or is this the total bp mapping to that transcript in each sample?

If a transcript length is indicated by "-", what does this mean?

I appreciate any help!

rna-seq cuffcompare • 833 views

Did you run Cufflinks on multiple samples and then merge the assemblies using Cuffmerge? If so, this can happen. Cuffmerge merges the assemblies and gives you a comprehensive transcriptome. The output you have shown here indicates which genes were merged to create one final representative transcript, TCONS_00000015. Details about your samples would help in understanding the whole scenario.

~C.

written 2.5 years ago by Chirag Parsania