Cufflinks - Assigning Transcripts Or Exons
11.7 years ago
AW ▴ 350

I have mapped paired-end reads to my reference genome using TopHat and then obtained FPKM values through Cufflinks. I ran Cufflinks without a GTF file, as I am doubtful about the quality of the available annotation.

I was wondering how transcripts and exons were assigned in the transcripts.gtf output? Without a gtf file how are exons grouped together into transcripts? (see below)

Additionally, these transcripts are present in the genes.fpkm_tracking file but without length or coverage information. What does this mean about these transcripts?

gi|321228270|ref|NW_003456354.1|    Cufflinks    transcript    47231    73196    808    -    .    gene_id "CUFF.7"; transcript_id "CUFF.7.1"; FPKM "10.0383916498"; frac "0.494670"; conf_lo "9.300440"; conf_hi "10.776343"; cov "33.488707";
gi|321228270|ref|NW_003456354.1|    Cufflinks    exon    47231    48414    808    -    .    gene_id "CUFF.7"; transcript_id "CUFF.7.1"; exon_number "1"; FPKM "10.0383916498"; frac "0.494670"; conf_lo "9.300440"; conf_hi "10.776343"; cov "33.488707";
gi|321228270|ref|NW_003456354.1|    Cufflinks    exon    48894    49044    808    -    .    gene_id "CUFF.7"; transcript_id "CUFF.7.1"; exon_number "2"; FPKM "10.0383916498"; frac "0.494670"; conf_lo "9.300440"; conf_hi "10.776343"; cov "33.488707";
gi|321228270|ref|NW_003456354.1|    Cufflinks    exon    49688    49882    808    -    .    gene_id "CUFF.7"; transcript_id "CUFF.7.1"; exon_number "3"; FPKM "10.0383916498"; frac "0.494670"; conf_lo "9.300440"; conf_hi "10.776343"; cov "33.488707";
gi|321228270|ref|NW_003456354.1|    Cufflinks    exon    50723    50852    808    -    .    gene_id "CUFF.7"; transcript_id "CUFF.7.1"; exon_number "4"; FPKM "10.0383916498"; frac "0.494670"; conf_lo "9.300440"; conf_hi "10.776343"; cov "33.488707";

Any help would be much appreciated,

Thanks

cufflinks rna-seq fpkm • 5.1k views
11.7 years ago

The first line represents a transcript, and the next four are exons (look at column 3). The fact that the entries have the same transcript_id attribute tells you that these exons belong to the transcript represented by the first line. Without a GTF file, Cufflinks tries to infer where the exons start and end and how to map exons to transcripts.
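To make the grouping concrete, here is a minimal sketch that collects exon records by their transcript_id attribute. It assumes the file is tab-delimited (as GTF normally is) and uses the records from the question with the attribute column abbreviated; the function name `exons_by_transcript` is just for illustration and is not part of Cufflinks.

```python
import re
from collections import defaultdict

# Records from the question, attribute column abbreviated; columns are tab-separated.
SEQ = "gi|321228270|ref|NW_003456354.1|"
ATTRS = 'gene_id "CUFF.7"; transcript_id "CUFF.7.1";'
GTF_TEXT = "\n".join([
    f"{SEQ}\tCufflinks\ttranscript\t47231\t73196\t808\t-\t.\t{ATTRS}",
    f'{SEQ}\tCufflinks\texon\t47231\t48414\t808\t-\t.\t{ATTRS} exon_number "1";',
    f'{SEQ}\tCufflinks\texon\t48894\t49044\t808\t-\t.\t{ATTRS} exon_number "2";',
    f'{SEQ}\tCufflinks\texon\t49688\t49882\t808\t-\t.\t{ATTRS} exon_number "3";',
    f'{SEQ}\tCufflinks\texon\t50723\t50852\t808\t-\t.\t{ATTRS} exon_number "4";',
])

# Matches key "value"; pairs in the ninth (attribute) column.
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def exons_by_transcript(gtf_text):
    """Return {transcript_id: [(start, end), ...]} built from exon records only."""
    groups = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Column 3 (index 2) is the feature type; skip transcript records.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        groups[attrs["transcript_id"]].append((int(fields[3]), int(fields[4])))
    for exons in groups.values():
        exons.sort()
    return dict(groups)


groups = exons_by_transcript(GTF_TEXT)
for transcript_id, exons in groups.items():
    print(transcript_id, exons)
```

For a real file you would pass the contents of transcripts.gtf instead of the inline text.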

11.7 years ago
JC 13k

First, without a GTF file, Cufflinks tries to infer the transcripts present in the sample using a coverage search and local assembly. Reads with splice evidence and the pairing information from paired-end reads are used to infer the exons.

The length and coverage information for each transcript is written to the isoforms.fpkm_tracking file.
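If it helps, this is a rough sketch of pulling those two columns out of isoforms.fpkm_tracking. It assumes the file is tab-delimited with a header row containing `tracking_id`, `length` and `coverage` columns, so check the header of your own file first. The inline row is a made-up placeholder (the length in particular is not taken from the post), and `length_and_coverage` is a hypothetical helper name.

```python
import csv
import io

# Placeholder content standing in for isoforms.fpkm_tracking.
# The column names are an assumption; the length value is invented for illustration.
SAMPLE = (
    "tracking_id\tgene_id\tlength\tcoverage\tFPKM\n"
    "CUFF.7.1\tCUFF.7\t2000\t33.488707\t10.0383916498\n"
)


def length_and_coverage(handle):
    """Map each tracking_id to a (length, coverage) tuple."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["tracking_id"]: (int(row["length"]), float(row["coverage"]))
        for row in reader
    }


table = length_and_coverage(io.StringIO(SAMPLE))
for tracking_id, (length, coverage) in table.items():
    print(tracking_id, length, coverage)
```

With a real run you would open the file instead, e.g. `with open("isoforms.fpkm_tracking") as handle: table = length_and_coverage(handle)`.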
