gffread cannot get the gene symbol name for each transcript
8.0 years ago
super ▴ 60

Hi All

I ran Cufflinks on my TopHat output and then used gffread to extract the transcript sequences:

cufflinks -p 8 -u -g reference_genome_genes.gtf  -o outdir accepted_hits.bam
cd outdir
gffread -w transcripts.fa -g reference_genome.fa transcripts.gtf

This gives me transcripts.fa, whose headers look like this:

>CUFF.5657.1 gene=CUFF.5657
>ENSGALT00000015891 gene=CUFF.5841
>CUFF.5841.1 gene=CUFF.5841
>CUFF.5844.1 gene=CUFF.5844
>CUFF.5848.1 gene=CUFF.5848
>CUFF.5841.3 gene=CUFF.5841
>CUFF.5851.1 gene=CUFF.5851
>ENSGALT00000015914 gene=CUFF.5729
>ENSGALT00000015903 gene=ENSGALG00000009778
>ENSGALT00000015896 gene=ENSGALG00000009775

If I open transcripts.gtf, the format is:

    $ head transcripts.gtf
    10      Cufflinks       transcript      13726   13952   1000    -       .       gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "2.7667603544"; frac "1.000000"; conf_lo "1.287865"; conf_hi "4.245656"; cov "15.974601"; full_read_support "yes";
    10      Cufflinks       exon    13726   13952   1000    -       .       gene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1"; FPKM "2.7667603544"; frac "1.000000"; conf_lo "1.287865"; conf_hi "4.245656"; cov "15.974601";
    10      Cufflinks       transcript      14653   14823   1000    +       .       gene_id "CUFF.2"; transcript_id "CUFF.2.1"; FPKM "11.2588291433"; frac "1.000000"; conf_lo "7.992090"; conf_hi "14.525568"; cov "71.347614"; full_read_support "yes";

However, many of the identifiers are CUFF.* IDs, and where gene= is not a CUFF.* ID it is an Ensembl gene ID rather than a gene symbol. I know we could use BioMart to translate an Ensembl ID (e.g. ENSGALG00000009778) into a gene symbol.
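To put a number on how many headers carry CUFF.* IDs versus Ensembl IDs, a small tally can help. This is only a sketch: the three headers are copied from the excerpt above, and `classify` is a hypothetical helper name, not part of gffread or Cufflinks.

```python
import re
from collections import Counter

# Header lines copied from the transcripts.fa excerpt above.
headers = [
    ">CUFF.5657.1 gene=CUFF.5657",
    ">ENSGALT00000015891 gene=CUFF.5841",
    ">ENSGALT00000015903 gene=ENSGALG00000009778",
]

# Matches ">transcript_id gene=gene_id" as written in the excerpt.
HEADER_RE = re.compile(r"^>(\S+)\s+gene=(\S+)")


def classify(header):
    """Label a header by whether each ID is a CUFF.* ID or a reference ID."""
    match = HEADER_RE.match(header)
    if match is None:
        return "unparsed"
    transcript_id, gene_id = match.groups()
    t_kind = "cuff" if transcript_id.startswith("CUFF.") else "ref"
    g_kind = "cuff" if gene_id.startswith("CUFF.") else "ref"
    return f"transcript={t_kind},gene={g_kind}"


counts = Counter(classify(h) for h in headers)
for kind, n in sorted(counts.items()):
    print(kind, n)
```

For the real file, the list could be filled from the lines of transcripts.fa that start with `>`.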

(1) Is there another option that puts the gene symbol (e.g. MX2, RSAD2) directly into gene=?

(2) What are the CUFF.* IDs? Are they novel genes/transcripts found by Cufflinks?
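For question (1), one possibility that avoids BioMart might be to post-process the FASTA headers using the reference annotation itself. The sketch below assumes the reference GTF carries `gene_name` attributes (Ensembl GTFs usually do for named genes); it does not change what gffread writes. The function names are hypothetical, and `SYMBOL_A` in the demo line is a placeholder, not the real symbol for that gene.

```python
import re

# Matches key "value" pairs in the GTF attribute column.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')
# Matches ">transcript_id gene=gene_id" headers as in the excerpt above.
HEADER_RE = re.compile(r"^>(\S+)\s+gene=(\S+)")


def symbols_from_gtf_lines(lines):
    """Map gene_id and transcript_id to gene_name from reference GTF lines."""
    symbols = {}
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        name = attrs.get("gene_name")
        if not name:
            continue  # assumption: records without gene_name are skipped
        for key in ("gene_id", "transcript_id"):
            if key in attrs:
                symbols[attrs[key]] = name
    return symbols


def relabel_header(header, symbols):
    """Replace the gene= value with a symbol when one is known."""
    match = HEADER_RE.match(header)
    if match is None:
        return header
    transcript_id, gene_id = match.groups()
    # Try the transcript ID first: in the excerpt, a reference transcript
    # can be listed under a CUFF.* gene ID.
    symbol = symbols.get(transcript_id) or symbols.get(gene_id) or gene_id
    return f">{transcript_id} gene={symbol}"


# Hypothetical reference GTF line; SYMBOL_A is a placeholder symbol.
demo_gtf = [
    '1\tensembl\ttranscript\t100\t200\t.\t+\t.\t'
    'gene_id "ENSGALG00000009778"; transcript_id "ENSGALT00000015903"; '
    'gene_name "SYMBOL_A";\n'
]
symbols = symbols_from_gtf_lines(demo_gtf)
print(relabel_header(">ENSGALT00000015903 gene=ENSGALG00000009778", symbols))
print(relabel_header(">CUFF.5657.1 gene=CUFF.5657", symbols))
```

With real data, `symbols_from_gtf_lines` could be given an open file handle on reference_genome_genes.gtf. Headers whose transcript and gene IDs are both CUFF.* have nothing to look up in the reference and are left unchanged.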

Thanks!

RNA-Seq cufflinks genome transcriptome

8.0 years ago
super ▴ 60

Does anybody know the answer? Thank you.
