Question: Find Protein Id From Gff File For Results Of Cufflinks
5.9 years ago, ferris.us20 wrote:

I have a transcripts.gtf file from Cufflinks and a GFF file from JGI. How can I find the protein ID in the GFF file for each transcript in transcripts.gtf?

id gff cufflinks protein • 2.5k views
modified 4.2 years ago by Biostar ♦♦ 20 • written 5.9 years ago by ferris.us20

Are you saying that the transcript ID appears in both files and you want to know how to match them? It would help to see an example line and example IDs from each file.

written 5.9 years ago by Neilfws 48k
5.9 years ago, Alex Reynolds 29k (Seattle, WA USA) wrote:

BEDOPS gtf2bed, gff2bed and bedmap could perhaps help, provided the GTF and GFF inputs follow their specifications:

$ gtf2bed < transcripts.gtf > transcripts.bed
$ gff2bed < proteinIds.gff > proteinIds.bed
$ bedmap --echo --echo-map-id-uniq transcripts.bed proteinIds.bed > answer.bed

The file answer.bed will contain the transcript elements from the GTF file, each followed by a semicolon-delimited list of unique protein IDs from the GFF file, taken from the GFF elements that overlap the Cufflinks-sourced transcript by one or more bases.
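
If a plain two-column table is easier to work with than answer.bed, the bedmap output could be reduced with a small awk step. This is only a sketch under some assumptions: summarize_answer is a hypothetical helper name, each line of answer.bed is assumed to be the echoed BED element followed by "|" and the ID list (the default bedmap delimiter), and the GTF attributes, including transcript_id "...", are assumed to be carried through into the echoed element by gtf2bed. Rows that share a transcript_id (for example the exon rows Cufflinks writes alongside each transcript row) are merged into one line.

```shell
# Hypothetical helper: collapse bedmap output into "transcript_id <TAB> IDs".
# Assumes each input line looks like: <echoed BED fields>|<id1;id2;...>
# and that the echoed fields still contain: transcript_id "SOME_ID"
summarize_answer() {
    awk -F'|' '
    NF > 1 && match($0, /transcript_id "[^"]+"/) {
        # strip the leading: transcript_id "   (15 chars) and the closing quote
        tid = substr($0, RSTART + 15, RLENGTH - 16)
        if (!(tid in seen_tid)) { seen_tid[tid] = 1; order[++n] = tid }
        # the text after the last "|" is the semicolon-delimited ID list
        k = split($NF, ids, ";")
        for (i = 1; i <= k; i++) {
            if (ids[i] == "" || ((tid, ids[i]) in seen)) continue
            seen[tid, ids[i]] = 1
            if (tid in list) list[tid] = list[tid] ";" ids[i]
            else list[tid] = ids[i]
        }
    }
    END {
        # print transcripts in first-seen order; NA marks "no overlapping element"
        for (j = 1; j <= n; j++) {
            t = order[j]
            if (t in list) print t "\t" list[t]
            else print t "\tNA"
        }
    }' "$1"
}
```

After sourcing the function, something like summarize_answer answer.bed > transcript_to_protein.tsv would write one line per transcript. Note that the IDs listed are whatever gff2bed placed in the ID column of proteinIds.bed, so it is worth checking that this column really holds the protein identifier for the JGI file at hand.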

modified 5.9 years ago • written 5.9 years ago by Alex Reynolds 29k