I have a lot of RNA-Seq reads from numerous tissues of a bird species I'm working on. Unfortunately, there is no well-annotated genome for this bird. To get around this, we assembled the RNA-Seq reads and used BLAT to align the assembled transcripts to the FASTA-formatted genome (again, not well annotated). I then converted the resulting PSL files to GFF files using blat2gff.
I've also mapped the RNA-Seq reads with TopHat, without supplying a GTF file, and generated a number of GTF files from the TopHat/Cufflinks run (one for each sample I have).
So, to summarize: I have GTF files from a TopHat run and GFF files from a BLAT alignment of assembled RNA-Seq reads.
My question is: how do I figure out which transcript generated by TopHat/Cufflinks in the GTF file corresponds to which transcript described in the GFF file?
What would be a good first step?
Thanks in advance!
Wyatt
Thanks for the help, Istvan! I've tried IRanges previously and - while I did click the button and get an answer - I'm not sure what happened.
Bedtools seems to be easy to use, so I'll look into that and report back what I find!
Thanks,
Wyatt
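
For anyone following along, here is a minimal sketch in plain Python of the kind of overlap matching being asked about: it groups exon intervals by transcript in each file and reports pairs that share bases on the same scaffold and strand. Several things here are assumptions rather than facts about my files: that both files use the feature type "exon", that the GTF attribute is `transcript_id`, and that the blat2gff output carries an `ID=` style attribute. The function names are hypothetical. An interval tool such as bedtools does this far more efficiently; this is only meant to show the idea.

```python
import re
from collections import defaultdict


def parse_transcripts(lines, id_key, feature_type="exon"):
    """Group (start, end) intervals by (transcript id, scaffold, strand).

    Handles both GTF-style (key "value";) and GFF3-style (key=value;)
    attributes. The feature type and attribute key are assumptions.
    """
    pattern = re.compile(rf'\b{re.escape(id_key)}[ =]"?([^";]+)"?')
    transcripts = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        match = pattern.search(fields[8])
        if match is None:
            continue
        key = (match.group(1), fields[0], fields[6])
        transcripts[key].append((int(fields[3]), int(fields[4])))
    return transcripts


def shared_bases(exons_a, exons_b):
    """Count bases shared by two exon sets (1-based, inclusive coordinates).

    Assumes the exons within a single transcript do not overlap each other.
    """
    total = 0
    for start_a, end_a in exons_a:
        for start_b, end_b in exons_b:
            total += max(0, min(end_a, end_b) - max(start_a, start_b) + 1)
    return total


def match_transcripts(gtf_lines, gff_lines, gtf_key="transcript_id", gff_key="ID"):
    """Return (gtf_id, gff_id, shared_bases) for every overlapping pair."""
    cuff = parse_transcripts(gtf_lines, gtf_key)
    blat = parse_transcripts(gff_lines, gff_key)
    matches = []
    for (cuff_id, chrom_a, strand_a), exons_a in cuff.items():
        for (blat_id, chrom_b, strand_b), exons_b in blat.items():
            # Only compare transcripts on the same scaffold and strand.
            if chrom_a != chrom_b or strand_a != strand_b:
                continue
            overlap = shared_bases(exons_a, exons_b)
            if overlap > 0:
                matches.append((cuff_id, blat_id, overlap))
    return matches
```

To try it, pass the lines of one GTF file and one GFF file, for example `match_transcripts(open("sample1.gtf"), open("assembly.gff"))` (both file names are placeholders). The all-against-all comparison is quadratic, so it is only practical for small inputs. Unstranded transcripts (strand ".") will only match other unstranded records under this strict strand check.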