Hi, I am new to RNA-seq and am using Trinity genome-guided assembly. I could really use some help and would appreciate it a lot.
My pipeline is the following:
1) Map my raw FASTQ reads to mm10 using STAR.
2) Feed the alignment file that STAR generated to Trinity, using genome-guided assembly.
3) Trinity outputs assembled transcripts like this:
>TRINITY_GG_1_c0_g1_i1 len=456 path=[0:0-455]
CTTCAGACTCAGTTTTTGCTTGTTTCAACTGTCCCGTATACACATCAACATGGTATCTCACCAATGGAAAAA
CAGGCTCTCCTTCTTTCATTACAGGAAGCTCACAGACAATGTCTCCATCAGCCTGGTTCCGAGAAAGACA
CACATTTGCAACAAAATGTAGGGTCTTCTTGCTCTTCACGTTTTCCATTGTCACCCTCTGTAAGGTCCACT
CTGGTTGCCCACCAGTTCCATCATGTCCTATTCTGATCTTGTATATCTCTCCAATGCCTCTTAGTACAACCT
GAAATTCATCTGTCTGTCCTGGAAGGAAGAGCTTTTCTTGGCTGTCCTTGGTAAGGCTGATTGGTCCAGT
GACACCTTCATATCCATACACCCACAATGTGACATTGGCCTGAGTACCTGTGTTT
CCAGTCACCACTAAGACCTTCCATTTTTCTTCTAACAGAAGTGTCT
4) Then I use GMAP to map the assembled transcripts back to the reference genome, which gave me:
Alignments: Alignment for path 1:
+chr1:4147901-4147963 (1-63) 100% <- ...648... 0.994, 0.859
+chr1:4148612-4148744 (64-196) 100% <- ...15110... 0.999, 0.984
+chr1:4163855-4163941 (197-283) 100% <- ...6263... 0.996, 0.999
+chr1:4170205-4170377 (284-456) 100%
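In case it helps, the exon blocks in the GMAP summary above can be pulled out as plain coordinates with a few lines of Python. This is only a minimal sketch that assumes every exon line looks like the ones shown here. The names `parse_gmap_blocks`, `BLOCK_RE` and `GMAP_TEXT` are made up for illustration and are not part of GMAP.

```python
import re

# Matches one exon line of the GMAP alignment summary shown above, e.g.
#   +chr1:4147901-4147963 (1-63) 100% <- ...648... 0.994, 0.859
# Assumption: strand sign, chromosome, genomic range, query range, identity.
BLOCK_RE = re.compile(
    r"^\s*([+-])([^:\s]+):(\d+)-(\d+)\s+\((\d+)-(\d+)\)\s+(\d+)%"
)

def parse_gmap_blocks(text):
    """Return one dict per exon line; lines that do not match are skipped."""
    blocks = []
    for line in text.splitlines():
        m = BLOCK_RE.match(line)
        if m is None:
            continue  # e.g. the "Alignments:" header line
        strand, chrom = m.group(1), m.group(2)
        g1, g2, q1, q2, ident = (int(x) for x in m.groups()[2:])
        blocks.append({
            "chrom": chrom,
            "strand": strand,
            # Normalize so start <= end, in case a line lists them reversed.
            "start": min(g1, g2),
            "end": max(g1, g2),
            "qstart": q1,
            "qend": q2,
            "identity": ident,
        })
    return blocks

# The GMAP output from this post, copied verbatim.
GMAP_TEXT = """\
Alignments: Alignment for path 1:
+chr1:4147901-4147963 (1-63) 100% <- ...648... 0.994, 0.859
+chr1:4148612-4148744 (64-196) 100% <- ...15110... 0.999, 0.984
+chr1:4163855-4163941 (197-283) 100% <- ...6263... 0.996, 0.999
+chr1:4170205-4170377 (284-456) 100%
"""

if __name__ == "__main__":
    for b in parse_gmap_blocks(GMAP_TEXT):
        print(b["chrom"], b["start"], b["end"], b["strand"])
```

The four blocks cover query positions 1-456, which matches the `len=456` in the Trinity header, so the whole transcript aligns as four exons on chr1.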
My question: GMAP did not tell me which transcript this is, only the genomic coordinates. My goal is to find novel transcripts that are not present in the reference GTF file. Is there a tool that can annotate my assembled transcripts against the reference, so that I can then investigate whichever ones are left over (those that do not match any reference transcript)?
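To make the goal concrete, here is a rough sketch of the kind of check I have in mind: compare the exon blocks of an assembled transcript against the exon lines of a GTF and report which annotated transcripts they overlap. Everything named here (`read_gtf_exons`, `overlapping_transcripts`, `EXAMPLE_GTF`, `TX_X`, `TX_Y`) is hypothetical, and the two GTF records are made-up placeholders, not real mm10 annotation. The check also ignores strand and splice junctions, so it is a simplification rather than a proper annotation method.

```python
def read_gtf_exons(gtf_text):
    """Collect (chrom, start, end, transcript_id) for every exon line of a GTF."""
    exons = []
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        transcript_id = "NA"
        for attr in fields[8].split(";"):
            key, _, value = attr.strip().partition(" ")
            if key == "transcript_id":
                transcript_id = value.strip().strip('"')
        # GTF coordinates are 1-based and inclusive.
        exons.append((fields[0], int(fields[3]), int(fields[4]), transcript_id))
    return exons

def overlapping_transcripts(blocks, exons):
    """Return the set of annotated transcript IDs that any block overlaps."""
    hits = set()
    for chrom, start, end in blocks:
        for e_chrom, e_start, e_end, tid in exons:
            if chrom == e_chrom and start <= e_end and e_start <= end:
                hits.add(tid)
    return hits

# Exon blocks of TRINITY_GG_1_c0_g1_i1, taken from the GMAP output in this post.
MY_BLOCKS = [
    ("chr1", 4147901, 4147963),
    ("chr1", 4148612, 4148744),
    ("chr1", 4163855, 4163941),
    ("chr1", 4170205, 4170377),
]

# MADE-UP GTF records, for illustration only (not real mm10 annotation).
EXAMPLE_GTF = "\n".join([
    "\t".join(["chr1", "made_up", "exon", "4147850", "4147963", ".", "+", ".",
               'gene_id "GENE_X"; transcript_id "TX_X";']),
    "\t".join(["chr1", "made_up", "exon", "5000000", "5000100", ".", "+", ".",
               'gene_id "GENE_Y"; transcript_id "TX_Y";']),
])

if __name__ == "__main__":
    hits = overlapping_transcripts(MY_BLOCKS, read_gtf_exons(EXAMPLE_GTF))
    if hits:
        print("overlaps annotated transcript(s):", sorted(hits))
    else:
        print("no overlap with annotation: candidate novel transcript")
```

Transcripts for which this returns an empty set would be the ones I want to look at more closely.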
This is my first time posting a question here, so I apologize in advance if anything is inappropriate.