Trinity assembly file
4.9 years ago

Dear all,

I have run into two problems while working with a Trinity-assembled FASTA file. After the Trinity assembly I got a FASTA file, but to obtain unigenes, do I need to run "cd-hit" or "get_longest_isoform_seq_per_trinity_gene.pl" on it? I am not sure which one I should use, or whether I need both. From what I found online, the principles behind the two methods do not seem very different.

My other question: I tried running "get_longest_isoform_seq_per_trinity_gene.pl" on the FASTA file, and the output sequences have very long IDs (the text after ">"), like the one below. Is there a script or tool that can trim these and keep only ">TRINITY_DN123217_c2_g1_i6"? Thanks in advance for any suggestions!

>TRINITY_DN123217_c2_g1_i6 len=17269 path=[17247:0-35 17260:36-164 17389:165-272 17497:273-768 17993:769-968 41567:969-992 18217:993-1157 18382:1158-1163 18388:1164-1172 41570:1173-1196 18421:1197-2360 19585:2361-2423 19648:2424-2428 19653:2429-3104 20329:3105-3164 20389:3165-3318 20543:3319-3334 20559:3335-3412 20637:3413-4064 21289:4065-4999 41566:5000-5023 22248:5024-5562 41571:5563-5586 22811:5587-6002 23227:6003-6249 23474:6250-6266 23491:6267-6587 23812:6588-6730 23955:6731-7264 24489:7265-7270 24495:7271-7453 24678:7454-8207 25432:8208-8318 25543:8319-8353 25578:8354-8449 25674:8450-8695 25920:8696-9492 26717:9493-9509 41568:9510-9533 26758:9534-9614 26839:9615-10127 27352:10128-10255 27480:10256-10577 27802:10578-10599 27824:10600-10977 41569:10978-11001 28226:11002-11010 28235:11011-11074 28299:11075-12083 29308:12084-13818 31043:13819-13911 31136:13912-14015 31240:14016-14632 31857:14633-14648 31873:14649-15575 32800:15576-15608 32833:15609-15609 32834:15610-15724 32949:15725-15752 32977:15753-16182 33407:16183-16199 33424:16200-17268] [-1, 17247, 17260, 17389, 17497, 17993, 41567, 18217, 18382, 18388, 41570, 18421, 19585, 19648, 19653, 20329, 20389, 20543, 20559, 20637, 21289, 41566, 22248, 41571, 22811, 23227, 23474, 23491, 23812, 23955, 24489, 24495, 24678, 25432, 25543, 25578, 25674, 25920, 26717, 41568, 26758, 26839, 27352, 27480, 27802, 27824, 41569, 28226, 28235, 28299, 29308, 31043, 31136, 31240, 31857, 31873, 32800, 32833, 32834, 32949, 32977, 33407, 33424, -2]
RNA-Seq Assembly sequencing • 813 views

For the second question:

awk '{ print $1; }' Trinity.fasta
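
The one-liner above prints the first whitespace-separated field of every line to the terminal. A slightly more defensive sketch is shown below: it trims only the header lines and writes the result to a new file. The sample records, the temporary directory, and the output file name are made up for illustration; substitute your own Trinity.fasta.

```shell
# Hypothetical two-record sample standing in for Trinity.fasta (made-up sequences).
workdir=$(mktemp -d)
cat > "$workdir/sample.fasta" <<'EOF'
>TRINITY_DN1_c0_g1_i1 len=12 path=[0:0-11] [-1, 0, -2]
ACGTACGTACGT
>TRINITY_DN1_c0_g2_i1 len=8 path=[5:0-7] [-1, 5, -2]
GGCCTTAA
EOF

# Trim only header lines (those starting with ">") to their first field;
# sequence lines are passed through unchanged.
awk '/^>/ { print $1; next } { print }' "$workdir/sample.fasta" \
    > "$workdir/sample.short_ids.fasta"

# Show the trimmed headers as a quick check.
grep '^>' "$workdir/sample.short_ids.fasta"
```

Restricting the edit to header lines should make no difference for a normal FASTA file, since sequence lines contain no spaces, but it avoids truncating a sequence line if one ever does.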

It worked! Thank you!
