Question: Processing of assembled transcriptome
Dear all

I am running some post-processing steps on a transcriptome assembled with Trinity, with the aim of getting the best-quality transcriptome possible.

I screened for vector contamination with VecScreen, removed redundancy with CD-HIT and BlastClust, and removed chimeras with CD-HIT-DUP and VSEARCH.

Now I am interested in scaffolding the fragmented transcripts, and this is where my doubts arise. What is the difference between combining overlapping transcript contigs into single longer contigs and scaffolding? Are they the same thing? Which tools can I use to obtain better transcripts and address the fragmentation problem?

In the FRAMA paper they used TGICL to combine overlapping transcripts and MAFFT for scaffolding, but as far as I can see these two programs are very different.

Thank you so much


From the remaining isoforms, check how many full-length transcripts you have.

I did so and found around 2000 transcripts that have a hit with less than 50%.

What can I conclude from that result?

Should I discard those transcripts?
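
For anyone wanting to reproduce this kind of count: a minimal sketch, assuming the "less than 50%" refers to how much of the matched reference sequence the alignment covers (the post does not say, so this is a guess). It also assumes a tab-separated hit table with the columns qseqid, sseqid, sstart, send, slen, bitscore in that order. The function names `subject_coverage` and `below_threshold` are made up for illustration, and only the best-scoring alignment per transcript is considered.

```python
def subject_coverage(rows):
    """Map each query ID to (subject ID, % of subject length covered).

    Assumed columns per tab-separated row (hypothetical layout):
    qseqid, sseqid, sstart, send, slen, bitscore.
    Only the highest-bitscore alignment per query is kept.
    """
    best = {}  # qseqid -> (bitscore, sseqid, coverage)
    for line in rows:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid, sstart, send, slen, bitscore = line.split("\t")[:6]
        # Alignment span on the subject, in residues (coordinates are 1-based, inclusive)
        span = abs(int(send) - int(sstart)) + 1
        coverage = 100.0 * span / int(slen)
        score = float(bitscore)
        if qseqid not in best or score > best[qseqid][0]:
            best[qseqid] = (score, sseqid, coverage)
    return {q: (s, cov) for q, (_, s, cov) in best.items()}


def below_threshold(coverage_by_query, min_percent=50.0):
    """Return the sorted query IDs whose best hit covers less than min_percent."""
    return sorted(q for q, (_, cov) in coverage_by_query.items() if cov < min_percent)


if __name__ == "__main__":
    # Toy rows, not real data
    example = [
        "t1\tP1\t1\t100\t100\t200",
        "t2\tP2\t11\t50\t100\t80",
        "t2\tP3\t1\t10\t100\t20",
    ]
    cov = subject_coverage(example)
    print(below_threshold(cov))
```

A transcript falling below the threshold here only means its best alignment spans a small part of the reference; it says nothing by itself about whether the transcript is wrong, so treat the list as candidates to inspect rather than to delete.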

