Question: What is the fate of low-coverage regions during Trinity assembly of transcriptome data?
MAPK wrote:

Hi all, I just submitted a manuscript reporting metatranscriptomic analysis results based on an SRA study, in which I reported some sequences from a Trinity assembly. The reviewers commented that I failed to report some additional contigs: one of them found contigs that I could not find in my Trinity assembly. I then aligned the paired-end FASTQ reads to the contigs the reviewer found, and fewer than 300 reads aligned. So although those contigs are absent from my Trinity assembly, a small number of reads do align to them. I am therefore wondering whether Trinity failed to report those contigs because of their low coverage. Does Trinity drop low-coverage contigs during de novo assembly? Could anyone help me understand why Trinity did not report them, so that I can write my rebuttal?

I think it's indeed likely that Trinity drops low-coverage contigs during assembly because it treats them as 'noise'.
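
To illustrate the idea (this is a toy sketch, not Trinity's actual code): k-mer based assemblers typically count k-mers across all reads and can ignore those seen fewer than some minimum number of times. Trinity exposes a `--min_kmer_cov` option for this kind of threshold, so it is worth checking which value your run used. The reads, `k`, and `min_cov` below are made-up values for demonstration only.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_cov):
    """Keep only the k-mers observed at least min_cov times."""
    return {kmer for kmer, n in counts.items() if n >= min_cov}


# Hypothetical toy data: one well-covered sequence and one seen only once.
reads = ["ACGTACGT"] * 5 + ["TTGCAAGC"]
counts = count_kmers(reads, k=5)
kept = solid_kmers(counts, min_cov=2)

# k-mers from the single-copy read never reach the threshold,
# so nothing downstream could be assembled from them.
print(sorted(kept))
```

With a threshold of 1 every k-mer is kept, so a coverage threshold alone may not explain the missing contigs; fragmentation into pieces shorter than the minimum contig length is another possibility to check.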

I wonder though, where the reviewer got those contigs from? Did he assemble the data himself?

Yes, the reviewer assembled the data, probably using a different assembly tool.

Some suggestions:

  • Ask him about his methods; otherwise you can't say much about the differences found.

  • Extract the reads that map to those contigs and map them to your assembly.

  • When you map the reads used to assemble the transcriptome back to your assembly, what is the mapping rate? Is it 95% or above?
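
The last two suggestions can be sketched in Python, assuming the alignments are available as uncompressed SAM text (for BAM you would convert first). The function names, the toy records, and the contig names `contig_A` and `contig_B` are hypothetical; only the FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification. Note that each mate of a pair is its own record, so the rate here is per read, not per pair.

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def parse_sam(lines):
    """Yield (read_name, flag, contig) per alignment record, skipping headers."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[1]), fields[2]


def mapping_rate(lines):
    """Fraction of primary records that are mapped."""
    total = mapped = 0
    for _name, flag, _contig in parse_sam(lines):
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0


def reads_on_contigs(lines, contigs):
    """Names of reads with any alignment (secondary included) to the given contigs."""
    names = set()
    for name, flag, contig in parse_sam(lines):
        if not flag & UNMAPPED and contig in contigs:
            names.add(name)
    return names


# Hypothetical toy SAM records, for illustration only.
sam = [
    "@SQ\tSN:contig_A\tLN:1000",
    "r1\t0\tcontig_A\t10\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r2\t16\tcontig_B\t5\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r1\t256\tcontig_B\t50\t0\t8M\t*\t0\t0\t*\t*",
]

rate = mapping_rate(sam)
names = reads_on_contigs(sam, {"contig_B"})
print(f"mapping rate: {rate:.1%}")
print(sorted(names))
```

The read names returned by `reads_on_contigs` could then be used to pull those reads out of the FASTQ files and align them against your own Trinity assembly, to see whether they land on existing contigs or remain unmapped.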

