How To Find The Total Number Of Reads Used By Trinity Assembler To Perform Assembly
10.1 years ago
bambus0725 ▴ 50


I am working on metatranscriptomics data. I want to assemble the data (in FASTA format), and for that I have chosen the de novo transcriptome assembler "Trinity", followed by mapping the reads with "Bowtie2".

I used the following parameters with Trinity

perl \
  --seqType fa \
  --JM 10G \
  --single sample.fasta \
  --SS_lib_type F \
  --output trinity_output.sam \
  --CPU 10 \
  --min_contig_length 20 \
  --inchworm_cpu 10 \
  --max_reads_per_graph 100000 \
  --bflyCPU 10 \
  --min_per_id_same_path 97

with Bowtie2

bowtie2 \
  -f \
  -U cluster.fasta \
  -L 20 \
  -a \
  -t \
  --un unpaired.fasta \
  --met-file metrics.txt \
  -p 10 \
  --reorder \
  >output.sam \
  -x reference \
  -N 1 \
  -L 20

I am not sure where the problem lies, whether in running Trinity (with the parameters chosen) or in Bowtie2, because after the assembly and mapping steps only 30% of the reads could be aligned.

It would be a great help if anyone could point out the reason for this and help me out with the solution.

Thank you in advance!


Aside from that bowtie2 command not appearing to be valid, you're also assembling with one set and aligning with another. If these are both random subsets of the original batch of sequences then this would seem to be fine, but otherwise might be the source of the problem.
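A quick way to check how the two read sets relate is to count the records in each FASTA file. The sketch below assumes ordinary FASTA input where every record starts with a `>` header line; `count_fasta_reads` is a hypothetical helper name, not part of Trinity or Bowtie2.

```shell
# Count FASTA records by counting header lines (those starting with '>').
# count_fasta_reads is a hypothetical helper, not a Trinity/Bowtie2 command.
count_fasta_reads() {
  grep -c '^>' "$1"
}
```

Running `count_fasta_reads sample.fasta` and `count_fasta_reads cluster.fasta` would show how many reads went into the assembly versus how many were mapped afterwards.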


Thank you for the reply dpryan.

Yes, I used the appropriate file to align with. I mean that I used the output file generated by Trinity as the reference (after indexing) in Bowtie2 and aligned the set of sequences (i.e. cluster.fasta) to it. cluster.fasta is the original file, before clustering the data with CD-Hit_EST; the clustered output was named "sample.fasta" and is what was used for assembling.

I hope I made that clear. Could you suggest what should be changed in the Bowtie2 command?


Yeah, that makes sense then. Perhaps you didn't subsample enough of the reads with CD-Hit_Est (I've never used it). Regardless, a more normally formatted (and likely to work) bowtie2 command would be:

bowtie2 -f -N 1 -L 20 -a -t --un unpaired.fa --met-file metrics.txt -p 10 --reorder -x reference -U cluster.fa -S output.sam

Okay. So do you mean the order of the arguments is important? I still got the same result, though.


Your original command wouldn't have actually worked at all, so I can only assume that it's not what you actually typed. Aside from that, bowtie2 is only mapping so few reads because only that percentage is mappable (at least to individual contigs). My guess would be that the assembly just isn't that good. Perhaps try a different assembler or just try assembling the unmapped reads. You might also try mapping with bwa mem to see if a number of the remaining reads map across contigs.
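When comparing mappers or assemblies, it can help to compute the mapped fraction directly from the SAM output, since with `-a` a single read can appear on many lines. The following is a minimal sketch, assuming single-end reads (as with `-U`) and a standard tab-separated SAM file; `sam_mapped_fraction` is a hypothetical helper name. bowtie2 also prints its own alignment summary to stderr at the end of a run.

```shell
# Report how many distinct reads have at least one alignment in a SAM file.
# Header lines start with '@'; column 1 is the read name, column 2 the bitwise
# FLAG, and FLAG bit 0x4 set means the record is unmapped.
# Assumes single-end reads; sam_mapped_fraction is a hypothetical helper.
sam_mapped_fraction() {
  awk -F'\t' '
    /^@/ { next }                              # skip header lines
    { seen[$1] = 1 }                           # every read name encountered
    int($2 / 4) % 2 == 0 { mapped[$1] = 1 }    # bit 0x4 clear: aligned record
    END {
      total = 0; hit = 0
      for (r in seen) total++
      for (r in mapped) hit++
      if (total > 0) printf "%d/%d (%.1f%%)\n", hit, total, 100 * hit / total
      else print "0/0"
    }
  ' "$1"
}
```

For example, `sam_mapped_fraction output.sam` prints the mapped and total read counts with the percentage, which makes it easier to compare a bowtie2 run against a bwa mem run on the same reads.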

