Question

How To Find The Total Number Of Reads Used By Trinity Assembler To Perform Assembly

10.6 years ago

bambus0725 ▴ 50

Hello,

I am working on metatranscriptomic data. I want to assemble the reads (in FASTA format) with the de novo transcriptome assembler Trinity and then map the reads back to the assembly with Bowtie2.

I used the following parameters with Trinity:

perl Trinity.pl \
  --seqType fa \
  --JM 10G \
  --single sample.fasta \
  --SS_lib_type F \
  --output trinity_output.sam \
  --CPU 10 \
  --min_contig_length 20 \
  --inchworm_cpu 10 \
  --max_reads_per_graph 100000 \
  --bflyCPU 10 \
  --min_per_id_same_path 97

and the following with Bowtie2:

bowtie2 \
  -f \
  -U cluster.fasta \
  -L 20 \
  -a \
  -t \
  --un unpaired.fasta \
  --met-file metrics.txt \
  -p 10 \
  --reorder \
  >output.sam \
  -x reference \
  -N 1 \
  -L 20

I am not sure whether the problem lies in the Trinity run (with the parameters chosen) or in the Bowtie2 run, because after assembly and mapping only 30% of the reads aligned.

It would be a great help if anyone could point out the reason for this and suggest a solution.

Thank you in advance!

trinity RNA-seq • 5.4k views

ADD COMMENT • link updated 18 months ago by Ram 44k • written 10.6 years ago by bambus0725 ▴ 50

Aside from that bowtie2 command not appearing to be valid, you're also assembling with one set (sample.fasta) and aligning with another (cluster.fasta). If these are both random subsets of the original batch of sequences then this would seem to be fine, but otherwise it might be the source of the problem.
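A quick way to see how different the two read sets are is to count the records in each FASTA file. The snippet below is only a sketch: the helper name `count_fasta_reads` is made up for illustration, and it assumes every read has exactly one header line starting with `>`.

```shell
# Hypothetical helper: count FASTA records by counting header lines.
# Assumes each read has exactly one header line beginning with '>'.
count_fasta_reads() {
  grep -c '^>' "$1"
}

# Example usage with the file names from the question:
#   count_fasta_reads sample.fasta    # reads given to Trinity
#   count_fasta_reads cluster.fasta   # reads given to bowtie2
```

If the two counts differ a lot, the reads being mapped are not the same set the assembly was built from, which could by itself lower the alignment rate.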

ADD REPLY • link 10.6 years ago by Devon Ryan 104k

Thank you for the reply, dpryan.

Yes, I used the appropriate files. The output file generated by Trinity served as the reference (after indexing) in Bowtie2, and I aligned cluster.fasta to it. cluster.fasta is the original set of sequences, before clustering with CD-HIT-EST; the clustered output is named "sample.fasta" and is the file used for assembly.

I hope that is clear. Could you suggest what should be changed in the Bowtie2 command?

ADD REPLY • link 10.6 years ago by bambus0725 ▴ 50

Yeah, that makes sense then. Perhaps you didn't subsample enough of the reads with CD-HIT-EST (I've never used it). Regardless, a more conventionally formatted (and more likely to work) bowtie2 command would be:

bowtie2 -f -N 1 -L 20 -a -t --un unpaired.fa --met-file metrics.txt -p 10 --reorder -x reference -U cluster.fa -S output.sam

ADD REPLY • link 10.6 years ago by Devon Ryan 104k

Okay. So do you mean that the order of the arguments is important? I still got the same result, though.

ADD REPLY • link 10.6 years ago by bambus0725 ▴ 50

Your original command wouldn't have actually worked at all, so I can only assume that it's not what you actually typed. Aside from that, bowtie2 is mapping so few reads because only that percentage is mappable (at least to individual contigs). My guess would be that the assembly just isn't that good. Perhaps try a different assembler, or just try assembling the unmapped reads. You might also try mapping with bwa mem to see whether a number of the remaining reads map across contigs.
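To double-check the mapped fraction directly from the SAM output, something like the following could be used. It is a rough sketch rather than a standard tool: `sam_mapping_rate` is a made-up name, and it counts each read name once because `-a` makes bowtie2 report every alignment it finds, so a single read can appear on many lines. A read is treated as mapped if at least one of its records has the unmapped flag bit (0x4) cleared.

```shell
# Hypothetical helper: print "mapped<TAB>total<TAB>percent" for a SAM file,
# counting each read name once regardless of how many alignments it has.
sam_mapping_rate() {
  awk -F '\t' '
    /^@/ { next }                             # skip SAM header lines
    { total[$1] = 1 }                         # remember every read name
    int($2 / 4) % 2 == 0 { mapped[$1] = 1 }   # flag bit 0x4 unset: aligned
    END {
      t = 0; for (r in total) t++
      m = 0; for (r in mapped) m++
      printf "%d\t%d\t%.1f\n", m, t, (t ? 100 * m / t : 0)
    }' "$1"
}

# Example usage with the file name from the thread:
#   sam_mapping_rate output.sam
```

The total printed here should match the number of reads in cluster.fasta, and the percentage should be close to the overall alignment rate bowtie2 reports at the end of its run.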

ADD REPLY • link updated 18 months ago by Ram 44k • written 10.6 years ago by Devon Ryan 104k