Question: Very high alignment rate with bowtie2
9 months ago, TrentGenomics wrote:

Hello,

My alignment rates with bowtie2 are ~95%, using paired-end reads on a de novo Trinity assembly. What I typically see in the literature is bowtie2 alignment rates of 70-85%, which are considered good.

What is the reason for the rates being so high? I know that it is because most of my reads are represented in the final set of contigs. I'm just trying to justify why my rates are high compared to examples from the literature.

Any info would be great! Thanks.

rna-seq alignment • 535 views
modified 9 months ago • written 9 months ago by TrentGenomics

I assume the Trinity assembly was not assembled from the reads you are mapping?

written 9 months ago by Damian Kao

The reads that are mapping back at high rates are the reads that I used to generate my Trinity assembly.

written 9 months ago by TrentGenomics

Then it is no surprise that they map well to a merged version of themselves.

written 9 months ago by Macspider

I'm not sure that that is the full answer, though. If you do the assembly incorrectly, you may only get rates of around 60%-70% re-alignment. It is actually difficult to do de novo assembly and get high re-alignment rates with Illumina reads, so, kudos to michbrown!

I typically get around 90%.

written 9 months ago by Kevin Blighe

I do agree, Macspider, but I've seen many examples where assembly quality checks by way of read representation do not exceed the range of 70-85%.

modified 9 months ago • written 9 months ago by TrentGenomics

It is possible to achieve high alignments by using very high QC thresholds during read trimming and base-quality checks, i.e., prior to alignment. For example, if you specify that all reads must be >70bp in length and have base qualities >30 at the read ends, you can be pretty sure that you'll achieve upward of 99% alignment if the reference genome is good.

So, my question would be whether or not you did some rigorous QC checks on your reads prior to assembly?
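
To make the thresholds above concrete, here is a minimal Python sketch of a filter that keeps only reads longer than 70 bp whose end bases all have quality above 30. It is an illustration only: the function names, the 5-base `end_window`, and the Phred+33 encoding are assumptions, not something stated in this thread. Note also that trimming tools typically cut low-quality ends off rather than discard whole reads, so this is not a reimplementation of any particular tool.

```python
import io
from typing import Iterator, TextIO, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def passes_qc(seq: str, qual: str, min_len: int = 70, min_end_q: int = 30,
              end_window: int = 5, offset: int = 33) -> bool:
    """True if the read is longer than min_len and every base in the first
    and last end_window positions has quality above min_end_q.

    end_window and offset (Phred+33) are assumptions for this sketch.
    """
    if len(seq) <= min_len:
        return False
    ends = qual[:end_window] + qual[-end_window:]
    return all(ord(c) - offset > min_end_q for c in ends)


# Tiny in-memory example: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
example = (
    "@r1\n" + "A" * 80 + "\n+\n" + "I" * 80 + "\n"          # long, good ends
    "@r2\n" + "A" * 50 + "\n+\n" + "I" * 50 + "\n"          # too short
    "@r3\n" + "A" * 80 + "\n+\n" + "I" * 78 + "##" + "\n"   # poor 3' end
)
kept = [h for h, s, q in read_fastq(io.StringIO(example)) if passes_qc(s, q)]
print(kept)
```

Only the first read survives here; the second is too short and the third has low-quality bases at its 3' end.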

written 9 months ago by Kevin Blighe

Yes, that makes sense. I did use a >30 phred quality score when I used Trim Galore! Thanks for the explanation!

modified 9 months ago • written 9 months ago by TrentGenomics

No problem - I also use Trim Galore! I think that 30 is a reasonable cut-off to use, but some people go as low as 20 for this parameter. Illumina reads have known quality issues at the read ends based on how the fragments are sequenced in the instrument.

Another way to 'manufacture' a high alignment is to only include matched mate-pairs prior to alignment, and to throw out any lone mates that have no match.
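
As a rough sketch of what "only include matched mates" means in practice, the Python below keeps a read only if its partner is present in the other file. The helper names are hypothetical, and the read-naming convention (an optional `/1` or `/2` suffix, or a space followed by extra fields) is an assumption; real data may need a different key.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def mate_key(header: str) -> str:
    """Reduce a FASTQ header to a name shared by both mates.

    Assumes names look like '@read7/1' or '@read7 1:N:0:ACGT'.
    """
    name = header.split()[0].lstrip("@")
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def keep_matched_pairs(r1_records: Iterable[Record],
                       r2_records: Iterable[Record]
                       ) -> List[Tuple[Record, Record]]:
    """Return (mate1, mate2) tuples for reads present in both inputs,
    dropping any lone mate. Order follows the first input."""
    r2_by_name = {mate_key(rec[0]): rec for rec in r2_records}
    pairs = []
    for rec in r1_records:
        mate = r2_by_name.get(mate_key(rec[0]))
        if mate is not None:
            pairs.append((rec, mate))
    return pairs


# Example: read 'b' lost its mate 2, read 'c' lost its mate 1.
r1 = [("@a/1", "ACGT", "IIII"), ("@b/1", "GGCC", "IIII")]
r2 = [("@a/2", "TTAA", "IIII"), ("@c/2", "CCGG", "IIII")]
for m1, m2 in keep_matched_pairs(r1, r2):
    print(m1[0], m2[0])
```

This holds the second file in memory, which is fine for a sketch but not for a full sequencing run.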

I tend to conduct a 'raw' alignment with the raw FASTQ and then a secondary alignment with the QC'd FASTQ, compute resources and time permitting, of course.

modified 9 months ago • written 9 months ago by Kevin Blighe

9 months ago, TrentGenomics wrote:
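
For comparing the two runs, a small Python sketch like the following can pull the overall rate out of the summary that bowtie2 prints at the end of a run, which finishes with a line of the form `NN.NN% overall alignment rate`. The function name is hypothetical, and the two summary strings are made-up placeholder values for illustration, not results from this thread; in practice you would read the saved log of each run.

```python
import re


def overall_alignment_rate(summary_text: str) -> float:
    """Extract the percentage from a bowtie2 'overall alignment rate' line."""
    match = re.search(r"([\d.]+)% overall alignment rate", summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


# Placeholder summaries (invented numbers, last line of the log only).
RAW_SUMMARY = "88.10% overall alignment rate\n"
TRIMMED_SUMMARY = "95.30% overall alignment rate\n"

raw_rate = overall_alignment_rate(RAW_SUMMARY)
trimmed_rate = overall_alignment_rate(TRIMMED_SUMMARY)
print(f"raw: {raw_rate:.2f}%  trimmed: {trimmed_rate:.2f}%  "
      f"difference: {trimmed_rate - raw_rate:.2f} points")
```

Keeping both numbers side by side makes it easier to say how much of a high alignment rate comes from read QC rather than from the assembly itself.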

These are just the tips I was looking for. I'm going to play around with the raw and trimmed reads with bowtie2 and have a look at the alignment stats.

Thanks again, Kevin! I appreciate it.

written 9 months ago by TrentGenomics