Question: Very high alignment rate with bowtie2
10 weeks ago, TrentGenomics (20) wrote:

Hello,

My alignment rates with bowtie2 are ~95%, using paired-end reads on a de novo Trinity assembly. What I typically see in the literature are bowtie2 alignment rates of anywhere from 70-85%, which is considered good.

What is the reason for the rates being so high? I know that it is because most of my reads are represented in the final set of contigs. I'm just trying to justify why my rates are high compared to examples from the literature.

Any info would be great! Thanks.

rna-seq alignment • 252 views

I assume the Trinity assembly was not assembled from the reads you are mapping?

written 10 weeks ago by Damian Kao (14k)

The reads that are mapping back at high rates are the reads that I used to generate my Trinity assembly.

written 10 weeks ago by TrentGenomics (20)

Then no surprise they map well on a merged version of themselves.

written 10 weeks ago by Macspider (1.6k)

I'm not sure that is the full answer, though. If the assembly is done incorrectly, you may only get re-alignment rates of around 60-70%. It is actually difficult to do a de novo assembly and get high re-alignment rates with Illumina reads, so kudos to michbrown!

I typically get around 90%.

written 10 weeks ago by Kevin Blighe (7.3k)

I do agree, Macspider, but I've seen many examples where assembly quality checks by way of read representation do not exceed the range of 70-85%.

modified 10 weeks ago • written 10 weeks ago by TrentGenomics (20)

It is possible to achieve high alignment rates by using very strict QC thresholds during read trimming and base-quality checks, i.e., prior to alignment. For example, if you specify that all reads must be >70 bp in length and have base qualities >30 at the read ends, you can be pretty sure that you'll achieve upward of 99% alignment if the reference genome is good.

So, my question would be whether or not you did some rigorous QC checks on your reads prior to assembly?
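To make the thresholds above concrete, here is a minimal Python sketch of that kind of filter, assuming Phred+33 encoded FASTQ. The function names and the `end_window` parameter (how many bases count as "the read ends") are my own assumptions for illustration. Trim Galore itself trims low-quality ends rather than discarding whole reads, so this only shows the idea and is not what the tool does internally.

```python
def phred33(qual):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(c) - 33 for c in qual]


def passes_qc(seq, qual, min_len=70, min_end_q=30, end_window=5):
    """Return True if the read is longer than min_len and every base in the
    first and last end_window positions has quality above min_end_q.

    end_window is a hypothetical parameter chosen for this sketch.
    """
    if len(seq) <= min_len:
        return False
    scores = phred33(qual)
    ends = scores[:end_window] + scores[-end_window:]
    return all(q > min_end_q for q in ends)


def read_fastq(lines):
    """Yield (header, seq, qual) tuples from an iterable of FASTQ lines,
    assuming the simple four-lines-per-record layout."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(it).rstrip("\n")
        next(it)  # the '+' separator line
        qual = next(it).rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def filter_reads(lines, **thresholds):
    """Keep only the records that pass the QC thresholds."""
    return [rec for rec in read_fastq(lines)
            if passes_qc(rec[1], rec[2], **thresholds)]
```

Reads that survive a filter this strict are, almost by construction, the ones most likely to align, which is why the alignment rate goes up.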

written 10 weeks ago by Kevin Blighe (7.3k)

Yes, that makes sense. I did use a >30 Phred quality score when I used Trim Galore! Thanks for the explanation!

modified 10 weeks ago • written 10 weeks ago by TrentGenomics (20)

No problem - I also use Trim Galore! I think that 30 is a reasonable cut-off to use, but some people go as low as 20 for this parameter. Illumina reads have known quality issues at the read ends based on how the fragments are sequenced in the instrument.

Another way to 'manufacture' a high alignment is to only include matched mate-pairs prior to alignment, and to throw out any lone mates that have no match.
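As a rough sketch of what "only include matched mate-pairs" means in practice, the Python below keeps a read only when its mate is also present. It assumes records are `(header, seq, qual)` tuples and that mates share a read name, optionally with a trailing `/1` or `/2`. The helper names are hypothetical, and real pipelines usually let the trimming tool handle this.

```python
def base_id(header):
    """Reduce a FASTQ header to the read name shared by both mates:
    drop the leading '@', any description after whitespace,
    and a trailing /1 or /2 suffix."""
    name = header.lstrip("@").split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def keep_matched_pairs(r1_records, r2_records):
    """Return (kept_r1, kept_r2), containing only reads whose mate is present.

    Output follows the order of r1_records, so both lists stay in sync,
    which paired-end aligners expect.
    """
    r2_by_id = {base_id(rec[0]): rec for rec in r2_records}
    kept_r1, kept_r2 = [], []
    for rec in r1_records:
        mate = r2_by_id.get(base_id(rec[0]))
        if mate is not None:
            kept_r1.append(rec)
            kept_r2.append(mate)
    return kept_r1, kept_r2
```

Lone mates dropped this way never reach the aligner, so they cannot count against the reported rate.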

I tend to conduct a 'raw' alignment with the raw FASTQ and then a secondary alignment with the QC'd FASTQ, compute resources and time permitting, of course.
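To compare the two runs, one option is to pull the final "overall alignment rate" line out of the summary that bowtie2 writes to stderr. This is a minimal sketch that assumes you have saved each summary as text; the function names are my own.

```python
import re


def overall_alignment_rate(summary_text):
    """Extract the percentage from the 'overall alignment rate' line of a
    bowtie2 alignment summary (the text bowtie2 prints to stderr)."""
    match = re.search(r"([\d.]+)% overall alignment rate", summary_text)
    if match is None:
        raise ValueError("no 'overall alignment rate' line found")
    return float(match.group(1))


def rate_gain(raw_summary, trimmed_summary):
    """Difference in overall alignment rate (trimmed minus raw),
    in percentage points."""
    return (overall_alignment_rate(trimmed_summary)
            - overall_alignment_rate(raw_summary))
```

A positive `rate_gain` shows how much of the final alignment rate is attributable to QC rather than to the assembly itself.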

modified 10 weeks ago • written 10 weeks ago by Kevin Blighe (7.3k)
10 weeks ago, TrentGenomics (20) wrote:

These are just the tips I was looking for. I'm going to play around with the raw and trimmed reads with bowtie2 and have a look at the alignment stats.

Thanks again, Kevin! I appreciate it.
