Question: Why is an alignment invalid if a start/end coordinate is contained within the other read?
4 weeks ago by
weishang0
China/Shanghai/Shanghai Ocean University
weishang0 wrote:

The Trim Galore manual says: Trims 1 bp off every read from its 3' end. This may be needed for FastQ files that are to be aligned as paired-end data with Bowtie. This is because Bowtie (1) regards alignments like this:

R1 --------------------------->

R2 <---------------------------

or this:

R1 ----------------------->

R2 <-----------------

as invalid (whenever a start/end coordinate is contained within the other read).

But I still don't understand why an alignment is invalid if a start/end coordinate is contained within the other read.

Could someone explain this to me?

Thanks a lot.


I'd say that by default you would expect your paired-end reads to have a fairly large distance between them. An alignment where the two reads overlap may mean that the reads come from a repetitive region: the real distance between the reads is large, but because both reads contain the same repeated sequence, the aligner maps them close to each other. However, an insert distance below 0 is not unusual for exome sequencing. I don't think I have ever run into this problem using bwa-mem.

written 4 weeks ago by kuckunniwid580
4 weeks ago by
h.mon28k
Brazil
h.mon28k wrote:

The title asks how to understand Trim_Galore --trim1 parameter, but apparently the real question is "why alignment is invalid if a start/end coordinate is contained within the other read?"

I don't know why bowtie considers an alignment invalid if a start/end coordinate of one read is contained within the start/end coordinates of the other read. It could be a design decision (this would be my guess, probably to avoid funky reads - read more below), or it could be some limitation imposed by the Bowtie algorithm. The fact is that the bowtie manual clearly states this:

Paired-end alignments where one mate's alignment is entirely contained within the other's are considered invalid.
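To make the rule concrete, here is a minimal sketch of a containment check on two mate alignments, and of what trimming 1 bp from each 3' end does to the first diagram in the question (two equal-length reads covering exactly the same positions). This is only an illustration of the manual's sentence, not bowtie's actual code: the `Mate` type, the function names, and the use of inclusive coordinates are all assumptions made for the example.

```python
from typing import NamedTuple


class Mate(NamedTuple):
    """Hypothetical representation of one aligned mate (not a bowtie type)."""
    start: int      # leftmost reference position, inclusive
    end: int        # rightmost reference position, inclusive
    forward: bool   # True if the read aligns to the forward strand


def contains(outer: Mate, inner: Mate) -> bool:
    """True if inner lies entirely within outer on the reference."""
    return outer.start <= inner.start and inner.end <= outer.end


def is_contained_pair(m1: Mate, m2: Mate) -> bool:
    """True if either mate is entirely contained within the other."""
    return contains(m1, m2) or contains(m2, m1)


def trim_3prime(mate: Mate, n: int = 1) -> Mate:
    """Remove n bases from the 3' end of the read.

    For a forward-strand read the 3' end is the right end of the
    alignment; for a reverse-strand read it is the left end.
    """
    if mate.forward:
        return mate._replace(end=mate.end - n)
    return mate._replace(start=mate.start + n)


# First diagram: R1 and R2 have the same length and overlap completely.
r1 = Mate(start=100, end=149, forward=True)
r2 = Mate(start=100, end=149, forward=False)
print(is_contained_pair(r1, r2))                            # True

# After trimming 1 bp from each 3' end, neither mate contains the other.
print(is_contained_pair(trim_3prime(r1), trim_3prime(r2)))  # False
```

Under these assumptions, the trimmed R1 covers 100-148 and the trimmed R2 covers 101-149, so each read sticks out past the other by one base and the pair is no longer a containment case.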

edit: comments in the bowtie source code, in the file aligner.h, hint that it is a design decision:

    // Set begin/end to be a range of all reference
    // positions that are legally permitted to be involved in
    // the alignment of the outstanding mate.
    //
    // Note that one of the constraints imposed on which positions
    // go into this range is that the opposite mate cannot be
    // contained entirely within the anchor mate, or vice versa.

and:

        // We can also add a bit more if qlen is less than alen,
        // since we're requiring that opposite not be contained
        // within anchor.