bowtie2 mapping orientation problem
3.2 years ago
Shaway • 0

Hello everyone, sorry for posting this naive sequencing question. I aligned a paired-end DNA sample to hg19 with bowtie2 using the parameters "-q -N 1 -X 2000 --no-mixed --no-discordant". I saw some strange read pairs in my BAM file, like the ones below (I deleted some bases for illustration). Read1 can align to either the forward or the reverse strand, which confuses me.

(1) read1 is forward, chr1:10033-10110

reference    CCCTAACCCAACCCTAACCCAA
read1        CCCTAACCCAACCCTAACCC
read2          CTACCCCAACCCTAACCCTA

(2) read1 is reverse, chr1:10018-10095

reference    TAACCCAACCCTAACC
read1          ACCCAACCCTAACC
read2        TAACCCAACCCTAA

My questions are:

(1) bowtie2 uses --fr by default during alignment. Does this mean read1 should come from the forward strand and read2 from the reverse strand?

(2) How did the above phenomenon occur?

(3) If this is normal, how do I determine the start and end of the DNA fragment? Which end is 5' and which is 3'?

Thanks everyone!

bowtie2 WGS DNA sequencing • 1.2k views

For paired end sequencing you are sequencing from both ends of a DNA fragment. If the fragment size is small, the reads can overlap each other. For example, if your fragment length is 100 bases, and you are sequencing 75 bases from each end, the forward and reverse reads will have 50 bases of overlap in the middle.
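The arithmetic in that example can be checked with a small sketch. The function name `expected_overlap` is made up for illustration, and it assumes both mates have the same length and are read inward from opposite ends of the fragment.

```python
def expected_overlap(fragment_len: int, read_len: int) -> int:
    """Number of fragment bases covered by both mates of a pair.

    Assumes both mates have the same length and are sequenced
    inward from opposite ends of the fragment.
    """
    # A mate cannot cover more bases than the fragment contains.
    covered = min(read_len, fragment_len)
    # Anything the two mates cover beyond the fragment length is shared.
    return max(0, 2 * covered - fragment_len)


# The case from the comment above: 100-base fragment, 75 bases per mate.
print(expected_overlap(100, 75))  # 50
# A long fragment gives no overlap at all.
print(expected_overlap(300, 75))  # 0
```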


Yes, I know this. But why can read1 come from either the forward or the reverse strand? I thought read1 should be the leftmost segment of the DNA fragment.

reference    TAACCCAACCCTAACC
read1          ACCCAACCCTAACC
read2        TAACCCAACCCTAA

In this situation, you mean that the DNA fragment should be "ACCCAACCCTAA". But the adapters have already been removed, so I think the DNA fragment should be "TAACCCAACCCTAACC".

3.2 years ago
h.mon 35k

The reference genome shows only one strand for convenience; the actual DNA molecule has two complementary strands. During library preparation, the forward and reverse adapters can link to either of the two DNA strands, so you will see all combinations of (read 1 / read 2) and (forward strand / reverse strand).

The decision on which strand is used to represent the genome is somewhat arbitrary, but for organisms with good cytological and genetic maps, the reference chromosomes are oriented so that they are colinear with these maps and run in the same direction.
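To see which combination a given pair falls into, you can decode the SAM FLAG field. The sketch below is only an illustration: the bit values come from the SAM specification, but the helper names `describe` and `fragment_span` are made up here, and the FLAG/TLEN values in the example are assumed rather than taken from the poster's BAM file (only the coordinates chr1:10033-10110 and chr1:10018-10095 come from the question). It also assumes TLEN runs from the leftmost to the rightmost mapped base of the pair.

```python
# FLAG bits as defined in the SAM specification.
REVERSE = 0x10  # this record aligned to the reverse strand
READ1 = 0x40    # first mate of the pair
READ2 = 0x80    # second mate of the pair


def describe(flag: int) -> tuple[str, str]:
    """Return (mate, strand) for a SAM FLAG value."""
    if flag & READ1:
        mate = "read1"
    elif flag & READ2:
        mate = "read2"
    else:
        mate = "unpaired"
    strand = "-" if flag & REVERSE else "+"
    return mate, strand


def fragment_span(pos: int, tlen: int) -> tuple[int, int]:
    """1-based inclusive fragment span, taken from the leftmost mate.

    Assumes TLEN covers the leftmost to rightmost mapped base.
    """
    if tlen <= 0:
        raise ValueError("use the leftmost mate of the pair (TLEN > 0)")
    return pos, pos + tlen - 1


# Hypothetical records (flag, POS, TLEN) matching the two cases in the question.
case1_left = (99, 10033, 78)   # read1 forward, leftmost
case2_left = (163, 10018, 78)  # read2 forward, leftmost; its read1 would carry flag 83

for flag, pos, tlen in (case1_left, case2_left):
    print(describe(flag), fragment_span(pos, tlen))
```

With --fr, bowtie2 accepts a pair when the upstream mate is on the forward strand and the downstream mate is reverse-complemented, whichever of the two is read1. In both cases the fragment runs from the leftmost base of the forward-strand mate to the rightmost base of the reverse-strand mate, and each mate's 5' end sits at the outer edge of the fragment.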


Thank you very much!
