Question

Paired-end read mapping principle

4.0 years ago

mohammadborooshaki ▴ 10

Hi. Sorry, I am quite new to read mapping and alignment, and I have a basic question. I know what paired-end reads are, but how do alignment tools actually align them? I would like to understand the principle of paired-end read mapping. Thanks.

alignment • 3.8k views

Thank you, Istvan, for your answer. I was a bit confused about the alignments. So in paired-end alignment we need to align both reads to the reference genome. I need to implement an alignment tool that handles paired-end reads. I have the read files (forward and reverse) and the reference genome. Could you please briefly explain what steps I should take? It is unclear to me how to start. I was going to try a suffix tree. Is that practical for this kind of search?

4.0 years ago by mohammadborooshaki ▴ 10

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. This comment belongs under @Istvan's answer.

SUBMIT ANSWER is for new answers to original question.

4.0 years ago by GenoMax 141k

Consult the SAM specification:

https://samtools.github.io/hts-specs/SAMv1.pdf

Having a paired-end alignment means filling in columns 7, 8 and 9 (RNEXT, PNEXT, TLEN). For single-end alignments these columns do not contain relevant data.
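To make the column numbering concrete, here is a minimal Python sketch that splits a SAM alignment line into its 11 mandatory columns and pulls out the three mate-related ones. The helper names (`parse_sam_line`, `mate_columns`) and the two example records are hypothetical, made up for illustration; only the column names and order come from the SAM specification.

```python
# Names and order of the 11 mandatory SAM columns (from the SAM specification).
SAM_COLUMNS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
               "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INTEGER_COLUMNS = ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN")


def parse_sam_line(line):
    """Split one tab-separated SAM alignment line into a dict of columns."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_COLUMNS, fields[:11]))  # optional tags are ignored
    for name in INTEGER_COLUMNS:
        record[name] = int(record[name])
    return record


def mate_columns(record):
    """Return columns 7, 8 and 9: (RNEXT, PNEXT, TLEN)."""
    return record["RNEXT"], record["PNEXT"], record["TLEN"]


# Hypothetical example records, not taken from a real data set.
bases, quals = "A" * 50, "I" * 50
single_end_line = "\t".join(
    ["read1", "0", "chr1", "100", "60", "50M", "*", "0", "0", bases, quals])
paired_end_line = "\t".join(
    ["read1", "99", "chr1", "100", "60", "50M", "=", "300", "250", bases, quals])

print(mate_columns(parse_sam_line(single_end_line)))  # placeholders only
print(mate_columns(parse_sam_line(paired_end_line)))  # mate on same reference
```

In the single-end record the three columns hold the placeholder values `*`, `0` and `0`; in the paired-end record `=` means the mate is on the same reference sequence.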

(Terminology is also important; see the SAM spec above. What you have are not forward and reverse reads but first in pair and second in pair.)

As general advice, don't start with paired-end reads. First generate the single-end SAM file; once that works, you can fill in the paired information fairly easily (relatively speaking, of course, compared to the task of generating the single-end alignment). For example, the value of the POS column of a read is the same as the value of the PNEXT column of its mate.
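As a rough illustration of that last point, the sketch below copies mate information between two single-end records of the same pair. It assumes both reads are mapped and that each record is a plain dict keyed by SAM column name; the function name `add_mate_info` and the example values are hypothetical. The FLAG bit values are the ones defined in the SAM specification.

```python
# FLAG bits as defined in the SAM specification.
PAIRED = 0x1
REVERSE = 0x10
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def add_mate_info(read, mate, is_first):
    """Return a copy of `read` with RNEXT, PNEXT and pairing FLAG bits set.

    Assumes both `read` and `mate` are mapped (unmapped mates need extra
    FLAG handling that this sketch leaves out).
    """
    out = dict(read)
    # "=" is the SAM shorthand for "same reference sequence as RNAME".
    out["RNEXT"] = "=" if mate["RNAME"] == read["RNAME"] else mate["RNAME"]
    # PNEXT of a read is simply POS of its mate.
    out["PNEXT"] = mate["POS"]
    flag = read["FLAG"] | PAIRED
    flag |= FIRST_IN_PAIR if is_first else SECOND_IN_PAIR
    if mate["FLAG"] & REVERSE:
        flag |= MATE_REVERSE
    out["FLAG"] = flag
    return out


# Hypothetical single-end results for the two reads of one fragment.
first = {"QNAME": "frag1", "FLAG": 0, "RNAME": "chr1", "POS": 100,
         "RNEXT": "*", "PNEXT": 0}
second = {"QNAME": "frag1", "FLAG": REVERSE, "RNAME": "chr1", "POS": 300,
          "RNEXT": "*", "PNEXT": 0}

paired_first = add_mate_info(first, second, is_first=True)
paired_second = add_mate_info(second, first, is_first=False)
print(paired_first["PNEXT"], paired_second["PNEXT"])
```

TLEN and the "properly paired" bit (0x2) are deliberately left out here, since they depend on the alignment end coordinates and on what the aligner considers a plausible fragment.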

4.0 years ago by Istvan Albert 100k

score 0 · Answer 1 · 2020-04-20

Paired-end alignment typically means keeping track of and reporting the alignments of both reads in a read pair.

Each read is aligned separately, and the information on both reads is then combined, so that each read's alignment line also reports where its mate aligned.

Basically, this allows other, subsequent analysis tools to infer additional information on the fragment just by reading a single line of the alignment.

In principle, you could take the two reads of a pair, align them in single-end mode, then post-process the two single-end alignments and generate a paired-end file yourself from the information they already contain. But the process would be tedious and wasteful, so it is better to generate the paired-end alignments at the time of the original alignment.
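One piece of such post-processing is the TLEN column, which needs the end coordinate of each alignment and therefore the CIGAR string. The following is only a sketch under stated assumptions: both reads map to the same reference, and the function names are hypothetical. It follows the SAM specification's convention that the leftmost read gets a positive TLEN and the rightmost a negative one; how ties are broken when both reads start at the same position is an arbitrary choice made here.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")
CIGAR_PATTERN = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_end(pos, cigar):
    """1-based position of the last reference base covered by an alignment."""
    span = sum(int(length) for length, op in CIGAR_PATTERN.findall(cigar)
               if op in REF_CONSUMING)
    return pos + span - 1


def template_lengths(pos1, cigar1, pos2, cigar2):
    """Return (TLEN of read 1, TLEN of read 2) for two reads on one reference."""
    leftmost = min(pos1, pos2)
    rightmost = max(reference_end(pos1, cigar1), reference_end(pos2, cigar2))
    length = rightmost - leftmost + 1
    # Assumption: on a tie in start position, read 1 is treated as leftmost.
    if pos1 <= pos2:
        return length, -length
    return -length, length


# Hypothetical positions and CIGAR strings.
print(template_lengths(100, "50M", 300, "50M"))
print(template_lengths(300, "10S40M2D", 100, "50M"))
```

Soft-clipped bases (`S`) and insertions (`I`) do not consume the reference, so they do not extend the end coordinate, while deletions (`D`) do.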