Difference between mate-pair, paired-end and long read
11 months ago
Chvatil ▴ 60

Hello everyone, I am writing here because I have some questions for you.

What are the essential differences between paired-end, mate-pair and long reads?

To me, mate-pair and classic paired-end are both paired-end reads, with the difference that:

1. For classic paired-end, the insert size is small (about 500 bp).
2. For mate-pair, the insert size is much longer (several kb), which makes it possible to join contigs together. Is that right?

Long reads, on the other hand, I imagine are simply reads sequenced with a large read length, but which, unlike mate-pairs, do not allow junctions to be made between contigs?

Hello Chvatil!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/14575/difference-between-paired-end-mate-pair-and-long-read

This is typically not recommended as it runs the risk of annoying people in both communities.

Oh sorry, I did not know that. Someone has already given me an answer there, so I cannot delete it now.

Leave it as is, don’t worry. It is just generally not good practice as it spreads information across two communities.

11 months ago
GenoMax 106k

Your idea of paired-end reads is correct. These are libraries of ~300-500 bp inserts where both ends are sequenced. So paired-end reads sequence inwards (--->______<---).

Mate-pair reads, on the other hand, come from special libraries where the insert is large. Library fragments are circularized so that the two ends come into proximity. This makes a new sequenceable construct. (LINK) In this case you get to sample the ends of that large insert.
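To make the orientation difference concrete, here is a minimal coordinate-only sketch in Python. It assumes a simplified model that is not part of the original answer: reads are tracked as reference intervals with a strand, and the mate-pair construct is modelled as the two fragment ends (each `flank` bp, a hypothetical parameter) joined at the circularization junction and then sequenced like an ordinary paired-end fragment. Adapters, junction labels and chimeric reads are ignored.

```python
from typing import NamedTuple


class Read(NamedTuple):
    start: int   # leftmost reference coordinate (0-based)
    end: int     # exclusive
    strand: str  # "+" or "-"


def paired_end(frag_start, frag_len, read_len):
    """Both ends of a short fragment are read inwards."""
    r1 = Read(frag_start, frag_start + read_len, "+")
    frag_end = frag_start + frag_len
    r2 = Read(frag_end - read_len, frag_end, "-")
    return r1, r2


def mate_pair(frag_start, frag_len, read_len, flank):
    """Simplified model: after circularization the construct is
    [last `flank` bp of the fragment] + [first `flank` bp of the fragment],
    and that construct is sequenced from both ends."""
    frag_end = frag_start + frag_len
    # Read 1 starts at the beginning of the construct (end of the fragment).
    r1 = Read(frag_end - flank, frag_end - flank + read_len, "+")
    # Read 2 is the reverse strand of the construct's tail (start of the fragment).
    r2 = Read(frag_start + flank - read_len, frag_start + flank, "-")
    return r1, r2


def orientation(r1, r2):
    """Describe how the two reads face each other on the reference."""
    left, right = sorted((r1, r2), key=lambda r: r.start)
    if left.strand == "+" and right.strand == "-":
        return "inward"
    if left.strand == "-" and right.strand == "+":
        return "outward"
    return "same strand"


pe = paired_end(frag_start=1000, frag_len=400, read_len=100)
mp = mate_pair(frag_start=1000, frag_len=5000, read_len=100, flank=200)
print(orientation(*pe), orientation(*mp))
```

Under this model the paired-end reads face each other, while the mate-pair reads, once mapped back to the reference, face away from each other and sit several kb apart.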

Long-read libraries are specially made from high-molecular-weight DNA that is handled carefully to avoid breaks, so you end up with "libraries" whose fragments are kilobases (or even megabases) long. Depending on success, a technology like ONT will be able to sequence through that entire stretch of DNA in the 5'-->3' direction.

Thank you for this explanation. So, if I'm not wrong, we can use long reads and mate-pairs to join contigs together (scaffolding), right? I'm asking because when I have assemblies made only from short paired-end reads, I get incorrect assemblies with misassembly breakpoints: positions in the contigs where the left flanking sequence aligns over 1 kb away from the right flanking sequence on the reference, or they overlap by 1 kb, or the flanking sequences align on opposite strands or on different chromosomes.

To correct that, I use the software REAPR.
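The breakpoint criteria described above could be written down roughly as follows. This is only an illustrative sketch and not how any particular tool implements it: the `Alignment` type, the `classify_breakpoint` function, the label strings and the strict `>` comparison against the 1 kb threshold are all assumptions made for the example.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    chrom: str
    start: int   # 0-based reference start
    end: int     # exclusive reference end
    strand: str  # "+" or "-"


def classify_breakpoint(left, right, max_dist=1000):
    """Compare where the left and right flanks of a contig position
    align on the reference and report why they look inconsistent."""
    if left.chrom != right.chrom:
        return "different chromosomes"
    if left.strand != right.strand:
        return "opposite strands"
    # On the minus strand the contig runs right-to-left on the reference,
    # so the left flank is expected at the higher coordinates.
    if left.strand == "+":
        gap = right.start - left.end
    else:
        gap = left.start - right.end
    if gap > max_dist:
        return "flanks too far apart"
    if -gap > max_dist:
        return "flanks overlap too much"
    return "consistent"


left = Alignment("chr1", 100, 600, "+")
print(classify_breakpoint(left, Alignment("chr1", 2000, 2500, "+")))
print(classify_breakpoint(left, Alignment("chr2", 610, 1100, "+")))
```

A position is only flagged when the two flanks disagree with the reference by more than `max_dist`, so small indels between assembly and reference are reported as consistent.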

So tell me if I'm wrong, but for a hybrid genome assembly (short paired-end + mate-pair, or short paired-end + long read), can these breakpoints be explained by the mate-pair or long reads that made the junction between contigs?
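For readers wondering what "making the junction between contigs" means in practice, here is a small arithmetic sketch. It assumes a hypothetical helper `estimate_gap` and a pair already in inward orientation (mate-pair reads are assumed to have been converted to that orientation beforehand), with read 1 on the forward strand of contig A and read 2 on the reverse strand of contig B. The insert size is treated as exact, whereas real libraries have a distribution.

```python
def estimate_gap(insert_size, contig_a_len, read1_start, read2_end):
    """Estimate the distance between contig A and contig B from one read pair.

    read1_start: 0-based start of read 1 on contig A (forward strand).
    read2_end:   exclusive end of read 2 on contig B (reverse strand),
                 i.e. how far the fragment reaches into contig B.
    """
    covered_in_a = contig_a_len - read1_start  # fragment bases inside contig A
    covered_in_b = read2_end                   # fragment bases inside contig B
    return insert_size - covered_in_a - covered_in_b


# A 3 kb mate-pair insert anchored 1 kb from the end of contig A
# and reaching 500 bp into contig B.
print(estimate_gap(insert_size=3000, contig_a_len=10000,
                   read1_start=9000, read2_end=500))
```

A negative result would suggest that the two contigs overlap rather than being separated by a gap. A short paired-end insert can only link contigs whose ends are a few hundred bp apart, which is why larger inserts or long reads are useful for scaffolding.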