Assembly: collapsed reads vs overlapping paired-end reads?
7.6 years ago
palu • 0

Hello all!

Case:

We sequenced three yeast strains (not S. cerevisiae); a reference genome is available.

Illumina MiSeq

around 8.5 million reads in each direction (paired-end)

most reads around 90-120 bp long

high coverage: around 90x

During the trimming/adapter removal step we saw that most of our paired-end reads overlap:

R1 ------------------------->
R2 <-----------------------

99% overlapping paired-end reads, 1% non-overlapping paired-end reads
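
To make "collapsing" concrete, here is a minimal Python sketch of merging one overlapping pair into a single read. It is only an illustration: it requires an exact match in the overlap and ignores base qualities, which real mergers do not. The function names and the `min_overlap` default are assumptions made for this example, not any tool's API.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.translate(table)[::-1]


def merge_pair(r1, r2, min_overlap=10):
    """Collapse a read pair into one sequence if the reads overlap.

    r1 is the forward read and r2 the reverse read as sequenced.
    Returns the merged sequence, or None if no exact overlap of at
    least min_overlap bases is found (toy criterion, no mismatches).
    """
    r2_rc = reverse_complement(r2)  # bring R2 onto the same strand as R1
    longest = min(len(r1), len(r2_rc))
    # Try the longest candidate overlap first, then shorter ones.
    for olap in range(longest, min_overlap - 1, -1):
        if r1[-olap:] == r2_rc[:olap]:
            return r1 + r2_rc[olap:]
    return None


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGTATCGATCGGATCC"  # made-up insert
    r1 = fragment[:24]
    r2 = reverse_complement(fragment[10:])
    print(merge_pair(r1, r2) == fragment)
```

A pair for which this returns None corresponds to the 1% non-overlapping case and would stay as a pair.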

Question:

For the next assembly steps (initial contig building, scaffolding), is it better

-> to use only single-end reads: collapse the overlapping paired-end reads (99%) into single-end reads (since assemblers can have problems with overlapping paired-end reads), use only these single-end reads for the assembly, and discard the remaining 1%?

-> to use only paired-end reads: both the overlapping pairs (99%) and the non-overlapping pairs (1%)?

-> to use a mix of single-end and paired-end reads: the collapsed single-end reads (99%) plus the non-overlapping pairs (1%)?

Thanks

Assembly genome sequencing alignment • 3.1k views

Did you design the library to be overlapping? What insert size did you expect?

7.6 years ago

This depends on the assembler. For SPAdes, for example, I recommend merging the reads with BBMerge, then assembling with both the merged reads and the unmerged pairs.
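
As a rough sketch of that workflow, the Python snippet below only builds and prints the two command lines; it does not run anything. All file names are hypothetical placeholders. The options shown (`in1=`, `in2=`, `out=`, `outu1=`, `outu2=` for `bbmerge.sh`; `--merged`, `-1`, `-2`, `-o` for `spades.py`) follow my reading of the tools' documentation, so please check them against your installed versions.

```python
import shlex


def bbmerge_command(r1, r2, merged, unmerged_r1, unmerged_r2):
    """Build a bbmerge.sh call that writes merged reads and unmerged pairs."""
    return [
        "bbmerge.sh",
        f"in1={r1}",
        f"in2={r2}",
        f"out={merged}",          # reads that could be merged
        f"outu1={unmerged_r1}",   # R1 of pairs that could not be merged
        f"outu2={unmerged_r2}",   # R2 of pairs that could not be merged
    ]


def spades_command(merged, unmerged_r1, unmerged_r2, outdir):
    """Build a spades.py call using merged reads plus the unmerged pairs."""
    return [
        "spades.py",
        "--merged", merged,
        "-1", unmerged_r1,
        "-2", unmerged_r2,
        "-o", outdir,
    ]


if __name__ == "__main__":
    # Hypothetical file names for one strain.
    merge = bbmerge_command(
        "strainA_R1.fq.gz", "strainA_R2.fq.gz",
        "strainA_merged.fq.gz", "strainA_unmerged_R1.fq.gz",
        "strainA_unmerged_R2.fq.gz",
    )
    assemble = spades_command(
        "strainA_merged.fq.gz", "strainA_unmerged_R1.fq.gz",
        "strainA_unmerged_R2.fq.gz", "strainA_spades",
    )
    print(shlex.join(merge))
    print(shlex.join(assemble))
```

The printed lines can be pasted into a shell, or the lists passed to `subprocess.run(cmd, check=True)` once the paths are real.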
