What do Ns represent in a SPAdes assembly from single-end reads?
6.1 years ago
kcamnairb ▴ 40

The fungal genome assembly I made using SPAdes with single-end reads contains stretches of Ns. I would expect this with paired-end reads, but I'm not sure what they represent when the assembly is done with single-end reads. The SPAdes options I used were --iontorrent and --careful. Do these Ns represent ambiguous bubbles in the assembly, or spots where there are mixed bases in the reads?

Thanks, Brian

Assembly spades • 1.8k views

That could result from contig-formation/scaffolding.


Do you have many of them and/or long stretches?


There are 8 stretches of Ns in the assembly. Some of them are up to 512 bp, which is longer than the read length.
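For anyone wanting to check the same thing on their own assembly, here is a minimal sketch of how the N stretches could be located and measured. It assumes a plain FASTA file and uses only the standard library; the helper names (`read_fasta`, `n_runs`, `report`) are made up for illustration and are not part of SPAdes.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n_runs(seq: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return (0-based start, length) for each run of N/n of at least min_len."""
    runs = []
    for match in re.finditer(r"[Nn]+", seq):
        length = match.end() - match.start()
        if length >= min_len:
            runs.append((match.start(), length))
    return runs


def report(path: str, min_len: int = 1) -> None:
    """Print every N run found in the FASTA file at `path` (hypothetical helper)."""
    with open(path) as handle:
        for name, seq in read_fasta(handle):
            for start, length in n_runs(seq, min_len):
                print(f"{name}\t{start}\t{length}")
```

Calling `report("scaffolds.fasta")` and `report("contigs.fasta")` on the SPAdes output directory would show which of the two files carries the N runs and how long each one is, which helps tell scaffolding gaps apart from Ns that come from the reads.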

6.1 years ago

Yes, those Ns represent ambiguous base calls in the reads that the assembler could not resolve. Non-resolvable bubbles will either be arbitrarily popped or result in broken contigs (I'm not 100% sure what SPAdes does with them).
