Amplicon analysis from nanopore sequencing
7 weeks ago
H.Hasani ▴ 990


I'm trying to understand the main differences between analysing amplicon data sequenced with Oxford Nanopore (ONT) and a typical Illumina workflow.

The main purpose of the analysis is to define & count clones of the insert region.

The first steps are self-explanatory: Basecalling -> QC -> demultiplexing (if multiple samples were pooled)

However, I'm a bit confused about the following steps: assembly -> clustering -> dereplication.
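
For illustration, dereplication in its simplest form can be sketched as collapsing identical sequences and counting them. This is only an assumption about what the step amounts to, and the function name `dereplicate` is hypothetical. With ONT error rates, exact matching like this would likely split one true clone across many near-identical sequences, which is presumably why a clustering or consensus step is listed before it.

```python
from collections import Counter


def dereplicate(inserts):
    """Collapse identical sequences and return (sequence, count) pairs.

    Exact-match only (an assumption for illustration): sequences that
    differ by a single base are counted as different clones.
    """
    counts = Counter(seq.upper() for seq in inserts)
    # most_common() sorts by count, highest first
    return counts.most_common()


if __name__ == "__main__":
    for seq, n in dereplicate(["ACGT", "acgt", "TTTT"]):
        print(seq, n)
```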

My basic understanding is that the assembly step is meant to extract the insert region. This is the confusing part: since it involves a large number of sequences, how can we efficiently retrieve the insert first? How does one drop the backbone from this mountain of data? Do we need to extract the insert region read by read, using the primers that mark the region?
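
To make the read-by-read idea concrete, here is a minimal sketch of primer-flanked extraction. It assumes exact primer matches and checks both strands; the function names and the toy primers are hypothetical. Real ONT reads contain errors, so an actual workflow would need error-tolerant primer matching rather than `str.find`.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def extract_insert(read: str, fwd_primer: str, rev_primer: str):
    """Return the sequence between the two primers, or None if not found.

    Assumption: primers match exactly. Both primers are given 5'->3',
    so the reverse primer appears as its reverse complement on the
    forward strand. The read is tried in both orientations.
    """
    rev_rc = reverse_complement(rev_primer)
    for candidate in (read, reverse_complement(read)):
        start = candidate.find(fwd_primer)
        if start == -1:
            continue
        insert_start = start + len(fwd_primer)
        end = candidate.find(rev_rc, insert_start)
        if end != -1:
            # Everything outside the primers (the backbone) is dropped
            return candidate[insert_start:end]
    return None


if __name__ == "__main__":
    # Toy read: backbone + fwd primer + insert + rev primer (RC) + backbone
    read = "CCC" + "ACGTAC" + "TTTTGGGG" + "TGTAATC" + "AA"
    print(extract_insert(read, "ACGTAC", "GATTACA"))
```

Reads where either primer is missing return `None`, so they can be filtered out before any clustering or counting.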

In a typical Illumina analysis, one would have to create an error model (e.g. with DADA2). My understanding is that for a reference-free assembly the error model is a must. In the case of ONT, is DADA2 the right approach, or do we do in-silico PCR?

Finally, it would be great if someone could point me to literature or other resources that explain these steps.

Thank you

ASV ONT amplicon • 190 views

