Question

Analyzing non overlapping ITS reads

2.4 years ago

francesco.vitali.bio • 0

Hello!

I have a dataset obtained on an Illumina MiSeq with 2×250 bp reads covering the full-length fungal ITS region. As the amplicon is longer than 600 bp, my R1 and R2 reads do not overlap. My objective is to obtain an OTU/ASV table with taxonomic classification.

I have already performed "classical" OTU picking and taxonomy assignment (USEARCH) independently for the R1 and R2 reads, with broadly comparable results. The taxonomic classification, however, is not particularly good, as many OTUs were assigned to SHs that carry only an "sp." label at the species level.

I would like to improve the taxonomic classification (if possible) over what I get using only R1, and I have been looking for methods that map non-overlapping reads to the UNITE database while retaining taxonomic information (for now, the candidates are Centrifuge or Kraken, as far as I understand).

Before I start testing, I was wondering whether this approach would actually give me better taxonomic classification, or whether I would end up with results very similar to what I already obtained for R1 alone.

Does anyone have experience with something similar? Did anyone obtain better taxonomic classification (i.e. a higher number of species-level identifications) with read alignment than with R1 alone?

Thanks, Francesco.

merging amplicon sequencing fungi • 752 views

updated 2.4 years ago by patrickdm ▴ 230 • written 2.4 years ago by francesco.vitali.bio • 0

You may be able to use some of the tools mentioned in this thread, but be careful not to over-extend the reads: How to extend contigs from single-end reads?

2.4 years ago by GenoMax 141k

score 0 · Answer 1 · 2021-11-29

Hello, you could run DADA2 on both R1 and R2, using the mergePairs(..., justConcatenate=TRUE) option:

(it) allows the paired reads to be joined without any overlap, but with 10Ns inserted in between the forward and reverse reads. The chimera removal and assignTaxonomy functions will handle such merged reads, although some other functions may fail (eg. addSpecies)

I've been using it on a recent project. I was obtaining very few mergers, so I started with an R1-only analysis, then tried the R1-R2 concatenation, which improved classification compared with the analysis of R1 alone.
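To make the concatenation concrete, here is a minimal Python sketch of what a joined read pair looks like. It is only an illustration of the idea, not DADA2's implementation: the function names are hypothetical, and it assumes (as I understand the mergePairs documentation) that the reverse read is reverse-complemented before being appended after the 10-N spacer.

```python
# Conceptual sketch only: this is NOT DADA2 code, just an illustration of
# joining a non-overlapping read pair with a 10-N spacer.
# Assumption: the reverse read is reverse-complemented before joining.

SPACER = "N" * 10  # the 10 Ns mentioned in the quoted documentation

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def concatenate_pair(r1: str, r2: str, spacer: str = SPACER) -> str:
    """Join a read pair as R1 + spacer + reverse-complement(R2)."""
    return r1 + spacer + reverse_complement(r2)


if __name__ == "__main__":
    # Toy reads, far shorter than real 250 bp MiSeq reads.
    forward_read = "ACGTTGCA"
    reverse_read = "GGATCCAA"
    print(concatenate_pair(forward_read, reverse_read))
```

With real 2×250 reads, each joined sequence would be up to 510 characters long, and the run of Ns marks the unsequenced middle of the amplicon, which is why tools that expect a contiguous sequence may not handle these reads.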

Hth