Losing a large number of reads when using the QIIME 2 vsearch join-pairs command
23 months ago
jobie1 ▴ 30

Hi there! I am running various sediment samples (Illumina amplicon sequencing of the 16S V4V5 region) through the QIIME 2 pipeline. However, when I use vsearch join-pairs to join my paired-end reads, I lose a large share of them. For reference, here is a breakdown of the read counts at each step for one sample:

Before trimming = 79 489 reads

After trimming = 75 542 reads

After joining = 28 005 reads

After filtering = 28 004 reads
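To put those counts in perspective, here is a small sketch that works out how much survives each step. The step labels and the `retention` helper are only illustrative names, not part of QIIME 2.

```python
# Read counts for one sample, copied from the breakdown above.
counts = [
    ("before trimming", 79489),
    ("after trimming", 75542),
    ("after joining", 28005),
    ("after filtering", 28004),
]


def retention(counts):
    """Return (step, reads, % of previous step, % of initial) for each step."""
    initial = counts[0][1]
    previous = initial
    rows = []
    for step, reads in counts:
        rows.append((step, reads, 100.0 * reads / previous, 100.0 * reads / initial))
        previous = reads
    return rows


for step, reads, vs_prev, vs_initial in retention(counts):
    print(f"{step:16s} {reads:6d}  {vs_prev:5.1f}% of previous  {vs_initial:5.1f}% of initial")
```

Trimming keeps about 95% of the reads, but joining keeps only about 37% of the trimmed reads, so the joining step is where almost all of the loss happens.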

Could anyone explain why this may be happening? I can think of a few possible reasons, but hopefully someone can help narrow it down.

Here is the command I used:

qiime vsearch join-pairs --i-demultiplexed-seqs csm-vs-birds/PE-trimmed-reads.qza --o-joined-sequences csm-vs-birds/PE-trimmed-joined-reads.qza

16S Qiime2 amplicon joining • 1.4k views

The length of the V4V5 region is quite long (~400 base pairs) compared to the V4 region (~250 base pairs), so you really need to be careful with your trimming parameters; otherwise, you won't be able to maintain enough overlap between your reads to join them. Take another look at those, see if your parameters were too aggressive, relax them a bit if possible, and see what happens. Good luck!
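As a rough sketch of that overlap arithmetic: if both reads span the amplicon from opposite ends, the expected overlap is roughly the forward length plus the reverse length minus the amplicon length. The ~400 bp amplicon length is the approximate figure given above; the function names, the truncation lengths tried, and the idealized assumption that every amplicon has exactly that length are illustrative only. The `min_overlap=10` default mirrors the default that the join-pairs help text reports for `--p-minovlen`.

```python
def expected_overlap(fwd_len, rev_len, amplicon_len):
    """Idealized overlap (bp) between a forward and a reverse read.

    Assumes both reads start at opposite ends of the same amplicon.
    A result <= 0 means the reads do not reach each other at all.
    """
    return fwd_len + rev_len - amplicon_len


def can_join(fwd_len, rev_len, amplicon_len, min_overlap=10):
    """True if the idealized overlap meets the minimum overlap requirement."""
    return expected_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap


AMPLICON_LEN = 400  # approximate V4V5 length; an assumption, real amplicons vary

# Hypothetical per-read lengths left after trimming or truncation.
for read_len in (250, 225, 205, 200):
    overlap = expected_overlap(read_len, read_len, AMPLICON_LEN)
    print(read_len, overlap, can_join(read_len, read_len, AMPLICON_LEN))
```

In this idealized picture, two 250-base reads overlap by about 100 bases, while shortening both reads to 200 bases leaves no overlap at all. Real data is less forgiving, since the low-quality tails of the reads are exactly where the overlap falls.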


Thank you, Chris! I'm using paired-end reads. Below is a breakdown of the first few steps of my workflow. The filtering step comes after the joining step; is that not the common practice?

I ran a QC and, based on the Phred scores, I know I would trim at a length of around 250. Otherwise, I am not sure which parameters in the following commands I should adjust to improve this, as most of my reads are lost after joining. Wouldn't the trimming stage only remove the adapter sequences and any reads that are missing the adapters?

Here is the general workflow of my first few steps before constructing ASVs:

  1. import with qiime tools import

  2. trim with command cutadapt trim-paired

  3. join with command vsearch join-pairs

  4. filter with command quality-filter q-score

  5. construct ASVs...

And here are the parameters for the vsearch join-pairs command. I am unsure whether these are what I would have to adjust to get a higher read count; the trim command does not offer many parameter settings related to read length:

--p-truncqual INTEGER Range(0, None)  Truncate sequences at the first base with the specified quality score value or lower. [optional]

--p-minlen INTEGER Range(0, None)  Sequences shorter than minlen after truncation are discarded. [default: 1]

--p-maxns INTEGER Range(0, None)  Sequences with more than maxns N characters are discarded. [optional]

--p-allowmergestagger / --p-no-allowmergestagger  Allow joining of staggered read pairs. [default: False]

--p-minovlen INTEGER Range(0, None)  Minimum overlap length of forward and reverse reads for joining. [default: 10]

--p-maxdiffs INTEGER Range(0, None)  Maximum number of mismatches in the forward/reverse read overlap for joining. [default: 10]

--p-minmergelen INTEGER Range(0, None)  Minimum length of the joined read to be retained. [optional]

--p-maxmergelen INTEGER Range(0, None)  Maximum length of the joined read to be retained. [optional]

--p-maxee NUMBER Range(0.0, None)  Maximum number of expected errors in the joined read to be retained. [optional]

--p-qmin INTEGER Range(-5, 2, inclusive_end=True)
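To make the roles of minovlen and maxdiffs concrete, here is a toy sketch of the two checks as the help text describes them. This is not how vsearch actually aligns reads; the functions, the made-up 30-base fragment, and the fixed, known overlap are all assumptions for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overlap_mismatches(fwd, rev, overlap):
    """Count mismatches between the last `overlap` bases of the forward read
    and the first `overlap` bases of the reverse-complemented reverse read."""
    if overlap <= 0:
        return 0
    rev_rc = revcomp(rev)
    return sum(a != b for a, b in zip(fwd[-overlap:], rev_rc[:overlap]))


def passes(fwd, rev, overlap, minovlen=10, maxdiffs=10):
    """Toy version of the two join criteria: enough overlap, few enough mismatches."""
    return overlap >= minovlen and overlap_mismatches(fwd, rev, overlap) <= maxdiffs


# Made-up 30-base fragment sequenced from both ends with 20-base reads,
# so the true overlap is 20 + 20 - 30 = 10 bases.
fragment = "ACGTTGCAAGGCTTACCGATAGGCTAACGT"
fwd = fragment[:20]
rev = revcomp(fragment[10:])

print(overlap_mismatches(fwd, rev, 10), passes(fwd, rev, 10))

# Shorter reads (15 bases each) leave no overlap, so the pair cannot be joined.
print(passes(fragment[:15], revcomp(fragment[15:]), 0))
```

In this toy picture, a pair is dropped either because the reads no longer reach far enough into each other (minovlen) or because the overlapping tails disagree too often (maxdiffs).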
