DADA2, whether run through the Python wrapper scripts in QIIME2 or directly in R, can be as automatic as you want. You can write a bash script, or a Snakemake or Nextflow workflow (among many other options), that states the order and dependencies of every step to run. That way, a given input produces your final results with little manual effort.
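To make "stating the order and dependencies of every step" concrete, here is a minimal Python sketch of what a workflow manager does at its core. The step names are hypothetical placeholders, not real QIIME2 or DADA2 commands, and a real workflow would use Snakemake or Nextflow rather than this toy runner.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names (assumptions for illustration only).
# Each step maps to the set of steps that must finish before it starts.
PIPELINE = {
    "merge_pairs": {"denoise"},
    "denoise": {"learn_error_rates"},
    "learn_error_rates": {"trim_and_truncate"},
    "trim_and_truncate": {"quality_check"},
    "quality_check": set(),
}


def run_order(pipeline):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


def run(pipeline, actions):
    """Call each step's function in dependency order and return the log."""
    log = []
    for step in run_order(pipeline):
        actions[step]()  # a real step would launch a tool here
        log.append(step)
    return log


if __name__ == "__main__":
    # Placeholder actions that only print the step name.
    actions = {name: (lambda n=name: print("running", n)) for name in PIPELINE}
    run(PIPELINE, actions)
```

Even when the steps are declared out of order, as above, the runner derives the correct sequence from the dependencies. That is the convenience a workflow manager gives you, and also why it can hide the steps from you.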
An automatic pipeline like this works relatively well for very standardized analyses, where you expect the data and the results to behave within known standards. Even in such a pipeline, though, you should run QC steps and check the resulting QC plots to be certain that the data quality actually meets your expectations.
Are there any reliable tools that can indicate where I should trim my forward and reverse reads based on input fastq?
This depends on your criteria. Tools such as FastQC and MultiQC (which summarizes the results produced by FastQC) generate QC plots that help you decide which thresholds to use when trimming and truncating your reads. The colors in these plots give you an indication of quality, and there are many guides online explaining how to interpret them.
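If you want to see what such a per-position quality summary boils down to, the following Python sketch computes the lower quartile of the Phred scores at each read position and reports the first position where it drops below a threshold. It is only an illustration under stated assumptions: a simple 4-line FASTQ layout, Phred+33 encoding, a nearest-rank percentile, made-up toy reads, and a threshold of 30 taken as an example value. It is not how FastQC or DADA2 compute anything internally.

```python
import io
import math

# Toy reads (made-up data). Phred+33: 'I' = 40, '?' = 30, '5' = 20, '#' = 2.
TOY_FASTQ = (
    "@r1\nACGTAC\n+\nIIII?5\n"
    "@r2\nACGTAC\n+\nIIII5#\n"
    "@r3\nACGTAC\n+\nIIII?5\n"
    "@r4\nACGTAC\n+\nIII?55\n"
)


def read_qualities(handle, offset=33):
    """Yield the list of Phred scores of each record in a 4-line FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            break
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\n")
        yield [ord(ch) - offset for ch in quality]


def lower_quartile_per_position(reads):
    """Return the 25th percentile score at each position (nearest-rank method)."""
    columns = {}
    for scores in reads:
        for pos, q in enumerate(scores):
            columns.setdefault(pos, []).append(q)
    quartiles = []
    for pos in sorted(columns):
        col = sorted(columns[pos])
        rank = max(1, math.ceil(0.25 * len(col)))
        quartiles.append(col[rank - 1])
    return quartiles


def suggest_truncation(quartiles, threshold=30):
    """Return how many leading positions have a lower quartile >= threshold."""
    for pos, q in enumerate(quartiles):
        if q < threshold:
            return pos
    return len(quartiles)


if __name__ == "__main__":
    reads = list(read_qualities(io.StringIO(TOY_FASTQ)))
    quartiles = lower_quartile_per_position(reads)
    print(quartiles)                      # [40, 40, 40, 30, 20, 2]
    print(suggest_truncation(quartiles))  # 4
```

You would run this separately on the forward and the reverse file, since the two usually degrade at different positions. Treat the number it returns as a starting point to compare against the plots, not as a decision.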
I would say that people quite often use recommended values, which are not "bullet proof" but work for data of standard quality. You can try the values provided by the tutorial/QIIME2 wrapper if you have the same type of data (I think the values are intended for Illumina data, but I'm not sure) and the same hyper-variable region of 16S; if you are using a different hyper-variable region or read length, the values may not be adequate. In general, if you are not very familiar with the data or the analyses, I wouldn't recommend this approach, as you may run steps without knowing what they mean or what their implications are. In that case, automatic and simple wrappers like the one you pointed out can be dangerous, because they "hide" many sequential steps, which makes it difficult to understand their order and function.
I think the plugin that you pointed out implements, in general, the DADA2 workflow in R (DADA2 is an R package). Therefore, I would recommend that you check the DADA2 tutorial, which may help you better understand the workflow, the order of the steps, their implications, and the options implemented in the QIIME2 wrapper: https://benjjneb.github.io/dada2/tutorial.html
Is it not advised to merge all the reads, run through something like FastQC, and then find the region where the 25th percentile is lower than 30?
There are different possible approaches. Usually I would trim and truncate the forward and reverse reads of 16S rRNA sequences independently before merging them. The DADA2 tutorial also follows this approach: after truncation you learn the error rates, then denoise your sequences, and only then merge them.
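As a rough illustration of truncating the two directions independently, the Python sketch below cuts forward and reverse reads to separate lengths and drops a pair when either read is shorter than its truncation length. The function names, the toy reads, and the lengths 5 and 4 are all assumptions made up for this example; this is not DADA2's implementation, just the idea behind using one truncation value per direction.

```python
def truncate_pair(forward, reverse, trunc_f, trunc_r):
    """Truncate one (sequence, quality) pair with a separate length per direction.

    Returns None when either read is shorter than its truncation length,
    so the whole pair is dropped together.
    """
    f_seq, f_qual = forward
    r_seq, r_qual = reverse
    if len(f_seq) < trunc_f or len(r_seq) < trunc_r:
        return None
    return (
        (f_seq[:trunc_f], f_qual[:trunc_f]),
        (r_seq[:trunc_r], r_qual[:trunc_r]),
    )


def truncate_pairs(pairs, trunc_f, trunc_r):
    """Apply truncate_pair to every pair and keep only the surviving pairs."""
    kept = []
    for forward, reverse in pairs:
        result = truncate_pair(forward, reverse, trunc_f, trunc_r)
        if result is not None:
            kept.append(result)
    return kept


if __name__ == "__main__":
    # Made-up toy pairs: ((forward seq, forward qual), (reverse seq, reverse qual)).
    pairs = [
        (("ACGTACG", "IIIII?5"), ("TTGCAA", "III?5#")),
        (("ACGT", "III?"), ("TTGCAA", "IIII?5")),  # forward too short, dropped
    ]
    for fwd, rev in truncate_pairs(pairs, trunc_f=5, trunc_r=4):
        print(fwd, rev)
```

Reverse reads usually lose quality earlier than forward reads, which is why a single cut-off chosen on already merged reads tends to fit neither direction well.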
I hope this helps,