Filtering qscore on dorado
7 months ago
eebloom ▴ 80

I am looking to re-basecall some ONT long-read data which was originally basecalled using Guppy.

Previously, with Guppy, I used min_qscore=9. Would it be a good idea to do something similar with dorado using the --min-qscore parameter?

Or would it be better to filter downstream of basecalling, e.g. using NanoFilt?
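For reference, downstream filtering by mean read Q-score can be sketched in a few lines of Python. This is a minimal illustration, not how NanoFilt or dorado implement it. It assumes Phred+33 FASTQ with exactly four lines per record, and the function names `mean_qscore` and `filter_fastq` are made up for this example. The mean is taken over per-base error probabilities rather than over the Q values themselves, which is the usual convention for nanopore reads. Individual tools may differ in details such as which bases they include.

```python
import math


def mean_qscore(quality):
    """Mean read Q-score from a Phred+33 quality string.

    Averages the per-base error probabilities, then converts back to Phred.
    """
    probs = [10 ** (-(ord(ch) - 33) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_fastq(lines, min_qscore):
    """Yield (header, seq, qual) for records whose mean Q-score >= min_qscore.

    Assumes strict 4-line FASTQ records (no wrapped sequences).
    """
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, _plus, qual = records[i:i + 4]
        if qual and mean_qscore(qual) >= min_qscore:
            yield header, seq, qual
```

Because low-quality bases dominate the averaged error probability, a read with a few very poor bases scores lower here than a plain average of its Q values would suggest.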

dorado filtering QC nanopore Guppy • 1.3k views
7 months ago
GenoMax 143k

Depends on your use case. If you have a good reference, you may not need to filter at all. If you are doing de novo work, stay with the default.


Thanks for your speedy reply!!

In the Guppy protocol it states "Minimum q-score (--min_qscore): The minimum q-score a read must attain to pass q-score filtering. The default value for this varies by configuration, ranging from 7.0 for the lower-accuracy models up to 10.0 for the "Sup" models."

Do you know if the same applies to dorado? That is, if I use a HAC model, e.g. dna_r10.4.1_e8.2_400bps_hac@v4.2.0, will it sort reads into pass and fail folders using a default minimum qscore as before, or would I need to set the --min-qscore parameter?


It appears that while dorado will remove reads that fail the filter, it does not create "pass/fail" folders like Guppy does. This filtering happens only when you provide a minimum Q-score value.

By default, if you do not provide a minimum score, no reads are filtered.
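If you want something like Guppy's pass/fail split while still keeping every read, one option is to basecall without a threshold and partition the output afterwards. The sketch below works on SAM text lines and assumes each read carries a `qs` tag holding its mean Q-score (check your dorado version's output to confirm the tag is present and how it is typed). The helper names `read_qscore` and `split_pass_fail` are hypothetical.

```python
def read_qscore(sam_line):
    """Return the value of the qs tag on a SAM record, or None if absent.

    Assumption: the basecaller writes the mean read Q-score as qs:f:<value>
    or qs:i:<value> among the optional fields (column 12 onwards).
    """
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:
        name, _type, value = tag.split(":", 2)
        if name == "qs":
            return float(value)
    return None


def split_pass_fail(sam_lines, min_qscore):
    """Partition SAM records into (passed, failed) lists by qs tag.

    Header lines (starting with '@') are skipped; records without a
    qs tag are treated as failed.
    """
    passed, failed = [], []
    for line in sam_lines:
        if line.startswith("@"):
            continue
        q = read_qscore(line)
        if q is not None and q >= min_qscore:
            passed.append(line)
        else:
            failed.append(line)
    return passed, failed
```

This keeps the low Q-score reads available for later analysis instead of discarding them at basecalling time.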

7 months ago
eebloom ▴ 80

From the dorado developers:

"The --min-qscore would need to be used to replicate the guppy behavior.

Note that once a --min-qscore is set, dorado won't output those filtered reads at all. i.e. only the reads that pass the filter will be output to the stream.

After some internal discussion, we think it's best you keep all the data (i.e. not filter based on q-score) because sometimes even the low q-score reads can make a difference in downstream analysis. May be more applicable in the case of non-human data."
