2.0 years ago
eebloom
I have long-read data from Oxford Nanopore. Because the sequencing and basecalling were performed a while ago (a few years now for some of the samples), I want to redo basecalling before downstream analysis (alignment, variant calling, etc.). I am also interested in getting information on DNA methylation.
If I run dorado with --modified-bases, can I use the same resulting uBAM for all my downstream analysis (genomic and epigenomic) or should I run dorado twice, once with --modified-bases and once without?
If a relevant model is not available then I don't know if you can use dorado at all. There are separate methylation models available for dorado for different pore types.
Unless it's too ancient, if the flowcell matches, it should be OK. I recently did something similar: I had a rapid (RAD002) run, and Dorado only has Kit14 models for flowcells 9.4.1. However, the results were still _much_ improved compared to the previous data I had basecalled with Guppy 4-something (I mapped both to the consensus sequence).
The original model used in Guppy was dna_r9.4.1_450bps_hac_prom, so I assumed I could use the dorado model dna_r9.4.1_e8_hac@v3.3. I know e8.2 is Kit 14, but I wasn't sure what e8 or e8.1 were and I couldn't find it anywhere online.
from a reply on the dorado github...
apparently they are updating the readme soon!
Say there is the correct model available, which I think there is, you can add the --modified-bases flag and dorado will select the relevant modified base model. What I wondered was if the resulting uBAM containing information on methylated bases could also be used for downstream analysis to save compute time/resource and storage?
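For anyone following along, here is a minimal sketch of what that single basecalling run could look like, written as a small Python helper that only builds the command line (it does not run dorado). The input directory `reads/`, the output name `calls_mods.bam` and the choice of `5mCG` are my assumptions for illustration; check which modified-base models actually exist for your basecalling model before relying on it.

```python
import shlex


def dorado_basecall_cmd(model, reads_dir, mods=None):
    """Build a `dorado basecaller` argv list; --modified-bases is optional."""
    cmd = ["dorado", "basecaller", model, reads_dir]
    if mods:
        # One or more modification names, e.g. ["5mCG"] (assumed to be
        # available for the chosen basecalling model).
        cmd += ["--modified-bases", *mods]
    return cmd


# Hypothetical paths; the model name is the one mentioned in this thread.
cmd = dorado_basecall_cmd("dna_r9.4.1_e8_hac@v3.3", "reads/", ["5mCG"])

# dorado writes unaligned BAM to stdout, so redirect it to a file.
print(shlex.join(cmd) + " > calls_mods.bam")
```

Leaving `mods` as `None` gives the plain basecalling command, so the only difference between the two runs being asked about is the one extra flag.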
The dorado developers say that modified base calls should be the same/suitable for any downstream analyses
[https://github.com/nanoporetech/dorado/issues/469#issuecomment-1808151652][1]
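If you do go with one uBAM for everything, the thing to watch is that the methylation tags (MM/ML) survive alignment. A rough sketch of one common way to do that, again as a Python helper that only assembles the shell pipeline: `samtools fastq -T` writes the tags into the FASTQ header and `minimap2 -y` copies them back into the aligned records. The file names are placeholders I made up, and this is not the only route to an aligned BAM with tags.

```python
import shlex


def align_keep_mods(ubam, ref, out_bam):
    """Build a shell pipeline that aligns a uBAM while carrying MM/ML tags."""
    steps = [
        # Emit FASTQ with the MM and ML tags appended to each read header.
        ["samtools", "fastq", "-T", "MM,ML", ubam],
        # -y copies the FASTQ header comments (the tags) into the SAM output.
        ["minimap2", "-ax", "map-ont", "-y", ref, "-"],
        # Coordinate-sort the alignments into the final BAM.
        ["samtools", "sort", "-o", out_bam, "-"],
    ]
    return " | ".join(shlex.join(step) for step in steps)


# Hypothetical file names for illustration only.
pipeline = align_keep_mods("calls_mods.bam", "ref.fa", "aligned_mods.bam")
print(pipeline)
```

The resulting aligned BAM can then be passed to both the variant-calling and the methylation steps, which is the point of basecalling only once.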