For my studies, I'm using an antimicrobial resistance (AMR) predictor called Mykrobe (the command-line version). I'm not an expert in Mykrobe, just a beginner, so take what I say with a grain of salt.
Mykrobe takes as input high-throughput sequencing reads (for example, runs from the SRA) and nanopore data (which I have never used). In the prediction command you have to specify the species the reads belong to, so that Mykrobe uses its built-in panels for that species. These panels contain information about mutations and their effects on drug resistance.
The output is formatted like this:
{
"sample_name": {
"susceptibility": { ... AMR call information ... },
"phylogenetics": { ... species/lineage information ...},
"kmer": <integer>,
"probe_sets": [ ... list of probe files ...],
"files": [ ... list of input reads files ... ],
"version": { ... version information ...},
"genotype_model": "name of genotype model"
}
}
"susceptibility" and "phylogenetics" are the dictionaries I'm interested in. The former contains the resistance calls, the latter information about the species.
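As a rough sketch, this is how those two dictionaries could be pulled out of the report in Python. It relies only on the layout shown above; the function name `summarize_report` and the inline stand-in report are hypothetical, and in practice the JSON would be loaded from the output file with `json.load()`.

```python
import json


def summarize_report(report):
    """Collect the AMR calls and phylo groups for each sample in a report."""
    summary = {}
    for sample_name, result in report.items():
        # Missing keys are treated as empty so a sparse report does not raise.
        susceptibility = result.get("susceptibility", {})
        phylo_group = result.get("phylogenetics", {}).get("phylo_group", {})
        summary[sample_name] = {
            "susceptibility": susceptibility,
            "phylo_group": phylo_group,
            "has_calls": bool(susceptibility),
        }
    return summary


# Hypothetical stand-in report that only mirrors the layout above.
report_text = """
{
  "sample_name": {
    "susceptibility": {},
    "phylogenetics": {
      "phylo_group": {
        "Unknown": {"percent_coverage": -1, "median_depth": -1}
      }
    }
  }
}
"""

summary = summarize_report(json.loads(report_text))
for name, info in summary.items():
    print(name, list(info["phylo_group"]), info["has_calls"])
```

A check like `has_calls` makes it easy to spot reports where "susceptibility" came back empty.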
With that said, I want to use a subset of a run as input. For example, taking the run ERR949847 (Mycobacterium tuberculosis H37Rv), I want to extract only the reads related to a specific gene. I am able to do that, but the whole run and the subset give discordant results. When I use the entire run, Mykrobe correctly predicts the resistance and shows this in "phylogenetics":
"phylogenetics": {
"phylo_group": {
"Mycobacterium_tuberculosis_complex": {
"percent_coverage": 27.694,
"median_depth": 2.0
}
}
along with other information. With the subset, instead, I get this:
"susceptibility": {},
"phylogenetics": {
"phylo_group": {
"Unknown": {
"percent_coverage": -1,
"median_depth": -1
}
},
that is, an empty "susceptibility" and an unknown "phylogenetics". The most obvious explanation is the lack of data: the entire run is 319.8M bases, while the subset is only 14 reads of 75 bases each (~1,050 bases). I tried another subset of 14,998 reads (~1,124,850 bases) and still didn't get any relevant result.
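To put those sizes in perspective, here is a small sketch of the arithmetic. It assumes every read in both subsets is exactly 75 bases long and takes the 319.8M figure for the whole run at face value; the name `subset_bases` is just illustrative.

```python
READ_LENGTH = 75      # bases per read, assumed constant for both subsets
RUN_BASES = 319.8e6   # total bases in the whole run, as stated above


def subset_bases(n_reads, read_length=READ_LENGTH):
    """Total number of bases in a subset of fixed-length reads."""
    return n_reads * read_length


for n_reads in (14, 14998):
    bases = subset_bases(n_reads)
    fraction = bases / RUN_BASES
    print(f"{n_reads} reads -> {bases} bases ({fraction:.4%} of the run)")
```

Both subsets are well under 1% of the bases in the full run.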
Is this the correct way to use Mykrobe? How can I get results starting from the subset?
Thanks a lot in advance!