Question

AMRFinderPlus input fasta file for clinical isolates

0

Entering edit mode

4 months ago

Adriana • 0

I recently ran NCBI's AMRFinderPlus to detect antimicrobial resistance on two different datasets of clinical isolates of Mycobacterium tuberculosis. Since the input file requires is a fasta file, I had fastq input files which I then aligned to a reference genome using BWA-MEM, used bcftools to create vcf files (variant calling), compressed and indexed them to then run bcftools consensus to generate a consensus genome assembly fasta file for each single isolate to use as input for AMRFinder. However, the two different datasets had very similar results (which doesn't make much biological sense) and I was wondering if I should follow a different approach to generate the fasta input files from the fastq isolate files. I saw SPAdes is common in this aspect but I read it's best for de novo assembly, and since I'm interested in SNPs I have to do a reference-based assembly. Here's one of the output tsv files from AMRFinder:

Protein id  Contig id   Start   Stop    Strand  Element symbol  Element name    Scope   Type    Subtype Class   Subclass    Method  Target length   Reference sequence length   % Coverage of reference % Identity to reference Alignment length    Closest reference accession Closest reference name  HMM accession   HMM description
NA  NC_000962.3 314347  314889  -   aac(2')-Ic  aminoglycoside N-acetyltransferase AAC(2')-Ic   core    AMR AMR AMINOGLYCOSIDE  GENTAMICIN/TOBRAMYCIN   EXACTX  181 181 100.00  100.00  181 WP_003899880.1  aminoglycoside N-acetyltransferase AAC(2')-Ic   NA  NA
NA  NC_000962.3 2231620 2232156 +   erm(37) 23S rRNA (adenine(2058)-N(6))-methyltransferase Erm(37) core    AMR AMR LINCOSAMIDE/MACROLIDE   AZITHROMYCIN/CLARITHROMYCIN/CLINDAMYCIN/ERYTHROMYCIN/TELITHROMYCIN  EXACTX  179 179 100.00  100.00  179 WP_003900446.1  23S rRNA (adenine(2058)-N(6))-methyltransferase Erm(37) NA  NA
NA  NC_000962.3 2325827 2326747 -   blaC    class A beta-lactamase BlaC core    AMR AMR BETA-LACTAM BETA-LACTAM EXACTX  307 307 100.00  100.00  307 WP_003410677.1  class A beta-lactamase BlaC NA  NA

the only differences I saw among the two datasets were ONLY in the start and stop columns, the rest - subclass, closest reference accession, closest reference name... - were exactly the same. One of the datasets is multiresistant and the other is monoresistant, I've already verified with Mykrobe, TBProfiler and ResFinder.

Has anyone here used AMRFinder with clinical isolate fastq files before? If so, how did you obtain the fasta files to use as input?

AMRFinder • 592 views

ADD COMMENT • link updated 4 months ago by GenoMax 154k • written 4 months ago by Adriana • 0

1

Entering edit mode

Please read through this thread --> Generating consensus sequence from bam file

You may want to do samtools consensus instead of the procedure above and see if that helps. Michael's answer explains what happens when one follows the procedure you used above.

ADD REPLY • link 4 months ago by GenoMax 154k