Question

Mutations starting from reference genome

3.8 years ago

Jacopo • 0

I'm using DRAGdb (a repository of mutational data for drug-resistance-associated genes) to find known mutations in a given gene. I searched for Mycobacterium tuberculosis as the organism and katG as the gene, and I found, for example, a SNP at position 5:

GENE_NAME  ENSEMBL_ID  ORGANISM_NAME               DRUG_NAME  NUCLEOTIDE_POSITION  NUCLEOTIDE_MUTATION  AMINOACID_POSITION  AMINOACID_CHANGE
KatG       Rv1908c     Mycobacterium tuberculosis  Isoniazid  5                    A - T                3                   Glu - Val

I want to check whether position 5 of katG in the reference genome (Reference Sequence NC_000962.3) is an A. I took the portion of the genome corresponding to the gene, which according to the GenBank file for that reference sequence starts at 2,153,889 and ends at 2,156,111. I then took the reverse complement of that region, again following the annotation in the GenBank file. I think this gives me the correct gene sequence, but when I look at position 5 there is no A.

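from Bio import SeqIO
# Load the reference genome (a single-record FASTA file)
fasta_file = "NC_000962.3.fasta"
fastaTbGenome = SeqIO.read(fasta_file, "fasta")
# Slice the katG region, using the GenBank coordinates directly as Python indices
comp_katG = fastaTbGenome.seq[2153889:2156111]
# The gene is on the complement strand, so reverse-complement the slice
katG = comp_katG.reverse_complement()
# Print the base at Python index 5
print(katG[5])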

Output:

Is this method wrong? Thank you all in advance.
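
One thing worth checking in the approach above is the indexing convention: GenBank feature coordinates are 1-based and inclusive, while Python slices are 0-based with an exclusive end. Below is a minimal sketch of that conversion, using plain strings and a made-up 12-base sequence instead of the real genome. The names `extract_gene`, `base_at`, and `toy_genome` are hypothetical and only for illustration; this may or may not account for the mismatch with the database entry.

```python
# Complement table for unambiguous DNA bases
# (assumption: no IUPAC ambiguity codes in the sequence).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_gene(genome, start, end, strand):
    """Return a gene from 1-based, inclusive GenBank-style coordinates.

    strand is +1 for the forward strand and -1 for a complement(...) feature.
    """
    # GenBank position `start` is Python index `start - 1`. The slice end is
    # exclusive, so using `end` unchanged still includes the last base.
    region = genome[start - 1:end]
    if strand == -1:
        # Complement each base, then reverse the result.
        region = region.translate(_COMPLEMENT)[::-1]
    return region


def base_at(gene, position):
    """Return the base at a 1-based position within the gene."""
    return gene[position - 1]


# Toy example (hypothetical sequence, not NC_000962.3).
toy_genome = "TTGACCATGCAA"
toy_gene = extract_gene(toy_genome, 3, 8, strand=-1)
print(toy_gene)              # ATGGTC
print(base_at(toy_gene, 5))  # T  (toy_gene[5] would give C, i.e. position 6)
```

Applied to the code in the question, the same convention would mean slicing `fastaTbGenome.seq[2153888:2156111]` before calling `reverse_complement()`, and reading position 5 of the gene as `katG[4]`.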

h37rv genome mutations reference dragdb • 617 views
