How to create a local (custom) database with a multifasta DNA sequence file in Prokka?
8 months ago

Hi everyone, I'm trying to create a local (custom) database for Prokka from a multi-FASTA file that contains DNA sequences of different bacterial genes. It looks like this:

>mecI:1:D86934
TTATTTTTTATTCAATATATTTCTCAATTCTTCTATTTCATCTTGTGATAGATCTTCTTTTTCTACAAAGTTTAAGACAAGTGAATTGAAACCGCCTTTGTATACTTTATTGATAAAGTTTTTAGATGTTTTATATTTTATATCACTTTCTTCTACAAGAGAGTAATATTGAAAAATTTTATTGTCTTTTTTACGATTA
>mecI:2:AB037671
TTAAAAAATTTTATTGTCTTTTTTACGATCTATAAATCCCTTTTTATACAATCTCGTTATAAGTGTACGAATGGTTTTTGGACTCCAGTCCTTTTGCATTTGTATTTCTTCTATTATATTATTCGCACTTGCATATTTTTCATCCAAATGATATTCATAACTTCCCATTCTGCAGATGATATTTCATACGTTTTATTATCCAT
>mecI:3:FJ670542
TTATTTTTTATTCAATATATTTCTCAATTCTTCTATTTCATCTTGTGATAGATCTTCTTTTTCTACAAAGTTTAAGACAAGTGAATTGAAACCGCCTTTGTATACTTTATTGATAAAGTTTTTAGATGTTTTATATTTTATATCACTTTCTTCTACAAGAGAGTAATATTGAAAAATTTTATTGTCTTTTTTACGATC
>mecI:4:FJ390057
ATGGATAATAAAACGTATGAAATATCATCTGCAGAATGGGAAGTTATGAATATCATTTGGATGAAAAAATATGCAAGTGCGAATAATATAATAGAAGAAATACAAATGCAAAAGGACTGGAGTCCAAAAACCATTCGTACACTTATAACGAGATTGTATAAAAAGGGATTTATAGATCGTAAAAAAGACAAT........


I've been doing this by first converting the initial DNA multi-FASTA file into a protein multi-FASTA file with the EMBOSS transeq tool, then creating a BLAST database and indexing it into the Prokka database directory, and finally running the typing process for a bacterial genome against this custom database to get the GenBank (.gbk) output. However, I'd like to ask whether there is an easier way to do this, because I'm losing some sequence information in the EMBOSS transeq step. Perhaps there is a way to set up a DNA database instead of a protein database?
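To make the translation step concrete, here is a minimal sketch in plain Python of what I think is going wrong. Some entries seem to be stored on the reverse strand: `mecI:2:AB037671` ends in `...CGTTTTATTATCCAT`, which is the reverse complement of the `ATGGATAATAAAACG...` start of `mecI:4:FJ390057`. Translating only forward frame 1 would scramble those records. The sketch tries all six frames and keeps the one with the fewest internal stop codons, preferring a frame that starts with M. The function names are my own (hypothetical), the frame-picking rule is just a heuristic, and the codon table is the standard one (bacterial table 11 uses the same codon-to-amino-acid assignments and differs only in the allowed start codons).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with ambiguous bases become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def best_frame(dna):
    """Heuristic: pick the frame with the fewest internal stops,
    preferring one whose translation starts with M."""
    dna = dna.upper()
    candidates = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(seq[offset:])
            internal_stops = protein.rstrip("*").count("*")
            no_start = not protein.startswith("M")
            candidates.append((internal_stops, no_start, strand, offset, protein))
    _, _, strand, offset, protein = min(candidates)
    return strand, offset, protein


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def to_protein_fasta(lines):
    """Yield protein FASTA lines, keeping each original header unchanged."""
    for header, dna in read_fasta(lines):
        _, _, protein = best_frame(dna)
        yield ">" + header
        yield protein.rstrip("*")  # drop the terminal stop codon
```

It could be used by opening the DNA file, passing the file handle to `to_protein_fasta`, and writing the resulting lines to a new `.faa` file. As far as I can tell, Prokka also has a `--proteins` option that takes a protein FASTA as a first-priority annotation source, so the resulting file might be usable directly without indexing a BLAST database by hand. Records that are truncated or contain frameshifts will still translate badly, so the output should be checked for internal `*` characters.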

Thank you guys.
