I'm working on a bacterium that secretes three types of toxins.
My goal is to extract the genes for these toxins so that I can run phylogenetic analyses on them.
If you have any suggestions on how to do this, I would appreciate the help. Thank you.
I'm actually building a pipeline for this. What I did was run blastn with the sequences of the reference genes against the genomes (multi-FASTA in both cases), then loop (using R, UNIX tools, or anything else) over the files generated by blastn to get the coordinates of the genes on each genome and classify each hit as plus/plus or plus/minus. I put those in a matrix and created a BED file for each pair of coordinates. Bedtools can be run from UNIX/Linux and has a "getfasta" command; it works over the set of BED files and cuts the genomes at those coordinates. The resulting fragments can be compiled into FASTA files, where each fragment is one gene from one genome. It worked perfectly for me, but you have to be careful with the plus/minus genes and use the reverse complement for the analysis. Hope it helps.
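As a rough illustration of the coordinate step described above, here is a minimal Python sketch. It assumes (this is not stated in the answer) that blastn was run with tabular output (`-outfmt 6`), with the reference genes as the query and the genomes as the subject, so the genome coordinates are in the `sstart` and `send` columns. The function names are made up for this example, and the score column is filled with a placeholder.

```python
def blast_hit_to_bed(line):
    """Convert one BLAST tabular (-outfmt 6) hit into a BED6 line.

    Assumed column order (the default for -outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    fields = line.rstrip("\n").split("\t")
    gene, contig = fields[0], fields[1]
    sstart, send = int(fields[8]), int(fields[9])
    # On a plus/minus hit BLAST reports sstart > send.
    strand = "+" if sstart <= send else "-"
    # BLAST is 1-based inclusive; BED is 0-based, half-open.
    start = min(sstart, send) - 1
    end = max(sstart, send)
    # "0" is only a placeholder score.
    return "\t".join([contig, str(start), str(end), gene, "0", strand])


def blast_to_bed(blast_lines):
    """Convert every non-empty, non-comment BLAST line to a BED6 line."""
    return [
        blast_hit_to_bed(line)
        for line in blast_lines
        if line.strip() and not line.startswith("#")
    ]


def reverse_complement(seq):
    """Reverse complement of a DNA string, for plus/minus hits."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]
```

If the strand is written to the sixth BED column like this, `bedtools getfasta -s` should reverse-complement the minus-strand fragments for you, so the manual `reverse_complement` step is only needed if you cut the sequences some other way.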
If you only have plain genome sequence, use a gene prediction program (easy for bacteria) or, better yet, an annotation pipeline like Prokka.
Once the genes are identified and annotated, it should be simple to extract the sequences based on homology to known genes and build a phylogeny.
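For the extraction step, something like the following Python sketch might be enough. It assumes you already have a FASTA file of predicted genes from the annotation and a list of gene IDs that matched your known toxins (for example from a BLAST search); both inputs and the function names are hypothetical, not part of any particular tool.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_genes(fasta_lines, wanted_ids):
    """Return {gene_id: sequence} for records whose ID is in wanted_ids.

    The ID is taken to be the first whitespace-separated word of the header.
    """
    wanted = set(wanted_ids)
    selected = {}
    for header, seq in read_fasta(fasta_lines):
        parts = header.split()
        if parts and parts[0] in wanted:
            selected[parts[0]] = seq
    return selected
```

You would call it as `select_genes(open("genes.fasta"), ids)`, where the file name is just a placeholder, and then write the selected sequences out for alignment.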