Question

How do you extract genes from genomes?

4.0 years ago

AbdelAbdel ▴ 30

I'm working on a bacterium that secretes three types of toxins. My goal is to extract the genes encoding these toxins so that I can run phylogenetic analyses on them. Any help with this task would be appreciated. Thank you.

genome sequencing assembly • 607 views

updated 3.3 years ago by Biostar 20 • written 4.0 years ago by AbdelAbdel ▴ 30

I'm actually creating a pipeline for this. What I've done is run blastn with the sequences of the reference genes against the genomes (multi-FASTA in both cases), then loop over the files generated by blastn (using R, a UNIX shell, or another tool) to get the coordinates of the genes on each genome and classify each hit as plus/plus or plus/minus. I collect those in a matrix and create BED files for each pair of coordinates. Bedtools can be used from UNIX/Linux and has a "getfasta" command; it works over the set of BED files and cuts the genomes at those coordinates. Those fragments can be compiled into FASTA files, where each fragment is one gene from one genome. It worked perfectly for me, but you have to be careful with the plus/minus genes and use the reverse complement for the analysis. Hope it helps.
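The coordinate step described above can be sketched in Python. This is only an illustration, not the pipeline from this reply: it assumes blastn was run with tabular output (`-outfmt 6`), with the reference genes as queries and the genomes as subjects, and the function names and example hits (`toxA`, `genome1_contig3`, and so on) are made up.

```python
def blast_hit_to_bed(line):
    """Convert one BLAST tabular (outfmt 6) line into a BED6 tuple.

    Assumed column order (the outfmt 6 default): qseqid sseqid pident
    length mismatch gapopen qstart qend sstart send evalue bitscore.
    """
    fields = line.rstrip("\n").split("\t")
    qseqid, sseqid = fields[0], fields[1]
    sstart, send = int(fields[8]), int(fields[9])
    bitscore = fields[11]
    # On a plus/minus hit BLAST reports sstart > send on the subject.
    strand = "+" if sstart <= send else "-"
    # BLAST coordinates are 1-based inclusive; BED is 0-based, half-open.
    start = min(sstart, send) - 1
    end = max(sstart, send)
    return (sseqid, start, end, qseqid, bitscore, strand)


def blast_to_bed(lines):
    """Turn BLAST tabular lines into tab-separated BED rows."""
    rows = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        rows.append("\t".join(str(x) for x in blast_hit_to_bed(line)))
    return rows


# Hypothetical hits: one plus/plus and one plus/minus.
example = [
    "toxA\tgenome1_contig3\t98.5\t900\t13\t0\t1\t900\t1201\t2100\t0.0\t1500",
    "toxB\tgenome1_contig7\t97.0\t600\t18\t0\t1\t600\t5600\t5001\t0.0\t1000",
]
for row in blast_to_bed(example):
    print(row)
```

With the strand recorded in the sixth BED column, `bedtools getfasta` can be run with its `-s` option so that minus-strand features are reverse-complemented during extraction, which should take care of the plus/minus case mentioned above.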

3.3 years ago by cechersa • 0

score 1 · Answer 1 · 2020-04-28

4.0 years ago

GenoMax 141k

If you have plain sequence, use a gene prediction program (easy for bacteria) or, better yet, an annotation pipeline like Prokka.

Once the genes are identified and annotated, it should be simple to extract sequences based on homology to known genes and create a phylogeny.
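As a rough illustration of the extraction step, here is a minimal Python sketch that pulls records out of a FASTA file of annotated genes by matching keywords in the header. Note that this is plain keyword matching on the annotation text, a simpler stand-in for a real homology search. It assumes the annotation tool writes a nucleotide FASTA whose headers contain a product description; the function names, locus tags, and product names below are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def select_by_keyword(records, keywords):
    """Keep records whose header contains any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [(h, s) for h, s in records if any(k in h.lower() for k in wanted)]


# Hypothetical annotated gene set; real headers depend on the tool used.
annotated = """\
>GENE_00001 hypothetical protein
ATGAAACGT
>GENE_00002 Toxin A
ATGGCTGCA
TTTTAA
>GENE_00003 DNA gyrase subunit A
ATGCCCGGG
""".splitlines()

toxins = select_by_keyword(read_fasta(annotated), ["toxin"])
for header, seq in toxins:
    print(f">{header}\n{seq}")
```

The selected sequences from each genome could then be aligned and used to build the phylogeny.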

4.0 years ago by GenoMax 141k