comparing three genomes?
6.0 years ago
kxd419 ▴ 10

Hello,

I have sequenced a bacterial genome. I want to draw a Venn diagram comparing it to two other already sequenced genomes, something like this:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2953697/figure/f2/


However, I am unsure how to go about it. Should I use my genome as a BLAST database and then BLAST the two known genomes against it?
If so, do I then take the genes present and absent in both known genomes and BLAST them against each other?

 

Kind regards,

KXD

blast gene genome • 2.1k views
6.0 years ago
5heikki 9.6k

One option would be to query the proteins of the three genomes against e.g. Pfam (with HMMER) and then extract the number of shared features between the proteomes from that. Also, the paper's text or Materials and Methods may say what they actually did there.
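
A minimal sketch of the Pfam idea, assuming each proteome was searched with `hmmscan --tblout` against Pfam and that the tabular output has the family accession in column 2 and the full-sequence E-value in column 5. The function name, the E-value cutoff, and the toy lines (including the made-up accession PF99999) are illustrative assumptions, not anything from the paper. Note that this counts shared Pfam families, not shared genes.

```python
def pfam_families(tblout_lines, max_evalue=1e-5):
    """Collect Pfam accessions (version suffix dropped) passing the cutoff."""
    families = set()
    for line in tblout_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment lines and blanks
        fields = line.split()
        accession, evalue = fields[1], float(fields[4])
        if evalue <= max_evalue:
            families.add(accession.split(".")[0])
    return families

# Toy lines, truncated to the first 7 columns; real tblout lines carry
# more columns, which this sketch ignores.
mine = [
    "# target name accession query name accession E-value score bias",
    "ABC_tran  PF00005.30 g1 - 1.2e-30 105.1 0.0",
    "HATPase_c PF02518.29 g2 - 3.0e-20  70.0 0.1",
    "Weak_hit  PF99999.1  g3 - 0.5       5.0 0.0",  # hypothetical, fails cutoff
]
ref_a = [
    "ABC_tran  PF00005.30 x1 - 4.0e-28 98.0 0.0",
    "Response_reg PF00072.27 x2 - 1.0e-25 88.0 0.0",
]
ref_b = [
    "ABC_tran  PF00005.30 k1 - 2.0e-29 101.0 0.0",
    "HATPase_c PF02518.29 k2 - 5.0e-18  64.0 0.0",
]

sets = {name: pfam_families(lines)
        for name, lines in [("mine", mine), ("ref_a", ref_a), ("ref_b", ref_b)]}

shared_all = sets["mine"] & sets["ref_a"] & sets["ref_b"]
mine_and_b_only = (sets["mine"] & sets["ref_b"]) - sets["ref_a"]
print("shared by all three:", sorted(shared_all))
print("mine and ref_b only:", sorted(mine_and_b_only))
```

The same set arithmetic gives the other Venn regions; check the column layout against your own HMMER output before relying on it.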

6.0 years ago
HG ★ 1.1k

Simple way:
1. Annotate the genomes.
2. Cluster the genes (CD-HIT, OrthoMCL, ...).
3. Find the genes shared among all genomes.
4. Draw a Venn diagram (maybe using http://bioinfogp.cnb.csic.es/tools/venny/).
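
Steps 3 and 4 can be sketched as below, assuming the clustering step has already been reduced to a mapping from cluster ID to the set of genomes that have a member in that cluster. The function name, the genome labels, and the toy clusters are hypothetical.

```python
from collections import Counter

def venn_counts(clusters, genomes):
    """Count gene clusters falling in each Venn region.

    clusters: dict of cluster id -> set of genome names with a member.
    genomes:  genome names, in the order used to build region keys.
    Returns a Counter keyed by the tuple of genomes sharing a cluster.
    """
    counts = Counter()
    for members in clusters.values():
        region = tuple(g for g in genomes if g in members)
        if region:
            counts[region] += 1
    return counts

# Hypothetical toy clusters for three genomes A, B and C.
clusters = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B"},
    "c3": {"A"},
    "c4": {"C"},
    "c5": {"A", "B", "C"},
}
counts = venn_counts(clusters, ["A", "B", "C"])
for region, n in sorted(counts.items()):
    print("+".join(region), n)
```

With three genomes there are at most seven regions, and each count is one number in the Venn diagram. A cluster is counted once even if a genome has several copies in it, so these are gene-family counts rather than gene counts.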

6.0 years ago
kxd419 ▴ 10

Hi HG,

Thanks for your reply.

The genome is annotated; however, all three genomes have different gene names.

Can you explain step two in more detail?

6.0 years ago
HG ★ 1.1k

Extract all the genes from each file > run an all-vs-all comparison (with your desired cutoff value, using CD-HIT) > you will get a list of unique sequences and shared sequences > count them and plot.

http://weizhongli-lab.org/cd-hit/
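
A rough sketch of reading the CD-HIT `.clstr` output, assuming the sequences from all three genomes were pooled into one FASTA file and each ID was first prefixed with its genome, e.g. `mine|g1`. That prefix convention, the function name, and the toy cluster file are assumptions for illustration. Because membership is decided by sequence similarity, differing gene names between annotations do not matter here.

```python
import re

# A member line looks like: "0<TAB>300aa, >mine|g1... *"
MEMBER_RE = re.compile(r">(.+?)\.\.\.")

def parse_clstr(lines, genome_of=lambda seq_id: seq_id.split("|")[0]):
    """Map each cluster name to the set of genomes with a member in it."""
    clusters = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line[1:]          # e.g. "Cluster 0"
            clusters[current] = set()
        elif current is not None:
            match = MEMBER_RE.search(line)
            if match:
                clusters[current].add(genome_of(match.group(1)))
    return clusters

# Toy .clstr content with hypothetical IDs.
example = """>Cluster 0
0\t300aa, >mine|g1... *
1\t298aa, >refA|x9... at 95.00%
2\t301aa, >refB|k2... at 92.50%
>Cluster 1
0\t150aa, >mine|g2... *
>Cluster 2
0\t210aa, >refA|x4... *
1\t210aa, >refB|k8... at 99.00%
"""

clusters = parse_clstr(example.splitlines())
for name, genomes in clusters.items():
    print(name, sorted(genomes))
```

Clusters containing one genome are unique to it, and clusters containing two or three are shared; counting them per combination gives the numbers to plot.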

 

 


I don't see this for a set of three transcriptomes; it only presents an option for comparing two nucleotide databases. Can you explain how you do this if you have three databases? Thank you.


We are talking about a "bacterial genome" here, not transcriptomes!


And what is the difference for you? It still consists of FASTA files with sequences, right? The question is how you do it for three sets of 'genes' (if you want) instead of two. The problem is that what you propose doesn't work when the gene names are not the same, or when gene copy number is expanded in one genotype versus another. Also, you need reciprocal best BLAST hits, not just all-vs-all.
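
To illustrate the reciprocal best hit point, here is a small sketch that assumes BLAST was run in both directions between two gene sets with tabular output (`-outfmt 6`), where column 1 is the query, column 2 the subject, and column 12 the bit score. Function names and the toy rows are hypothetical; each row stands for one output line split on tabs.

```python
def best_hits(rows):
    """Map each query to its highest-bit-score subject."""
    best = {}
    for row in rows:
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}

# Toy rows: a1 and a2 are paralogs in genome A; b2 resembles both.
a_vs_b = [
    ["a1", "b1", "98.0", "300", "6", "0", "1", "300", "1", "300", "1e-150", "550"],
    ["a1", "b2", "60.0", "280", "100", "4", "1", "280", "5", "284", "1e-40", "150"],
    ["a2", "b2", "55.0", "200", "90", "2", "1", "200", "1", "200", "1e-30", "120"],
]
b_vs_a = [
    ["b1", "a1", "98.0", "300", "6", "0", "1", "300", "1", "300", "1e-150", "548"],
    ["b2", "a1", "60.0", "280", "100", "4", "5", "284", "1", "280", "1e-40", "150"],
    ["b2", "a2", "55.0", "200", "90", "2", "1", "200", "1", "200", "1e-30", "120"],
]

rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(sorted(rbh))
```

In the toy data a2's best hit is b2, but b2's best hit is a1, so only a1/b1 is reported as a pair. For three genomes this would be repeated for each of the three genome pairs, and a gene is in the central region only when its pairs agree across all of them.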
