Phylogenomic Analysis Of The Glutaredoxin Family
3
6
13.6 years ago
Matthew ▴ 60

How do I compare a gene family between two plants when one of the genomes has not yet been annotated?

Put another way, I need to find out whether Arabidopsis thaliana and Glycine max have the same number of glutaredoxin (grx) genes in each of the three subgroups.

Getting the genes for A. thaliana is pretty straightforward, but how do I find the genes in G. max?

phylogenetics gene • 2.4k views
7
13.6 years ago

I would recommend a genome comparison based on the glutaredoxin domain. You can obtain an HMM of the grx domain from the Pfam database and run a targeted genome-wide search of the Glycine max genome using HMMER3. Pfam domain information for Arabidopsis is already available from the TAIR FTP site.

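  • Step 1. Download the HMM for your domain of interest (grx).

  • Step 2. Search your genomes against this model using hmmsearch (hmmpfam was the HMMER2 program; HMMER3 replaced it with hmmscan, and hmmsearch is the usual choice for one model against many sequences).

  • Step 3. Use the E-value, domain coverage, or HMM score to find the best domain association for each query sequence.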
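As a rough illustration of Step 3, here is a minimal Python sketch that filters the per-sequence table HMMER3 writes with its `--tblout` option. It assumes the search was run along the lines of `hmmsearch --tblout hits.tbl grx.hmm soybean_proteins.fasta`, where both file names are hypothetical. The 1e-5 E-value cutoff is only an illustrative assumption, not a recommended threshold, and the example rows are made up to show the column layout.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    target: str    # sequence that matched the domain model
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def parse_tblout(text: str, max_evalue: float = 1e-5) -> list[Hit]:
    """Return hits from hmmsearch --tblout text, best score first.

    max_evalue is an illustrative cutoff; tune it for your own data.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        # 18 whitespace-separated columns, then a free-text description
        fields = line.split(maxsplit=18)
        target, evalue, score = fields[0], float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append(Hit(target, evalue, score))
    return sorted(hits, key=lambda h: h.score, reverse=True)


# Made-up rows in tblout layout (IDs and numbers are not real results).
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias ...
protA  -  Glutaredoxin  -  1.2e-30  105.3  0.1  2.0e-30  104.6  0.1  1.0  1  0  0  1  1  1  1  hypothetical protein
protB  -  Glutaredoxin  -  3.0e-02   12.1  0.0  4.0e-02   11.7  0.0  1.0  1  0  0  1  1  1  1  weak hit
protC  -  Glutaredoxin  -  5.5e-12   45.0  0.2  7.0e-12   44.7  0.2  1.0  1  0  0  1  1  1  1  another protein
"""

hits = parse_tblout(EXAMPLE)
print([h.target for h in hits])
```

Running the same filter on the Arabidopsis and soybean result tables gives two candidate lists to compare. Borderline hits such as `protB` in the example are the ones worth checking by hand.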

You may also want to look at this manuscript, which describes a similar approach for cross-genome comparison of plant genomes to identify serine protease gene products.

0

Thank you, guys. Tremendous answers, and definitely helping me get on my way.

Great article on serine proteases, with plenty of food for thought. HMMER3 looks a little user-unfriendly, but I'll give it a go and maybe compare the results of a domain search with those from sequence analysis.

cheers

5
13.6 years ago

If you go to the Phytozome database you will find that there is a preliminary annotation of genes in the soybean (Glycine max) genome. There is even a BLAST service offered that you can use to search your genes of interest against the preliminary set of protein products.

4

You can never be 100% sure - you have to use your judgement, based on the quality of the G. max genome sequence and what you know about grx genes. The FAQ states that the sequence is more than 98% complete with respect to protein-coding genes. So if the grx gene groups are conserved enough to be detectable using BLAST, it seems unlikely that you'll miss any.

0

I've already looked there, and although I can identify several genes, how do I ensure that I've got all the grx genes from Glycine max?

0
13.6 years ago

I did similar work using A. thaliana sequences to find certain disease-resistance loci in G. max. There, synteny was the key, as we had some genomic DNA sequence. If this is not available, I would use an EST database from G. max, because a very large percentage of the genes will be represented there. I expect that each grx gene in A. thaliana has one top match or ortholog in soybean. You may find that either Arabidopsis or soybean has undergone expansion of some segment of this gene family, in which case the comparison of known A. thaliana grx genes to the soybean ESTs will give something other than a one-to-one correspondence. Specifically, one Arabidopsis grx gene may match very well to a number of soy ESTs that themselves cluster into two different mRNAs. To do this similarity search, I would use Arabidopsis protein sequences.
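To make the one-to-one check concrete, here is a small Python sketch that groups BLAST hits by query and flags Arabidopsis proteins matching more than one soybean sequence. It assumes the search results were saved in BLAST's standard 12-column tabular format (`-outfmt 6`), for example from a tblastn run of Arabidopsis proteins against soybean ESTs. The sequence IDs, the example rows and the 1e-10 cutoff are all hypothetical.

```python
from collections import defaultdict


def hits_per_query(blast_tab: str, max_evalue: float = 1e-10) -> dict[str, list[str]]:
    """Map each query ID to the distinct subject IDs passing the E-value cutoff.

    Expects BLAST tabular output (outfmt 6): column 1 is the query ID,
    column 2 the subject ID, and column 11 the E-value.
    """
    grouped = defaultdict(list)
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        cols = line.split("\t")
        query, subject, evalue = cols[0], cols[1], float(cols[10])
        if evalue <= max_evalue and subject not in grouped[query]:
            grouped[query].append(subject)
    return dict(grouped)


# Made-up rows; IDs and numbers are placeholders, not real results.
ROWS = [
    ("AtGRX_1", "GmEST_001", "82.0", "100", "18", "0", "1", "100", "1", "300", "1e-50", "180"),
    ("AtGRX_1", "GmEST_002", "79.0", "100", "21", "0", "1", "100", "1", "300", "1e-45", "170"),
    ("AtGRX_2", "GmEST_003", "90.0", "100", "10", "0", "1", "100", "1", "300", "1e-60", "200"),
    ("AtGRX_2", "GmEST_004", "35.0", "40", "26", "0", "1", "40", "1", "120", "0.5", "25"),
]
EXAMPLE = "\n".join("\t".join(row) for row in ROWS)

grouped = hits_per_query(EXAMPLE)
# Queries with more than one strong hit are candidates for family expansion.
expanded = [q for q, subjects in grouped.items() if len(subjects) > 1]
print(expanded)
```

Several ESTs can derive from the same transcript, so a query flagged here is only a candidate. The ESTs would still need to be clustered into distinct mRNAs before concluding that the family has expanded.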
