Question: how to find a particular orthologous gene in a set of eukaryotic genomes
6.3 years ago by
natasha.sernova3.7k wrote:

Dear all,

I would like to understand how to find a particular gene (its orthologs) in a set of eukaryotic genomes. The simplest way I see is to divide each chromosome into LOCUS blocks, read each block line by line, and save the block if a particular gene name is encountered.

But I see several problems. Is it correct that all these orthologs will have the same name?

I am not completely sure.

Another issue is the following: when I am reading a LOCUS block, I cannot stop reading after the first appearance of the gene name. I have to keep reading line by line to the end of the block, because a particular name can occur several times per block. If I see it a second time, the block should acquire a different weight.

The gene name is usually in quotes. That doesn't matter for a Linux search, does it?

# IN and OUT are filehandles opened elsewhere; MYGENE stands for the gene name
my $block = "";
my $blockisgood = 0;   # 0 for false, 1 for true
# read the input file line by line
while (my $line = <IN>) {
    next if $line =~ /^\s*$/;   # skip empty lines
    if ($line =~ /^LOCUS/) {    # starting a new block
        # but print the old block if it was good
        print OUT $block if $blockisgood;
        # and reset
        $block = "";
        $blockisgood = 0;
    }
    # now check for the gene we are looking for
    if ($line =~ /MYGENE/) { $blockisgood = 1; }
    $block .= $line;
}

# print the last block to the output
print OUT $block if $blockisgood;

This doesn't work for my gene. Please help me!
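The counting idea described above (a block gets a higher weight when the name occurs more than once) could be sketched as follows. This is only an illustration, assuming GenBank-style text where every record starts with a LOCUS line and the gene name appears in double quotes; the function name score_blocks and the sample records are made up.

```perl
use strict;
use warnings;

# Split GenBank-style text into LOCUS blocks and count how often a quoted
# gene name occurs in each block. Returns [block_text, count] pairs for
# the blocks with at least one occurrence.
sub score_blocks {
    my ($text, $gene) = @_;
    my @scored;
    # Split in front of every line that starts with LOCUS (zero-width match).
    for my $block (split /^(?=LOCUS)/m, $text) {
        next unless $block =~ /^LOCUS/;
        # \Q...\E escapes regex metacharacters in the name; the double
        # quotes are ordinary characters inside a regex.
        my $count = () = $block =~ /"\Q$gene\E"/g;
        push @scored, [$block, $count] if $count > 0;
    }
    return @scored;
}

# Hypothetical two-record input, for illustration only.
my $demo = join "\n",
    'LOCUS       CHR1_A',
    '     /gene="abcD"',
    '     /locus_tag="X1"',
    '     /gene="abcD"',
    'LOCUS       CHR1_B',
    '     /gene="xyzE"',
    '';

for my $hit (score_blocks($demo, 'abcD')) {
    my ($block, $count) = @$hit;
    my ($locus) = $block =~ /^LOCUS\s+(\S+)/;
    print "$locus\t$count\n";
}
```

Matching the name together with its quotes avoids partial hits such as abcD2; on the command line the same pattern needs the quotes protected from the shell, for example by wrapping the whole pattern in single quotes.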

Thank you very much!





Genes do not necessarily have the same name (even though annotators try to keep names consistent as far as possible). It depends on the genomes you use.

If I understand correctly, you are trying to analyse syntenic regions. If the information (i.e. the gene name) is not shared between your different genomes, you have to verify that the genes are orthologs. The most accurate (but most difficult) way to do that is a phylogenetic approach. Most people prefer to use sequence similarity to define (assume) the relationship between the sequences. It is much easier to set up but a bit less accurate.
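One common form of the similarity approach is the reciprocal best hit. A minimal sketch, assuming BLAST was run in both directions with tabular output (-outfmt 6, where columns 1, 2 and 12 are query, subject and bit score); the function names and the sample hits are hypothetical:

```perl
use strict;
use warnings;

# Keep the highest-bitscore subject for each query from BLAST tabular lines
# (12 tab-separated columns; query = col 1, subject = col 2, bitscore = col 12).
sub best_hits {
    my (@lines) = @_;
    my (%best, %score);
    for my $line (@lines) {
        chomp(my $l = $line);
        next if $l =~ /^#/ or $l !~ /\S/;   # skip comments and blank lines
        my @f = split /\t/, $l;
        next unless @f >= 12;
        my ($query, $subject, $bits) = @f[0, 1, 11];
        if (!exists $score{$query} or $bits > $score{$query}) {
            $best{$query}  = $subject;
            $score{$query} = $bits;
        }
    }
    return \%best;
}

# A pair is a reciprocal best hit when each gene is the best hit of the other.
sub reciprocal_best_hits {
    my ($a_vs_b, $b_vs_a) = @_;
    my @pairs;
    for my $gene_a (sort keys %$a_vs_b) {
        my $gene_b = $a_vs_b->{$gene_a};
        push @pairs, [$gene_a, $gene_b]
            if defined $b_vs_a->{$gene_b} and $b_vs_a->{$gene_b} eq $gene_a;
    }
    return @pairs;
}

# Hypothetical hits, 12 columns each; only columns 1, 2 and 12 matter here.
my @a_vs_b = map { join("\t", $_->[0], $_->[1], (0) x 9, $_->[2]) . "\n" }
    (['A1', 'B1', 200], ['A1', 'B2', 150], ['A2', 'B2', 180]);
my @b_vs_a = map { join("\t", $_->[0], $_->[1], (0) x 9, $_->[2]) . "\n" }
    (['B1', 'A1', 210], ['B2', 'A1', 160], ['B2', 'A2', 120]);

for my $pair (reciprocal_best_hits(best_hits(@a_vs_b), best_hits(@b_vs_a))) {
    print join("\t", @$pair), "\n";
}
```

Reciprocal best hits are only a proxy for orthology: they can miss true orthologs after gene duplication, which is where the phylogenetic approach is more reliable.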

If your genomes are well known (for example, present in the Ensembl database), you can also use the database's annotated relationships between the sequences of different species.

written 6.3 years ago by Juke344.8k

Yes, you are absolutely right! I am trying to analyse syntenic regions. Could you please give me some details on how to use a phylogenetic approach, as the most reliable one? What tools exist for doing that? Will Ensembl help with different ortholog names? I don't need just the closest left and right neighbours; I would like to see at least a few genes to the left and to the right. How do I do this correctly? Many thanks!

2014-07-22 12:57 GMT+04:00 Juke-34 on Biostar <>:

written 6.3 years ago by natasha.sernova3.7k

I can advise you to read this:

I believe I have already seen automated tools for syntenic region analysis/detection presented at several conferences. I think you should spend more time on the literature.

If your genomes are in the Ensembl database and you program in Perl, it should not be too difficult to write a pipeline that does what you want. For each gene, it is possible to get its location and the list of orthologous/paralogous genes.
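Once gene locations are available (from Ensembl or any other annotation), picking a few genes to the left and right of a gene is a sort and a slice. A minimal sketch in plain Perl, not using the Ensembl API; the hash layout (chr, start, name), the function name flanking_genes and the sample genes are assumptions for illustration:

```perl
use strict;
use warnings;

# Return up to $k gene names on each side of $target on the same chromosome.
# Each gene is a hash ref with keys chr, start and name (hypothetical layout).
sub flanking_genes {
    my ($genes, $target, $k) = @_;
    my ($hit) = grep { $_->{name} eq $target } @$genes;
    return unless $hit;   # target gene not found
    # Genes on the same chromosome, ordered by start position.
    my @same_chr = sort { $a->{start} <=> $b->{start} }
                   grep { $_->{chr} eq $hit->{chr} } @$genes;
    my ($idx) = grep { $same_chr[$_]{name} eq $target } 0 .. $#same_chr;
    # Clamp the window to the ends of the chromosome.
    my $lo = $idx - $k < 0          ? 0          : $idx - $k;
    my $hi = $idx + $k > $#same_chr ? $#same_chr : $idx + $k;
    my @left  = map { $_->{name} } @same_chr[$lo .. $idx - 1];
    my @right = map { $_->{name} } @same_chr[$idx + 1 .. $hi];
    return (\@left, \@right);
}

# Hypothetical annotation, deliberately unsorted.
my @genes = (
    { chr => 'chr1', start => 500, name => 'g5' },
    { chr => 'chr1', start => 100, name => 'g1' },
    { chr => 'chr2', start => 250, name => 'h1' },
    { chr => 'chr1', start => 300, name => 'g3' },
    { chr => 'chr1', start => 200, name => 'g2' },
    { chr => 'chr1', start => 400, name => 'g4' },
);

my ($left, $right) = flanking_genes(\@genes, 'g3', 2);
print "left: @$left\nright: @$right\n";
```

Comparing the flanking lists of an ortholog pair across two genomes (after mapping each neighbour to its own ortholog) gives a simple check of whether the region is syntenic.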

written 6.3 years ago by Juke344.8k

Thank you, it's a very nice paper! I will ask for their code.

2014-07-25 18:07 GMT+04:00 Juke-34 on Biostar <>:

written 6.3 years ago by natasha.sernova3.7k
6.3 years ago by
Josh Herr5.7k
University of Nebraska
Josh Herr5.7k wrote:

This is a common question here; it looks like you didn't search much before posting your question.

This question (What Is The Best Method To Find Orthologous Genes Of A Species?) is a great place to start. You'll want to adapt the answers there to your different genomes, but a one-to-one BLAST would be the way to go.

These questions may also help you: How Can I Identify Orthologous Contigs Between Two De Novo Transcriptome Assemblies?, Identify Common Orthologs In 3 Genome, and How to get all the orthologous genes between two species.

You should also check out OrthoMCL, among other tools.

modified 6.3 years ago • written 6.3 years ago by Josh Herr5.7k

Yes, you are right, I didn't search a lot, sorry! You gave me a perfect link as an example, I will try to use their approaches. Many thanks!

2014-07-22 18:00 GMT+04:00 Josh Herr on Biostar <> :

written 6.3 years ago by natasha.sernova3.7k