Mapping exonic sequences across species
3.3 years ago

Hello there,

I need to find exons that are conserved between multiple species, such as human and mouse. I have tried liftOver, but even when I tweak the parameters to make it more sensitive, I lose most of my exons. To complement this approach, I am trying blat, and it seems to be working fine. My only problem is that this approach sounds quite old-fashioned to me. Does anyone know of other ways to find conserved exons across species? I am sure there are many ways to do this, and I would be interested to know which would be your choice as a bioinformatician.

Cheers!

PS: At the moment I only want to map 20967 exonic sequences, and blat is fast enough when I run:

blat $Genome $Query $out -t=dna -q=dna -stepSize=5 -minScore=0 -minIdentity=0 -repMatch=1000000 -noHead


If you know of a better blat configuration for this kind of mapping, that would also be very useful as a starting point for exploring this.
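Because the blat call above sets -minScore=0 and -minIdentity=0, it will report many weak hits, so some post-filtering of the PSL output is probably needed. Below is a minimal Python sketch of one way to do that, assuming headerless, tab-separated PSL (as produced with -noHead). The function names, the coverage measure (matches plus repMatches divided by query size) and the 0.8 cutoff are illustrative assumptions, not blat defaults.

```python
def parse_psl_line(line):
    """Parse one headerless PSL line (21 tab-separated columns)."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 21:
        raise ValueError("expected 21 PSL columns, got %d" % len(f))
    return {
        "matches": int(f[0]),
        "mismatches": int(f[1]),
        "rep_matches": int(f[2]),
        "strand": f[8],
        "q_name": f[9],
        "q_size": int(f[10]),
        "t_name": f[13],
        "t_start": int(f[15]),
        "t_end": int(f[16]),
    }


def best_hits(lines, min_coverage=0.8):
    """Keep, per query exon, the hit with the most matching bases.

    min_coverage is a hypothetical cutoff on the fraction of query
    bases that match the target; tune it for your own data.
    """
    best = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        hit = parse_psl_line(line)
        hit["aligned"] = hit["matches"] + hit["rep_matches"]
        if hit["aligned"] / hit["q_size"] < min_coverage:
            continue  # hit covers too little of the exon
        prev = best.get(hit["q_name"])
        if prev is None or hit["aligned"] > prev["aligned"]:
            best[hit["q_name"]] = hit
    return best
```

With a PSL file on disk, this could be called as `best_hits(open("out.psl"))`, where `out.psl` stands for whatever output path was passed to blat.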

Update: With liftOver I recovered 856 mouse exons that overlap with annotated exons, whereas with blat I was able to find 11215 mouse exons that are conserved in human. Blat was not a bad option after all!

alignment genome • 773 views

Between human and mouse, the published set is generally good enough. As the third codon position in a coding exon is the least functionally constrained, liftover can fail to capture some regions in the second species, especially between species that are phylogenetically distant. I've been mostly using liftover and blat over the years as well. With liftover, I generally follow a reciprocal best-hit strategy, i.e. conserved regions that are lifted from species 1 to species 2 must lift back to the same region, and uniquely, in the reverse direction.
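The reciprocal filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lifts have already been run and parsed into dictionaries, regions are represented as hashable tuples such as (chrom, start, end, strand), and "lifted similarly" is taken to mean exact coordinate equality. The function name is hypothetical, and in practice some coordinate tolerance may be preferable.

```python
from collections import Counter


def reciprocal_unique(forward, reverse):
    """Keep regions whose lift is unique and maps back to the source.

    forward: dict, region in species 1 -> lifted region in species 2
    reverse: dict, region in species 2 -> lifted region in species 1
    """
    # Count how many source regions land on each target region.
    target_counts = Counter(forward.values())
    kept = {}
    for src, dst in forward.items():
        if target_counts[dst] != 1:
            continue  # several regions lift to the same place
        if reverse.get(dst) == src:
            kept[src] = dst  # reverse lift returns to the start
    return kept
```

Regions that fail to lift in either direction are simply absent from the corresponding dictionary and are therefore dropped.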


Any links to this "published set"? I have only seen public lists of gene homologies, but they are not at the exon level.


This is one (https://www.ncbi.nlm.nih.gov/pubmed/22369432) and you can download the set at http://tdl.ibms.sinica.edu.tw/OrthoExon/download.html. IIRC, the annotations were hg18 and mm9, so you might need to lift it to newer versions.

Earlier this year, I was looking for ways to identify orthologous splicing events and found this paper: https://www.biorxiv.org/content/biorxiv/early/2018/03/06/277723.full.pdf. I haven't read it yet, but thought it could be of interest to you as well.


Thanks a lot Eric! These articles look very interesting, and they are exactly on point for my question. I am happy that other people are already putting effort into fine-scale orthology mapping at the exon level.

3.3 years ago
GenoMax 120k

MGI has already done this for human/mouse. Take a look here.

NCBI Homologene/Ensembl Compara have multiple gene/genome comparisons available.

BTW, there is nothing wrong with staying old-fashioned. Most of us are still routinely using blast and doing sequencing by DNA synthesis.


Hey! Thanks for your reply. However, I am looking for homologies at the exon level. Still, narrowing the search space by mapping the exons only to the homologous regions might be a good idea.
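One way to apply that idea after the fact is to keep only the exon hits that fall inside the region of the orthologous gene. The Python sketch below assumes half-open (chrom, start, end) intervals, a dictionary of candidate hits per exon, and a dictionary giving the orthologous gene region for each exon (for example, derived from a gene-level homology table). All names here are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def hits_in_homologous_region(hits, ortholog_region):
    """Keep only hits that overlap the exon's orthologous gene region.

    hits: dict, exon id -> list of (chrom, start, end) candidate hits
    ortholog_region: dict, exon id -> (chrom, start, end) of the
        orthologous gene in the other species
    """
    kept = {}
    for exon_id, candidates in hits.items():
        region = ortholog_region.get(exon_id)
        if region is None:
            continue  # no known ortholog for this exon's gene
        inside = [h for h in candidates if overlaps(h, region)]
        if inside:
            kept[exon_id] = inside
    return kept
```

Exons whose gene has no listed ortholog are dropped here. Whether to keep them for a genome-wide search instead is a design choice.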