Question: Extracting homologous proteins from genome (blat or exonerate)
8 months ago by
Germany
ricardoguerreiro2121 wrote:

Hi,

I would like to quickly extract proteins from various novel plant genomes by finding homology with documented proteins (e.g. A. thaliana), for the purpose of phylogenetic analysis.

A recent paper works with an old tool, Blat, that does just that. But the output of blat is a table of hits (with coordinates). How do I transform this into proteins? I have written a script that slices my query DNA sequence based on the hit coordinates, but this doesn't seem ideal: I would still have to translate the DNA, and there are 6 different reading frames it could be translated in.
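For reference, the six-frame problem mentioned above can be sketched in plain Python. This is only a minimal illustration, assuming the standard genetic code (NCBI table 1); the function names (`translate`, `six_frame_translations`) are made up for this example and are not part of blat or exonerate.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code (assumed: NCBI table 1), built from the TCAG codon ordering.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become 'X',
    and a trailing partial codon is dropped."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame (+1, +2, +3, -1, -2, -3) to its translation."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translations("ATGGCCTAA").items()):
        print(frame, protein)
```

Picking the right frame by hand is still error-prone, and exons split by introns are not handled here at all, which is part of why a splice-aware protein-to-genome aligner is attractive.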

Does anyone here know blat? Or any nice, easy alternative? Exonerate seems to do the same and also outputs alignments against my putative translated proteins, but I don't know how to extract anything from this format.

EDIT: I'm getting close to it with:

exonerate --model protein2genome araport_genes.pep.fasta b_repanda.fasta --showalignment no --showvulgar no --ryo ">%ti (%tab - %tae)\n%tas\n"
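The --ryo format above prints FASTA-like records: a header with the target id and alignment begin/end, followed by the target sequence. A rough sketch of reading those records back into Python follows. It assumes the output was saved to a file, that the header looks exactly like ">id (begin - end)", and that exonerate's extra lines (such as the command-line echo and the completion message) should be skipped; `parse_ryo_records` is a hypothetical helper name, not an exonerate or Biopython function.

```python
import re

# Matches headers produced by the format ">%ti (%tab - %tae)"
HEADER = re.compile(r"^>(\S+) \((\d+) - (\d+)\)\s*$")


def parse_ryo_records(lines):
    """Collect records as dicts with target id, begin, end and sequence.

    Any line that is neither a header nor purely alphabetic ends the
    current record, so log lines and blank lines are ignored.
    """
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = {
                "target": match.group(1),
                "start": int(match.group(2)),
                "end": int(match.group(3)),
                "seq": "",
            }
            records.append(current)
        elif current is not None and line.isalpha():
            # Sequence may be wrapped over several lines; join them.
            current["seq"] += line
        else:
            current = None
    return records


if __name__ == "__main__":
    example = [
        "Command line: [exonerate ...]",
        ">chr1 (100 - 160)",
        "ATGGCC",
        "TTTAAA",
        "-- completed exonerate analysis",
    ]
    for record in parse_ryo_records(example):
        print(record["target"], record["start"], record["end"], record["seq"])
```

The sketch does not assume begin < end, so it makes no statement about strand; the coordinates are stored exactly as printed.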

Cheers, Ricardo

8 months ago by
JC
Mexico
JC wrote:

From the paper mentioned:

Contig identity was assigned with Blat v.35 using translated DNA against the respective exon reference sets, selecting the highest scoring hit, and contigs with score > 20 and percentage identity > 75% were retained

The authors didn't align the nucleotides from the genome directly; they translated the contigs and matched the translations to the respective reference sets.
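The filtering described in that quote (best hit per contig, then score > 20 and identity > 75%) could look roughly like this on blat's default PSL output. This is only a sketch under assumptions: contigs are taken to be the PSL query, the score follows the simple UCSC-style formula (matches + repMatches/2 - mismatches - insert counts), and identity is approximated as matching bases over aligned bases, which may differ from what the paper actually computed. The function names are hypothetical.

```python
def parse_psl_line(line):
    """Parse one tab-separated PSL data line into a small dict."""
    f = line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
    q_num_insert, t_num_insert = int(f[4]), int(f[6])
    aligned = matches + mismatches + rep_matches
    return {
        "query": f[9],
        "target": f[13],
        "strand": f[8],
        "t_start": int(f[15]),
        "t_end": int(f[16]),
        # Assumed scoring: simplified UCSC-style PSL score.
        "score": matches + rep_matches // 2 - mismatches
                 - q_num_insert - t_num_insert,
        # Assumed identity: matching bases over aligned (ungapped) bases.
        "identity": 100.0 * (matches + rep_matches) / aligned if aligned else 0.0,
    }


def best_hits(psl_lines, min_score=20, min_identity=75.0):
    """Keep the highest-scoring hit per query, then apply both thresholds."""
    best = {}
    for line in psl_lines:
        fields = line.strip().split("\t")
        # Skip the PSL header block and any malformed lines.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        hit = parse_psl_line(line.strip())
        previous = best.get(hit["query"])
        if previous is None or hit["score"] > previous["score"]:
            best[hit["query"]] = hit
    return {
        query: hit
        for query, hit in best.items()
        if hit["score"] > min_score and hit["identity"] > min_identity
    }
```

Usage would be along the lines of `best_hits(open("hits.psl"))`, where "hits.psl" is a placeholder file name.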

For your analysis, I think you can annotate your sequences using the closest species, then use Ensembl Plants to retrieve the phylogenetic group and add your sequences to extend the phylogeny.

8 months ago by
Germany
ricardoguerreiro2121 wrote:

I think I have found my ideal answer:

Run exonerate

Then in Python:

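from Bio import SearchIO

# Each item is one QueryResult (one query protein) from exonerate's text output
qresults = SearchIO.parse("exonerate_outfile", "exonerate-text")

for qresult in qresults:
    hsp = qresult[0][0]  # first HSP of the first hit

    # Print the hit sequence of the HSP's first fragment
    print(str(hsp.hit_all[0].seq))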