Question: retro-engineered annotation on genome assembly
4.0 years ago, guillaume.rbt wrote:

Hi everyone,

I'm working on a fungal species for which I have two genome assemblies, built from two different strains, and one annotation for each assembly.

By comparing the annotated peptide sequences of the two strains with a BDBH (bidirectional best hit) analysis, I get some proteins common to both strains and some specific to each strain.

I know, from blasting them against the genome assemblies, that most of the "strain-specific" genes are nevertheless also present in the other strain's genome (most likely because different annotation software was used).

What I would like to do is retrieve the sequences of one strain's specific genes from the other strain's genome, so that I can complete the annotation.

Would anybody have a clue on how to do such a thing?


4.0 years ago, Bill Pearson wrote:

A possible strategy:

(1) Run blastp with all fungus1 proteins against all fungus2 proteins, and vice versa. Find the proteins in fungus1 that have no significant hits in fungus2 (possibly applying a percent identity and coverage threshold), or whose hits cover only part of the protein, and vice versa.

(2) Take the proteins in fungus1 and fungus2 that do not have a match in the other fungus, and search them with tblastn (or tfastx) against the other fungal genome assembly. I would expect that many of the fungal proteins that do not match, or match only partially, will be found by the tblastn (tfastx) search. tfastx will be slower but much less sensitive to frameshift errors in the assembly.
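
The filtering in step (1) could be sketched roughly as below. This is only an illustration, not a tested pipeline: it assumes the blastp results were written in tabular form with a custom column list (for example `-outfmt "6 qseqid sseqid pident length qlen qstart qend evalue"`), and the function name `unmatched_proteins`, the thresholds, and the protein IDs are all made up for the example. Coverage is taken from the single alignment on each line, which is a simplification.

```python
# Column layout ASSUMED for the BLAST tabular output, e.g. produced with
#   -outfmt "6 qseqid sseqid pident length qlen qstart qend evalue"
COLUMNS = ["qseqid", "sseqid", "pident", "length",
           "qlen", "qstart", "qend", "evalue"]


def unmatched_proteins(all_ids, blast_lines,
                       min_identity=50.0, min_coverage=0.8, max_evalue=1e-5):
    """Return the query IDs with no hit passing all three thresholds.

    all_ids     -- every protein ID of the query strain (hypothetical input)
    blast_lines -- iterable of tab-separated lines in the COLUMNS layout
    The default thresholds are illustrative, not recommendations.
    """
    matched = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(COLUMNS, line.split("\t")))
        # Fraction of the query protein spanned by this single alignment.
        coverage = (int(row["qend"]) - int(row["qstart"]) + 1) / int(row["qlen"])
        if (float(row["pident"]) >= min_identity
                and coverage >= min_coverage
                and float(row["evalue"]) <= max_evalue):
            matched.add(row["qseqid"])
    # Keep the input order so the result is easy to compare with the FASTA file.
    return [pid for pid in all_ids if pid not in matched]


if __name__ == "__main__":
    # Toy data: p1 has a full-length hit, p2 only a partial hit, p3 no hit.
    ids = ["f1_p1", "f1_p2", "f1_p3"]
    lines = [
        "f1_p1\tf2_p9\t95.0\t200\t200\t1\t200\t1e-100",
        "f1_p2\tf2_p4\t90.0\t60\t300\t1\t60\t1e-20",
    ]
    print(unmatched_proteins(ids, lines))
```

The IDs returned here (proteins with no hit or only a partial hit) are the ones whose sequences would then go into the tblastn or tfastx search of step (2) against the other strain's assembly.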


Thank you for your answer, Bill. I will try that.

written 4.0 years ago by guillaume.rbt