Question: How can I transfer gene models to a new assembly?
O.rka wrote (7 months ago):

Here's my data:

sample_A: Canonical assembly with gene models (sample_A.fasta, sample_A.gff3)

sample_B: Mutant, de novo assembly with no gene models (sample_B.fasta)

I want to transfer the gene models from sample_A to sample_B.

I thought this would be straightforward, but it definitely is not. In some cases exon_2 maps before exon_1, and in others a particular exon maps multiple times on the de novo assembly.

Is there a tool that will do this? Ideally, I would like a tool that does the following:

program --ref_assembly sample_A.fasta --ref_annotations sample_A.gff3 --query_assembly sample_B.fasta --percent_identity 0.98 > sample_B.gff3

Here is an example of an edge case I hit when mapping the exons of transcript FUN_000463-T1 (from sample_A.gff3 and sample_A.fasta) to the new assembly (sample_B.fasta). Notice the exon ordering.

[Image: exons of FUN_000463-T1 mapped onto sample_B, followed by zoomed-in views of the left side and the right side]
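
For anyone who wants to flag cases like this before trusting a transferred model, here is a minimal Python sketch. It is not taken from any of the tools discussed in this thread. It assumes you already have, for one transcript, a list of exon hits on the new assembly as `(exon_number, contig, start, end, strand)` tuples (a hypothetical format), and that exon numbers follow transcription order, so starts should increase on the `+` strand and decrease on the `-` strand. The function name and the example coordinates are made up for illustration.

```python
from collections import defaultdict


def check_exon_mappings(hits):
    """Return a list of problems for one transcript's exon hits.

    hits: iterable of (exon_number, contig, start, end, strand) tuples
    (hypothetical format, one tuple per placement on the new assembly).
    """
    by_exon = defaultdict(list)
    for exon, contig, start, end, strand in hits:
        by_exon[exon].append((contig, start, end, strand))

    problems = []

    # Report exons that have more than one placement.
    for exon, placements in sorted(by_exon.items()):
        if len(placements) > 1:
            problems.append(f"exon_{exon} maps {len(placements)} times")

    # The order check only uses exons with exactly one placement.
    unique = {e: p[0] for e, p in by_exon.items() if len(p) == 1}
    contigs = {p[0] for p in unique.values()}
    strands = {p[3] for p in unique.values()}

    if len(contigs) > 1:
        problems.append("exons map to more than one contig")
    elif len(strands) > 1:
        problems.append("exons map to both strands")
    elif unique:
        # Starts listed in exon-number order should already be sorted:
        # ascending on '+', descending on '-'.
        starts = [unique[e][1] for e in sorted(unique)]
        expected = sorted(starts, reverse=(strands == {"-"}))
        if starts != expected:
            problems.append("exon order differs from the reference transcript")

    return problems


# Made-up coordinates: exon_2 lands upstream of exon_1, exon_3 maps twice.
example_hits = [
    (1, "ctg1", 5000, 5200, "+"),
    (2, "ctg1", 1000, 1300, "+"),
    (3, "ctg1", 6000, 6100, "+"),
    (3, "ctg1", 9000, 9100, "+"),
]
print(check_exon_mappings(example_hits))
```

A check like this only tells you which transcripts need a closer look. It does not decide which placement is the right one.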

Tags: assembly, gene

You can try RATT. Success will depend on the quality of your assemblies.

Written 7 months ago by genomax

Thank you. I'm looking at it right now and it's pretty confusing to run: https://vcru.wisc.edu/simonlab/bioinformatics/programs/ratt/Documentation.html. I installed it with conda, but a lot of the files appear to be missing. I also found this tutorial: http://avrilomics.blogspot.com/2013/02/using-ratt-to-transfer-gene-predictions.html

Do you know of any other tools for this? I've heard of liftover, but there is little documentation on using it with a new organism.

Written 7 months ago by O.rka

I've updated my question a bit to be more specific.

Written 7 months ago by O.rka
Juke34 (Sweden) wrote (7 months ago):

There is a list of tools in Table 5 of this publication: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450745/

If you need the transcripts, you can extract them from your GFF, e.g. with AGAT:
agat_sp_extract_sequences.pl -g infile.gff -f infile.fasta --cdna
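
To make it clearer what such an extraction starts from, here is a small Python sketch that collects exon coordinates per transcript from GFF3 lines. This is not AGAT's implementation, just an illustration of the first step. It assumes standard 9-column, tab-separated GFF3 with exons linked to transcripts through the `Parent` attribute. The function name and the example records are hypothetical.

```python
from collections import defaultdict


def exons_by_transcript(gff3_lines):
    """Group exon features by their Parent transcript ID.

    Returns {transcript_id: [(seqid, start, end, strand), ...]} with exons
    sorted by start. Coordinates stay 1-based and inclusive, as in GFF3.
    """
    exons = defaultdict(list)
    for line in gff3_lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        # Column 9 holds key=value pairs separated by ';'.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # An exon may list several parents, separated by ','.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((seqid, start, end, strand))
    for parent in exons:
        exons[parent].sort(key=lambda e: e[1])
    return dict(exons)


# Made-up records using the transcript ID from the question.
example_gff3 = [
    "##gff-version 3",
    "ctg1\texample\tgene\t100\t900\t.\t+\t.\tID=FUN_000463",
    "ctg1\texample\texon\t500\t900\t.\t+\t.\tID=e2;Parent=FUN_000463-T1",
    "ctg1\texample\texon\t100\t300\t.\t+\t.\tID=e1;Parent=FUN_000463-T1",
]
print(exons_by_transcript(example_gff3))
```

Building the actual transcript sequence would additionally require slicing the FASTA and reverse-complementing minus-strand features, which is the part the AGAT script handles for you.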

Although it is not listed in the publication, you can also use MAKER. See Basic Protocol 4, "Mapping Annotations to a New Assembly", in "Genome Annotation and Curation Using MAKER and MAKER-P".

Written 7 months ago by Juke34 (modified 6 months ago)

Thank you for the suggestions! I will continue to look through these. It looks like "CESAR" is the most recent of these tools (2016). I've had issues running older tools that haven't been maintained in a while. I'm looking at "transMap" right now, but it's a bit confusing. Is transMap part of https://github.com/ComparativeGenomicsToolkit/Comparative-Annotation-Toolkit ? I haven't seen any tutorials describing exactly how to do this. I'm a bit new to these suites, as I'm more familiar with funannotate.

Written 7 months ago by O.rka