Question: aligning scaffolds to the reference genome
2.8 years ago by
arta540
Sweden
arta540 wrote:

Hi all,

I assembled chloroplast reads with ABySS and got a chloro-scaffolding.fa file, which consists of many scaffolds. I also have a reference chloroplast genome. I would like to align the scaffolds to the reference genome and annotate the genes. Can you recommend tools or a workflow for this? I am currently working with exonerate, but it seems I cannot annotate genes with it. The exonerate output is also unclear to me, and I do not know how to use it for downstream analysis.

2.8 years ago by
apa@stowers440
Kansas City
apa@stowers440 wrote:

It sounds like you are assembling mRNA-seq reads into transcripts and trying to align these to produce gene models?

If so, I would typically run exonerate like this: "exonerate -m est2genome --revcomp --bestn 1 --showcigar --showtargetgff -t chloroplast.fa -q scaffolds.fa > scaffolds.out 2> scaffolds.err", then extract the GFF lines from scaffolds.out. However, this will not generate CDS features in the GFF.
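For the "extract the GFF lines" step, here is a minimal Python sketch. It assumes the exonerate output wraps each GFF block between "# --- START OF GFF DUMP ---" and "# --- END OF GFF DUMP ---" marker lines, which is what --showtargetgff output normally looks like; check your own scaffolds.out first. The function names and the scaffolds.gff output name are made up for illustration.

```python
GFF_START = "# --- START OF GFF DUMP ---"
GFF_END = "# --- END OF GFF DUMP ---"


def extract_gff_lines(lines):
    """Yield GFF feature lines found between exonerate's GFF dump markers."""
    in_gff = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(GFF_START):
            in_gff = True
        elif line.startswith(GFF_END):
            in_gff = False
        elif in_gff and line and not line.startswith("#"):
            # Inside a dump block: keep feature rows, skip comment/header rows.
            yield line


def write_gff(in_path, out_path):
    """Copy only the GFF feature lines from an exonerate output file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for feature in extract_gff_lines(src):
            dst.write(feature + "\n")
```

Usage would be something like write_gff("scaffolds.out", "scaffolds.gff"). Comment lines inside the dump are dropped, so add a GFF header yourself if a downstream tool needs one.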

If CDS annotations are important, then first run ORF prediction on your sequences and produce a 4-column, space-delimited file ("CDS.txt"), one row per scaffold, containing these 4 values: scaffold ID, strand of ORF (+ or -), scaffold ORF start (1-based), and ORF length (the exonerate man page gives the fields as <id> <strand> <cds_start> <cds_length>). With this file, you can run exonerate like this: "exonerate -m cdna2genome --annotation CDS.txt --revcomp --bestn 1 --showcigar --showtargetgff -t chloroplast.fa -q scaffolds.fa > scaffolds.out 2> scaffolds.err". The GFF will now include CDS lines.
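If you do not have an ORF predictor at hand, the annotation file can be roughed out with a few lines of Python. This is only a sketch under several assumptions: it takes the single longest ATG-to-stop ORF per scaffold (real chloroplast genes can use other start codons, and a scaffold may carry several genes), it reports a minus-strand ORF by the position of its leftmost base on the forward strand, and it writes the fourth column as the CDS length with the stop codon included, which is how the exonerate man page describes the --annotation file as far as I can tell. Please check those conventions against your installed version. All function names are made up.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a DNA string (non-ACGT characters are left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_one_strand(seq):
    """Return (start0, length) of the longest ATG..stop ORF, or None."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                length = i + 3 - start  # stop codon included
                if best is None or length > best[1]:
                    best = (start, length)
                start = None
    return best


def longest_orf(seq):
    """Return (strand, start_1based, length) over both strands, or None."""
    candidates = []
    fwd = longest_orf_one_strand(seq)
    if fwd:
        candidates.append(("+", fwd[0] + 1, fwd[1]))
    rev = longest_orf_one_strand(revcomp(seq))
    if rev:
        # Assumption: report the leftmost base in forward-strand coordinates.
        start0 = len(seq) - (rev[0] + rev[1])
        candidates.append(("-", start0 + 1, rev[1]))
    return max(candidates, key=lambda c: c[2]) if candidates else None


def read_fasta(path):
    """Yield (id, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def write_annotation(fasta_path, out_path):
    """Write one '<id> <strand> <start> <length>' row per scaffold with an ORF."""
    with open(out_path, "w") as out:
        for name, seq in read_fasta(fasta_path):
            orf = longest_orf(seq)
            if orf:
                out.write("{} {} {} {}\n".format(name, *orf))
```

Calling write_annotation("scaffolds.fa", "CDS.txt") would then give a file to pass to --annotation. A dedicated ORF or gene predictor will do a better job than this; the sketch is mainly meant to show the file layout.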

Depending on the results, you may want to change the default values of --refine, --minintron, and --maxintron. Exonerate is a parameter jungle, so you may find other useful options, but these are the ones I typically use.


Thank you!! That will help a lot.

written 2.8 years ago by arta540