extracting genes from highly fragmented genome
10.2 years ago
celesty • 0

Hello,

I sequenced a genome at very low coverage, so the de novo assembly is highly fragmented: the N50 is around 5,000 bp and the longest contig is ~110 kb. The genome size of the species is around 1.4 Gb.
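
For reference, N50 here means the contig length at which contigs of that length or longer hold at least half of the assembled bases. A minimal sketch of the calculation (the function name `n50` and the toy lengths are only illustrative, not taken from my assembly):

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    # Toy contig lengths, for illustration only.
    print(n50([2, 3, 4, 5, 6]))
```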

There is a distantly related species (diverged >160 million years ago) with an annotated genome, so I want to use it to identify homologous genes in my de novo assembly. I have thought of two approaches:

I can get the Ensembl protein sequences for the distantly related species and use TBLASTN to align them to my assembly. However, because the assembly is so fragmented, a single gene may be split across several contigs, and parts of it may be missing altogether. I don't know how to handle that.
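
One way to see how badly genes are split would be to group the TBLASTN hits by query protein and check how much of each protein is covered and by which contigs. The following is only a rough sketch: it assumes TBLASTN was run with the standard 12-column tabular output (`-outfmt 6`), and the function name `summarize_hits` and the `max_evalue` cutoff of 1e-5 are arbitrary choices for illustration.

```python
from collections import defaultdict


def summarize_hits(lines, max_evalue=1e-5):
    """Summarize tabular (outfmt 6) TBLASTN hits per query protein.

    Returns {protein_id: (covered_residues, sorted_contig_ids)}.
    """
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        protein, contig = fields[0], fields[1]
        # Columns 7 and 8 are the hit coordinates on the query protein.
        qstart, qend = sorted((int(fields[6]), int(fields[7])))
        # Column 11 is the e-value; drop weak hits.
        if float(fields[10]) > max_evalue:
            continue
        hits[protein].append((qstart, qend, contig))

    summary = {}
    for protein, segments in hits.items():
        segments.sort()
        covered = 0
        cur_start, cur_end = segments[0][0], segments[0][1]
        # Merge overlapping or adjacent query intervals before counting.
        for start, end, _ in segments[1:]:
            if start <= cur_end + 1:
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        covered += cur_end - cur_start + 1
        contigs = sorted({contig for _, _, contig in segments})
        summary[protein] = (covered, contigs)
    return summary
```

Dividing the covered residue count by the protein length (not present in the default tabular columns, so it would have to come from the protein FASTA) gives the fraction of each gene recovered. Proteins whose hits come from several contigs are the ones split by the fragmentation.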

Alternatively, I could align my assembled contigs to the reference genome of the related species, effectively placing the contigs in order, and then extract the genes. However, I am not familiar with which tools are suitable for whole-genome alignment between such distantly related species.

Any thoughts?

Thank you!

alignment genome gene next-gen • 2.0k views

I guess they are too distantly related for a contig-to-genome alignment. I would go for mapping the annotated peptides to your contigs. I have used Spaln for this and it worked great, but my pair of species was more closely related.
