Yeast reference-based genome assembly
20 hours ago
amy967107 • 0

Hello everyone!

I'm trying to perform a reference-based genome assembly using Nanopore sequencing data. The gene originates from yeast, presumably from the genus Rhodotorula, and I currently have two candidate reference genomes that could be used. Could anyone recommend a reference-based assembly pipeline or tool suitable for yeast?

I've seen potential tools like LRSDAY, which is best suited for Saccharomyces cerevisiae and may require some adjustments, and RGAAT. Would either of these be suitable? After assembling the gene, I'll need to annotate it and use it for subsequent prediction of functional enzymes.

Thanks in advance for your help!

Reference nanopore yeast genome Assembly • 2.1k views

"The gene originates from yeast,"

What does this mean? Are you putting a gene from a different genus/species into some other yeast? Are you looking to assemble only the gene of interest or the entire genome? It would help if you could describe what you are trying to do in more detail.


This is a yeast strain (probably Rhodotorula spp.) and I'm trying to assemble the entire genome from nanopore reads using a reference-based approach.


So, you aren't sure about the exact species? Then you might want to be careful with reference-based assembly in general. I recommend the following steps:

  • Generate a few assemblies with Flye and hifiasm (ONT mode) using different settings.
  • Possibly also filter the reads to a minimum length of 10 kb and see if that improves things.
  • Compare quality metrics across the resulting assemblies.
  • Check how well the assemblies align to the references. Possibly use marker genes to determine the species more precisely.
  • Optionally, run the assemblies through RagTag for scaffolding against the reference genomes.

You need to be aware that RagTag can make your assembly look almost identical to the reference, even if it isn't.
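The read-length filtering step above can be tried out before running any assembler. The following is a minimal sketch in plain Python, assuming well-formed four-line FASTQ records; the function names and the toy reads are made up for illustration, and a dedicated read-filtering tool would do the same job faster on real data.

```python
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip stray blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # the '+' separator line is not needed
        qual = next(it).rstrip("\n")
        yield header, seq, qual


def filter_min_length(records: Iterable[FastqRecord], min_len: int) -> List[FastqRecord]:
    """Keep only reads whose sequence is at least min_len bases long."""
    return [rec for rec in records if len(rec[1]) >= min_len]


def n50(lengths: Iterable[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


# Toy data: four reads of 2, 4, 6 and 8 bases (hypothetical, for illustration only).
toy_fastq = [
    "@read1", "AC", "+", "II",
    "@read2", "ACGT", "+", "IIII",
    "@read3", "ACGTAC", "+", "IIIIII",
    "@read4", "ACGTACGT", "+", "IIIIIIII",
]
reads = list(parse_fastq(toy_fastq))
kept = filter_min_length(reads, min_len=5)  # on real data this would be 10_000
bases_before = sum(len(seq) for _, seq, _ in reads)
bases_after = sum(len(seq) for _, seq, _ in kept)
print(f"reads kept: {len(kept)}/{len(reads)}, bases kept: {bases_after}/{bases_before}")
print(f"N50 before: {n50(len(s) for _, s, _ in reads)}, after: {n50(len(s) for _, s, _ in kept)}")
```

Comparing the bases kept before and after the cutoff is the useful part: if a 10 kb threshold removes most of the data, a lower threshold may be the better trade-off.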

20 hours ago

Nanopore reads are long enough to create a full assembly. If you have enough coverage (20x or more), why not first try a de novo whole-genome assembly with a tool like Flye?

Then look for your gene in the assembly contigs using BLAST or a similar tool. Please list your data volume, expected coverage, or file sizes so we can better help you.
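Once a BLAST search of the gene against the contigs has been run, picking out the best hit is easy to script. The sketch below assumes the search was run with tabular output (`-outfmt 6`) and its default twelve columns; the gene name, contig names, and hit values are invented for illustration.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    contig: str
    pident: float
    length: int
    sstart: int
    send: int
    evalue: float
    bitscore: float


def parse_outfmt6(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse default BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]),
                  int(f[8]), int(f[9]), float(f[10]), float(f[11]))


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep the highest-bitscore hit for each query sequence."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.query not in best or hit.bitscore > best[hit.query].bitscore:
            best[hit.query] = hit
    return best


# Hypothetical example rows; real input would be read from the BLAST output file.
example_rows = [
    "geneX\tcontig_3\t98.50\t1200\t10\t2\t1\t1200\t50210\t51409\t0.0\t2100",
    "geneX\tcontig_7\t85.00\t400\t50\t5\t300\t699\t1000\t1399\t1e-90\t340",
]
top = best_hits(parse_outfmt6(example_rows))
for query, hit in top.items():
    print(f"{query}: {hit.contig}:{hit.sstart}-{hit.send} "
          f"({hit.pident}% identity over {hit.length} bp)")
```

A hit that covers only part of the gene, or a gene split across two contigs, usually points to a fragmented region in the assembly and is worth checking by hand.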


I have ~2.01 Gb of ONT data and the target genome is ~20–23 Mb (90–100× coverage). My ONT read N50 is ~3.8 kb, so contiguity may be limited. Would you still recommend a de novo assembly with Flye under these conditions?
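For reference, the coverage estimate is just total sequenced bases divided by genome size. A quick sketch of that arithmetic with the figures above (the function name is made up for illustration):

```python
from typing import Tuple


def coverage_range(total_bases: float, genome_min: float, genome_max: float) -> Tuple[float, float]:
    """Return (lowest, highest) expected coverage for a genome size range."""
    # A larger genome gives lower coverage, so the bounds swap.
    return total_bases / genome_max, total_bases / genome_min


low, high = coverage_range(total_bases=2.01e9, genome_min=20e6, genome_max=23e6)
print(f"expected coverage: {low:.1f}x to {high:.1f}x")
```

With 2.01 Gb of data this works out to roughly 87x for a 23 Mb genome and about 100x for a 20 Mb genome.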

11 hours ago
Michael 56k

The new ONT mode of hifiasm may be all you need to get a good assembly. I would not recommend LRSDAY, except for its annotation part (which is specialized for S. cerevisiae), because sequencing and assembly technology has moved on since these pipelines were implemented. You could use RagTag to add some reference-based scaffolding or sequence correction. But the genomes you can achieve with recent ONT technology may be as good as or even better than the reference genomes you have.
