Question: Genome assembly with RNA-seq data?
Buffo (1.8k) wrote:

Hi everybody, I have transcriptome reads and a reference genome, but the reference is fragmented (more than 20k contigs). Does anybody know whether it is possible to improve the genome assembly using the RNA-seq reads? I'm sure that it is not possible with IDBA, SOAP or SPAdes. Thanks!

modified 7 months ago • written 4.0 years ago by Buffo (1.8k)

You may be able to improve local regions with RNA-seq reads, but the chances of finding spanning reads (to bridge contigs, assuming you have paired-end transcriptome reads) would be small. Even if you do find them, there would be no way to fill in the sequence in between.

written 4.0 years ago by genomax (87k)

Since the reference is that fragmented, I guess RNA-seq can help with bridging contigs and orienting them correctly, but indeed the sequence in between will not be filled in.
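
To make the bridging idea concrete, here is a minimal sketch, assuming the paired-end RNA-seq alignments have already been reduced to one `(contig_of_mate1, contig_of_mate2)` tuple per read pair. The function name `contig_links`, the `min_support` threshold and the contig names are hypothetical, chosen only for illustration; this is not how any particular scaffolder works.

```python
from collections import Counter


def contig_links(pairs, min_support=2):
    """Count read pairs whose two mates align to different contigs.

    `pairs` is an iterable of (contig_of_mate1, contig_of_mate2) tuples,
    a hypothetical pre-parsed form of paired-end alignments.
    Returns the contig pairs supported by at least `min_support` read pairs.
    """
    counts = Counter()
    for a, b in pairs:
        if a != b:
            # Sort so that (c1, c2) and (c2, c1) count as the same link.
            counts[tuple(sorted((a, b)))] += 1
    return {link: n for link, n in counts.items() if n >= min_support}


# Toy input: two pairs bridge contig1/contig2, one pair bridges contig3/contig4.
pairs = [
    ("contig1", "contig2"),
    ("contig2", "contig1"),
    ("contig1", "contig1"),
    ("contig3", "contig4"),
]
print(contig_links(pairs))  # {('contig1', 'contig2'): 2}
```

A link like this only suggests that two contigs are adjacent; it says nothing about the sequence that lies between them, which is the limitation raised above.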

written 4.0 years ago by WouterDeCoster (44k)
h.mon (30k) wrote:

Some papers on the subject: one, two, three.

written 4.0 years ago by h.mon (30k)

Thank you so much :-)

written 4.0 years ago by Buffo (1.8k)
Buffo (1.8k) wrote:

I had forgotten about this question. In the end I used Pilon to polish my draft assembly with paired-end RNA-seq reads. It worked very well for me.

written 7 months ago by Buffo (1.8k)

I will not argue on that point, but with Pilon you will not improve the overall quality of your assembly, only the per-base accuracy.

Apart from the suggestions h.mon has already provided, I can add ABySS; it also has a scaffolding mode that uses transcript info.

modified 7 months ago • written 7 months ago by lieven.sterck (8.2k)

From Pilon:

> Pilon uses read alignment analysis to identify inconsistencies between the input genome and the evidence in the reads. It then attempts to make improvements to the input genome, including:
> Single base differences
> Small indels
> Larger indel or block substitution events
> Gap filling

Do you consider that those improvements are not an increase in quality? Really?

modified 7 months ago • written 7 months ago by Buffo (1.8k)

Yes and no.

Of course it improves the quality of the assembly, but not in the sense of improved scaffolding: the overall stats of the assembly should not change much.
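
As a rough illustration of what "overall stats" means here, the sketch below computes N50 for three made-up sets of contig lengths. All the numbers are invented for the example and are not taken from any real assembly; the point is only that small per-base fixes barely move N50, while joining contigs does.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


draft = [500, 400, 300, 200, 100]
# Hypothetical polishing result: a few small indels fixed, same contigs.
polished = [502, 399, 300, 201, 100]
# Hypothetical scaffolding result: the two largest contigs joined.
scaffolded = [900, 300, 200, 100]

print(n50(draft))       # 400
print(n50(polished))    # 399
print(n50(scaffolded))  # 900
```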

Moreover: Pilon should be used with DNA-seq reads, not RNA-seq reads! (It might work to some extent, but I think you could also get some unexpected results. Just keep that in mind.)

Having re-read the question, it does indeed not specifically mention that it should be about the overall assembly quality, but that's how I (and others, apparently) interpreted it.

modified 7 months ago • written 7 months ago by lieven.sterck (8.2k)

> Moreover: Pilon should be used with DNA-seq reads, not RNAseq reads!

Why not? Once you move away from hsa or mm (human or mouse), there are a lot of eukaryotes without splicing events (just keep that in mind); in all those cases it is valid to use Pilon. Have you ever used Pilon? Before using Pilon I tried at least 5 or 6 pipelines for scaffolding from RNA-seq reads (including those recommended in this post). None of them worked; only Pilon produced significant improvements to my assembly, and these were successfully validated biologically. However, from these results I cannot generalize that all those tools do not work for scaffolding just because they failed in my project. I suggest you do the same: Pilon works for polishing assemblies using RNA-seq reads in species free of splicing events (yes, including some eukaryotes). Best,

modified 7 months ago • written 7 months ago by Buffo (1.8k)

Pilon will not do scaffolding in the true sense of the word! I stick to that!

About using RNA-seq: yes, it will work in principle, but I would be very sceptical about the results, as RNA-seq is NOT the intended input for Pilon. In the rare case of non-spliced transcripts, yes, it will more or less resemble DNA-seq, but in all other cases (== the majority) it absolutely will not.

Imagine you do gap filling with RNA-seq, and the gap it tries to fill contains an intron and part of an exon: using RNA-seq, it will fill the gap entirely with the exon sequence it has from the RNA-seq, thus omitting the intron that should be there.
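
A toy version of that scenario, as a sketch only: the sequences, coordinates and the `splice` helper are all made up for illustration, and the "gap fill" is simulated by plain string slicing rather than by running Pilon.

```python
def splice(genomic, exons):
    """Join exon intervals (0-based, end-exclusive) into a transcript."""
    return "".join(genomic[start:end] for start, end in exons)


# Made-up locus: exon1 + intron + exon2.
exon1 = "ATGGCC"
intron = "GTAAGTTTTCAG"
exon2 = "GATTAA"
genomic = exon1 + intron + exon2
transcript = splice(genomic, [(0, 6), (18, 24)])

# Suppose the draft assembly has a gap over genomic[6:20],
# i.e. the whole intron plus the first 2 bases of exon2.
left_flank, right_flank = genomic[:6], genomic[20:]
true_fill = genomic[6:20]

# RNA-seq evidence between the same flanks only contains exon sequence.
rna_fill = transcript[len(left_flank):len(transcript) - len(right_flank)]
filled = left_flank + rna_fill + right_flank

print(true_fill)  # GTAAGTTTTCAGGA
print(rna_fill)   # GA
print(len(genomic) - len(filled))  # 12
```

The 12 missing bases are exactly the intron: the "filled" locus is identical to the transcript, not to the genome.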

written 7 months ago by lieven.sterck (8.2k)