Compare A Denovo Assembly To The Reference Genome
10.4 years ago
Rohit ★ 1.5k

Hello.

I am working on a genome assembly pipeline and have been using human paired-end NGS data.

I want to compare my de novo assembly to the available human reference genome, which is based mainly on Sanger sequencing. How can I compare my assembly to the reference in order to check:

  1. How much of the genome was covered in the denovo assembly?

  2. How many regions of the genome have I missed in the denovo contigs?

  3. Do I compare the contigs or scaffolds to the reference which is ordered as chromosomes?

I know Quast is used to assess the quality of genome assemblies, but what about comparing a de novo assembly to a reference genome?

10.4 years ago
cts ★ 1.7k

Quast also works with a reference genome: just set the reference with -R. According to their manual, Quast reports metrics that will answer points 1 and 2. As for point 3, Quast has a --scaffolds option which, if given scaffolds, will break them back into contigs. You could run it with and without the --scaffolds option to see the difference in the metrics. There is also ALE for looking at assembly metrics, although I'm not very familiar with it.
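To make the contigs-versus-scaffolds point concrete, here is a minimal Python sketch of what "breaking scaffolds back into contigs" amounts to: cutting each scaffold at runs of N. This is not Quast's own code. The function name `split_scaffold` is hypothetical, and the threshold of 10 consecutive Ns is an assumption, so check the manual of your Quast version for the rule it actually applies.

```python
import re


def split_scaffold(sequence, min_gap=10):
    """Split a scaffold into contigs at runs of at least `min_gap` N characters.

    min_gap=10 is an assumed threshold, not a value taken from the thread.
    Shorter runs of N are left inside the contig.
    """
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    # re.split can return empty strings when a scaffold starts or ends with a gap.
    return [piece for piece in gap.split(sequence) if piece]


# Toy scaffold: a 12-N gap (long enough to split) and a 3-N gap (kept).
scaffold = "ACGT" + "N" * 12 + "GGCC" + "N" * 3 + "TTAA"
contigs = split_scaffold(scaffold)
print(contigs)
```

Comparing metrics on the split and unsplit sequences shows how much of the apparent contiguity comes from scaffolding rather than from the contigs themselves.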


I'm afraid Quast is not well suited to bigger genomes. It is good for bacterial genomes, I guess.

10.4 years ago
lh3 33k

There are quite a few multi-genome aligners. Mauve is a recent one which is used extensively in the Assemblathon. Bwa-mem and bwa-sw can do pairwise alignment if you prefer SAM as the output. It seems to me that Quast is primarily designed for small genomes: it calls MUMmer to do the actual alignment. As I remember, MUMmer does not work with genomes longer than 512Mb, and even if it does, it will require a huge amount of RAM to hold the suffix tree. I do not know if Quast is able to split the genome.
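If the aligner cannot take the whole reference at once, one possible workaround is to group the reference chromosomes into batches that each stay under a size cap and align against one batch at a time. The Python sketch below only illustrates that grouping. The helper names `read_fasta_lengths` and `batch_by_size` are hypothetical, the cap is whatever limit applies to your tool (the 512Mb figure above is a recollection, not a verified limit), and contigs that match sequence in more than one batch would need their alignments reconciled afterwards.

```python
def read_fasta_lengths(lines):
    """Return {sequence_name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def batch_by_size(lengths, max_bases):
    """Greedily group sequences, in input order, into batches of at most max_bases.

    A single sequence longer than max_bases still gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for name, length in lengths.items():
        if current and current_size + length > max_bases:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += length
    if current:
        batches.append(current)
    return batches


# Toy reference with three tiny "chromosomes" and a cap of 20 bases.
toy_fasta = [">chr1 toy", "ACGTACGT", "ACGT", ">chr2", "ACGTAC", ">chr3", "ACGTACGTAC"]
lengths = read_fasta_lengths(toy_fasta)
batches = batch_by_size(lengths, max_bases=20)
print(lengths)
print(batches)
```

For a real reference you would pass an open file handle instead of the toy list and write each batch out as its own FASTA file.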


Can BWA do pairwise alignment of two genomes? I thought it was used for read-to-reference alignment.


The first BWA algorithm only works with short reads. You can align bacterial genomes with bwa-mem and bwa-sw. You can also align ~1Mbp contigs to the human genome with them.


Quast uses MUMmer and I don't think it splits the genome in any way. It runs until you are out of memory, no matter how much RAM you have (~800 GB).

The problem I had with Mauve was that I did not know how to use it from the command line. It's best to figure it out now.

Or I could test it with BWA, though I don't know how well I can use the SAM output.
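On making use of the SAM output: questions 1 and 2 from the original post can be approximated directly from the contig alignments. The Python sketch below reads SAM text lines, converts each mapped record into a reference interval using POS and the CIGAR string, merges overlapping intervals, and reports the covered fraction and the uncovered gaps. It is a minimal illustration rather than a replacement for a proper tool: the function names are hypothetical, the toy records are made up, deletions inside an alignment are counted as covered span, and no filtering on mapping quality or secondary alignments is applied.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(pos, cigar):
    """Return the 0-based, half-open reference interval of one alignment."""
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1  # SAM POS is 1-based
    return start, start + length


def covered_intervals(sam_lines):
    """Return {reference_name: merged list of (start, end)} for mapped records."""
    spans = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 4 or cigar == "*":
            continue  # unmapped record
        spans[rname].append(reference_span(pos, cigar))
    merged = {}
    for rname, intervals in spans.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)  # overlapping or adjacent
            else:
                out.append([start, end])
        merged[rname] = [tuple(iv) for iv in out]
    return merged


def uncovered_intervals(merged, ref_length):
    """Return the gaps between merged intervals on a reference of ref_length."""
    gaps, cursor = [], 0
    for start, end in merged:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < ref_length:
        gaps.append((cursor, ref_length))
    return gaps


# Made-up records on a 100 bp toy reference.
toy_sam = [
    "@SQ\tSN:chr1\tLN:100",
    "contig1\t0\tchr1\t1\t60\t30M\t*\t0\t0\t*\t*",
    "contig2\t16\tchr1\t21\t60\t20M5D10M\t*\t0\t0\t*\t*",
    "contig3\t0\tchr1\t81\t60\t5S10M\t*\t0\t0\t*\t*",
    "contig4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
merged = covered_intervals(toy_sam)
covered = sum(end - start for start, end in merged["chr1"])
gaps = uncovered_intervals(merged["chr1"], ref_length=100)
print(merged, covered / 100, gaps)
```

On a real human-sized alignment you would stream the SAM file line by line and take the reference lengths from the `@SQ` header lines instead of hard-coding them.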

10.4 years ago
HG ★ 1.2k

For question 1, you can have a look at the GAGE-B paper (Supplementary Table S13). For question 3, progressiveMauve may help you find the answer.
