Question: Mapping Reads Back To Assembly Contigs
7.9 years ago, David M wrote:

Does anyone know of a program which can efficiently map read data back to an assembly itself in a reasonable amount of time?

I'd like to visualize the support for the assembly of a given contig.

assembly contigs mapping read • 10k views

Burrows-Wheeler Aligner (BWA)?

Reply written 7.9 years ago by Zev.Kronenberg

Tablet and some other viewers support ACE and AFG files, which should tell you more accurately what the assembler was thinking than inferring it through an alignment.

Reply written 7.9 years ago by Jeremy Leipzig

7.9 years ago, Ahdf-Lell-Kocks wrote:

There is one main consideration in trying to reproduce the coverage profile from an assembly of short reads:

  1. You do it during assembly, without losing information, or
  2. You do it post-assembly, possibly losing information.

Most assemblers for NGS data won't do it by default during assembly, since keeping track of the read pileup takes more memory. That leaves the post-assembly option, which is not perfect. The main reason is that a different tool piles the reads back onto the contigs, and it won't reproduce with 100% fidelity the decisions the assembler made. Repetitive regions are one of the main sources of difference between the assembler's pileup and the post-assembly one. Maybe other people know of other genomic features that cause differences.

BWA-SW is one example of a tool for short-read post-assembly pileup.
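
To put a rough number on a post-assembly pileup, something like the following could summarize mean depth per contig from a SAM file. This is only a sketch: the function name `contig_mean_depth` is made up for illustration, and it approximates depth as mapped read bases divided by contig length (taken from the `@SQ` header lines), ignoring clipping and indels.

```shell
# Sketch only: reads SAM on stdin, prints "contig<TAB>approximate mean depth".
# contig_mean_depth is a hypothetical helper name, not part of any tool.
contig_mean_depth() {
    awk -F'\t' '
        /^@SQ/ {
            sn = ""; ln = 0
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) sn = substr($i, 4)      # contig name
                if ($i ~ /^LN:/) ln = substr($i, 4)      # contig length
            }
            len[sn] = ln
            next
        }
        /^@/ { next }                                    # other header lines
        # Count bases of mapped reads only (FLAG bit 0x4 not set).
        int($2 / 4) % 2 == 0 && $10 != "*" { bases[$3] += length($10) }
        END { for (c in len) printf "%s\t%.2f\n", c, bases[c] / len[c] }
    ' | sort
}

# Example call (uncomment to run):
# contig_mean_depth < alignment.sam
```

Contigs with a depth far above or below the rest are often the repetitive or poorly supported ones worth opening in a viewer.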

7.9 years ago, ALchEmiXt (The Netherlands) wrote:

I agree that using the support information produced directly by the assembler is preferable to mapping the reads afterwards.

There are quite a few good tools for this. I suspect you want some idea of the assembly quality, right? If you have paired-end or mate-pair data, you should definitely look for regions where the mapped/assembled reads significantly violate the expected distance between mates or their expected mapped orientation (e.g. compressed regions).
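
As a rough illustration of the insert-size check, the sketch below scans a SAM file for pairs whose observed template length (SAM column 9, TLEN) falls outside an expected range. The function name `flag_odd_pairs` and the thresholds in the example call are assumptions for illustration; it only looks at pairs where both mates map to the same contig and does not check orientation.

```shell
# Sketch only: reads SAM on stdin; min/max insert sizes are passed as arguments.
# flag_odd_pairs is a hypothetical helper name, not part of any tool.
# Prints contig, position, read name and TLEN once per suspicious pair.
flag_odd_pairs() {
    min_insert=$1; max_insert=$2
    awk -F'\t' -v lo="$min_insert" -v hi="$max_insert" '
        /^@/ { next }                                # skip header lines
        {
            flag = $2
            paired        = flag % 2                 # FLAG bit 0x1
            unmapped      = int(flag / 4) % 2        # FLAG bit 0x4
            mate_unmapped = int(flag / 8) % 2        # FLAG bit 0x8
            if (!paired || unmapped || mate_unmapped) next
            if ($7 != "=") next                      # mate on another contig
            if ($9 <= 0) next                        # keep leftmost mate only
            if ($9 < lo || $9 > hi) print $3, $4, $1, $9
        }'
}

# Example call with illustrative thresholds (uncomment to run):
# flag_odd_pairs 100 500 < alignment.sam
```

Clusters of flagged pairs at the same contig position are more interesting than isolated ones, since a single odd pair may just be a chimeric fragment.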

This document at CBCB gives a nice summary of assembly validation. For inspection they used the AMOS assembly viewer.


That's a useful document on assembly validation that I hadn't seen before. Thanks.

Reply written 7.9 years ago by David M

7.9 years ago, Nikolay Vyahhi (St. Petersburg, Russia) wrote:

First, use BWA to align the reads to the contigs:

bwa index contigs.fasta
bwa aln -t NUMBER_OF_THREADS contigs.fasta short_reads.fastq > alignment.sai
bwa samse contigs.fasta alignment.sai short_reads.fastq > alignment.sam

If you have paired-end reads, use bwa sampe instead of bwa samse.

If you have long reads, use bwa bwasw instead of both bwa aln and bwa samse.
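
For reference, the paired-end and long-read variants could look roughly like this. All file names are placeholders, the wrapper function names are made up for illustration, and the sketch assumes contigs.fasta has already been indexed with bwa index.

```shell
# Sketch only: file names and function names are placeholders.
# Assumes `bwa index contigs.fasta` has already been run.
map_paired_reads() {
    ref=$1; r1=$2; r2=$3; out=$4; threads=${5:-1}
    # Align each mate file separately, then pair them up with sampe.
    bwa aln -t "$threads" "$ref" "$r1" > "$out.1.sai" &&
    bwa aln -t "$threads" "$ref" "$r2" > "$out.2.sai" &&
    bwa sampe "$ref" "$out.1.sai" "$out.2.sai" "$r1" "$r2" > "$out"
}

map_long_reads() {
    ref=$1; reads=$2; out=$3; threads=${4:-1}
    # bwasw writes SAM directly, so no .sai step is needed.
    bwa bwasw -t "$threads" "$ref" "$reads" > "$out"
}

# Example calls (uncomment to run):
# map_paired_reads contigs.fasta reads_1.fastq reads_2.fastq paired.sam 4
# map_long_reads contigs.fasta long_reads.fastq long.sam 4
```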

Alternatively, you can use Bowtie to produce the alignment.

Second, use Tablet to visualize the resulting mapping.
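
For larger data sets it is usually more practical to load a sorted, indexed BAM file than the raw SAM. A minimal sketch, assuming samtools 1.3 or newer is installed (older releases use a different sort syntax); the function name and file names are placeholders:

```shell
# Sketch only: assumes samtools 1.3 or newer; names are placeholders.
sam_to_sorted_bam() {
    sam=$1; bam=$2
    # Sort by reference position, write BAM, then build the .bai index.
    samtools sort -o "$bam" "$sam" &&
    samtools index "$bam"
}

# Example call (uncomment to run):
# sam_to_sorted_bam alignment.sam alignment.sorted.bam
```

The sorted BAM can then be opened in the viewer together with contigs.fasta as the reference.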


Hi, I just want to know whether bwasw would work for raw paired-end reads, about 45 million pairs (300 bp each), with an 800 Mb set of assembled contigs as the reference. Would it take a long time on data of this size, or is there another tool that is more suitable? Thanks!

Reply written 4.5 years ago by pbigbig