Question: Aligner for CORRECTED PacBio long reads
2.1 years ago by
cmo30
United States
cmo30 wrote:

After error-correcting PacBio long reads with Illumina short reads, which aligners are well suited to aligning the corrected PacBio long reads against the genome?

Should I think of the corrected PacBio reads as just "long Illumina reads" (in terms of error and indel rates, etc.)?

I am tempted to use BLASR, but the PacBio-specific errors and indels are presumably "corrected out", so is it still appropriate to use BLASR?

Is it appropriate to use BWA for the corrected PacBio long reads?

Are some aligners more appropriate than others?

modified 6 weeks ago by Felix Francis340 • written 2.1 years ago by cmo30

CMO, your best bet for working with hybrid data (short reads plus long reads) is to use a hybrid-aware package such as one of the two options below:

ECTools - https://github.com/jgurtowski/ectools

MHAP - PBcR - http://wgs-assembler.sourceforge.net/wiki/index.php/PBcR

The details of how hybrid data are correctly combined and then processed downstream are more complicated than most would expect. The long and short of it (pun intended) is that any tool written to process long-read data alone (BLASR) or short-read data alone (BWA, etc.) is suboptimal.

written 2.1 years ago by jrsmith0

Thank you, but I am more interested in how to align the PacBio long reads after they have been corrected. I am not necessarily interested in de novo assembly. The correction step should be taken as given; I am not interested in correction methods.

written 2.1 years ago by cmo30

Hi CMO,

I'm not a subject-matter expert on the RS II, so I asked Jason, and he was kind enough to respond. I hope it was helpful.

written 2.1 years ago by buchananbuck010
2.1 years ago by
rhall150
United States
rhall150 wrote:

I would suggest using BLASR. Even with default parameters, the alignment of corrected reads should be high quality. The parameters could be tuned to map low-error long reads more optimally, although I'm not sure you would gain anything other than performance (speed). Another option would be BLAST. BWA would probably work, but you are more likely to run into issues with performance and read length, particularly if your corrected reads are at the top end (>40 kb).

modified 2.1 years ago • written 2.1 years ago by rhall150
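
To judge whether the read-length concern in the answer above applies to a given data set, it can help to count how many corrected reads exceed 40 kb before choosing an aligner. The following is a minimal sketch, assuming the corrected reads are in FASTA format; the function names are hypothetical and the 40,000-base default simply mirrors the figure mentioned in the answer.

```python
from typing import Dict, Iterable


def read_lengths(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Return {read_id: sequence_length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The read ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence may be wrapped over several lines; sum them all.
            lengths[current] += len(line)
    return lengths


def count_longer_than(lengths: Dict[str, int], threshold: int = 40_000) -> int:
    """Count reads strictly longer than `threshold` bases (assumed cutoff)."""
    return sum(1 for n in lengths.values() if n > threshold)
```

An open file handle can be passed directly as `fasta_lines`, since iterating over it yields lines. If only a handful of reads exceed the cutoff, the length issue may matter less in practice.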

Bwa-mem works for this type of data.
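
As an illustration only, here is a minimal sketch of how a `bwa mem` run on corrected reads might be assembled from Python. It assumes the reference has already been indexed with `bwa index`; the helper name and file names are hypothetical. `bwa mem` also offers a `-x pacbio` preset, but that preset targets raw PacBio reads, so whether to use it for corrected reads is left as an open choice here rather than a recommendation.

```python
from typing import List, Optional


def bwa_mem_command(reference: str, reads: str, threads: int = 4,
                    preset: Optional[str] = None) -> List[str]:
    """Build an argument list for `bwa mem` (SAM is written to stdout)."""
    cmd = ["bwa", "mem", "-t", str(threads)]
    if preset is not None:
        # e.g. "pacbio"; intended for raw reads, so optional for corrected ones.
        cmd += ["-x", preset]
    cmd += [reference, reads]
    return cmd


# Hypothetical file names, for illustration only.
command = bwa_mem_command("genome.fa", "corrected_reads.fasta", threads=8)
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, stdout=sam_handle, check=True)` to capture the SAM output in a file.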

written 2.1 years ago by lh328k
22 months ago by
orange30
Korea, Republic Of
orange30 wrote:

PacBio suggests using BLASR or BWA: https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/Evaluating-Assemblies

written 22 months ago by orange30
6 weeks ago by
Felix Francis340
United States/University of Delaware
Felix Francis340 wrote:

I would use bwa mem or BLASTn rather than BLASR for better specificity. In my experience mapping error-corrected PacBio reads with BLASR, some of the best hits were incorrect.
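
One way to guard against incorrect best hits, whichever aligner is used, is to keep only primary alignments with a high mapping quality. The sketch below assumes SAM output and relies on the standard SAM columns (FLAG in column 2, MAPQ in column 5) and flag bits. The function name and the threshold of 30 are assumptions, and MAPQ scales differ between aligners, so the cutoff would need tuning. Dropping supplementary records is also a simplification, since long reads can legitimately align in several pieces.

```python
from typing import Iterable, Iterator

# Flag bits defined by the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def confident_primary_alignments(sam_lines: Iterable[str],
                                 min_mapq: int = 30) -> Iterator[str]:
    """Yield primary, mapped SAM records with MAPQ >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        record = line.rstrip("\n")
        fields = record.split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        if mapq >= min_mapq:
            yield record
```

Comparing the reads that survive this filter across two aligners is a quick way to see where their placements disagree.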

modified 6 weeks ago • written 6 weeks ago by Felix Francis340