Question: Aligning Miseq With Deletions >10 Bases
2
gravatar for Leszek
6.9 years ago by
Leszek4.0k
IIMCB, Poland
Leszek4.0k wrote:

We expect deletions (10-1000 bases) in the genome of interest, therefore, we run paired MiSeq 300bp+200bp. Can you recommend me the best aligner for such dataset?
In theory, some programs should handle it right:

  • bowtie2 supports gapped alignments in end-to-end mode, but in practice I haven't seen big indels
  • bwa aln - there are -o and -e parameters, anyone tried it?
  • bwa bwasw - align reads with deletion twice: up- and downstream from the deletion, the other part of the read is clipped.
  • blat - slow and reports multiple suboptimal hits, plus no SAM output
  • any other recommendations?

What is your opinion?

EDIT
Just to make it clear, pairs are overlapping, after merging with FLASH I get single reads ~400 bases.

example:


@M00724:10:000000000-A2EKA:1:1101:16915:1666 1:N:0:1
TTATCTTGAAAAAATGCACCCGCAGCTTCGCTCGTAATCCGAAAACACGGGAAATGGAGTCAGGCTTTTTTTATGGATGAGAAAATAGACACCAAAGAAGCCTTCATATGACCTTAACGGACCTACAATGCAAAAAGTTATCAAAAGACTGCATTATTGAGCACACATAGTACTAATAAAGTAATCTAAGAAGCGTGGTTAGAAAGATAGCGATCTCGTGATGCATTTTTGTAGAACAAAAAAGAATTAGAGATTCTTTGTTGGTAAAATAGCTCTCTCGCGTTGCATTTCTGTTCTGTAAAAATGCAGATCTGATTCTTTGTATGAACAAGTAGCGCTCTCGCGTTGCATTTTTGTTCTACAAAATGAAGCAAAGATGCTTCGTTAACAAAGATATGCTATTGAAGTGCAAGTT
+
5<=,<>>/----5+<-ACC8+>+>>+8C.++7++>+,78.++55>-+,5**+5+------5----5-5CE))4=+4++4+4+4++++44@+11*1***3**;38@*1****11*1*0191:)-./6?(666(6?6(-(/(.((6((./((((../66((((/((./((.(./((/(((((//6<ee;?=(6((.' ('="" (="" (="" .6(6<(.(''(="" 66##(="" ;6;6ee="?;(&lt;;###./(///#(#(#(/(6#6(;..(&lt;=&lt;;/6/6&lt;#6///(#)98/#8*@23*#&lt;*;3#?&lt;4#24+&lt;#4+++C6++4++++6++++D=C56,,,56+-A)DC5*5***C*C+E-FCC7+,EEEEAD9A7...99.AD.A.9.A-9-">+A7+9/AEA8//AC/C@C;>9<-5-5-->
should align to 2-micron plasmid of S. cerevisiae with one deletion of 125 bases (blat .psl output)
380    35    0    0    0    0    1    125    +    M00724:10:000000000-A2EKA:1:1101:16915:1666    415    0    415    2-micron    6318    3937    4477    2    59,356,    0,59,    3937,4121,
alignment next-gen miseq • 2.1k views
ADD COMMENTlink modified 6.4 years ago by Biostar ♦♦ 20 • written 6.9 years ago by Leszek4.0k
1

Try: "bwa bwasw -b9 -q16 -r1 -w200". Let me know what you get. Thanks.

ADD REPLYlink modified 6.9 years ago • written 6.9 years ago by lh331k
1

I will be curious to see how BWA works on such long reads.

ADD REPLYlink written 6.9 years ago by Mathew Bunj40
1

It is bwa-sw, not bwa. Bwa-sw is known to work with 150kb BACs and even 1Mbp sequences.

ADD REPLYlink written 6.9 years ago by lh331k
1

BWA-SW does not work with PE? Unless it is being used after joining reads with FLASH?

ADD REPLYlink written 6.9 years ago by kanwarjag1.0k
1

It has been working with PE for a year or so.

ADD REPLYlink modified 6.9 years ago • written 6.9 years ago by lh331k
1

it splits correct alignments obtained with default parameters. the reads with deletions are aligned twice: upstream and downstream from the deletion, the other part of the read is clipped.

ADD REPLYlink written 6.9 years ago by Leszek4.0k
1

That is exactly why I was asking you to use the non-default parameters. You will get longer deletions, though not longer than half of read length. As you are working on small genomes, why not de novo assemble your reads first?

ADD REPLYlink written 6.9 years ago by lh331k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1122 users visited in the last hour