Question: Reporting Gapped Alignments With Bwa
2
gravatar for Gail
9.6 years ago by
Gail20
Gail20 wrote:

Dear all,

in an effort to map RNA-seq reads directly onto the genome I set the gap opening and gap extension parameters of bwa both to 0 (-O0 and -E0) in order to allow for exon junctions. However, bwa still reports not only perfect matches but also distance 1 matches rather than gapped alignments. Can this behaviour be changes somehow? I want gapped and ungapped alignments (in case both are without mismatches) to be treated equal.

Any help would be greatly appreciated!

Gail

ADD COMMENTlink modified 4.7 years ago by Biostar ♦♦ 20 • written 9.6 years ago by Gail20

Have you tried setting gap opening non-zero and gap extension zero? If both are 0, this might lead to unpredictable results.

ADD REPLYlink written 9.6 years ago by Michael Schubert7.0k

Dear Michael,

Thank you for the reply! I tried now: ./bwa aln -e1000 -O1 -E0 together with: ./bwa samse -n1000 but still don't get the gapped alignments!?

ADD REPLYlink written 9.6 years ago by Gail20

Can you provide an example of a few reads and the reference sequence, so one could test it a bit, please? Find some reads that should give gapped alingments make an example file available.

ADD REPLYlink written 9.6 years ago by Michael Dondrup47k

Hello Michael,

these are the test genome sequences:

GENOME_ori1 AACCTTGGAAAAAACCTTGGCCCCAACCTTGGTTTTAACCTTGGGGGGG GENOME_ori2 tAACCTTGGAAAAAACCTTGGCCCCAACCTTGGTTTTAACCTTGGGGGGG GENOME_ori3 ttAACCTTGGAAAAAACCTTGGCCCCAACCTTGGTTTTAACCTTGGGGGGG GENOME_insert AACCTTGGAAAAAAaaaCCTTGGCCCCAACCTTGGTTTTAACCTTGGGGGGG GENOME_1a AACCTTGGAAAAAACCTTGGCCCCAACCTTGGTTTTAACCTTGGGGGGa

And this is the test read:

Ori AACCTTGGAAAAAACCTTGGCCCCAACCTTGGTTTTAACCTTGGGGGGG

So far bwa always reports matches to GENOME_ori1-3 and GENOME_1a but not GENOME_insert. Would be great if you could figure out why that is!

ADD REPLYlink written 9.6 years ago by Gail20
1
gravatar for Michael Dondrup
9.6 years ago by
Bergen, Norway
Michael Dondrup47k wrote:

Could it be that you have to set other options?

From the documentation of bwa:

OPTIONS:
[...]
-o INT  Maximum number of gap opens [1]
-e INT  Maximum number of gap extensions, -1 for k-difference mode (disallowing long gaps) [-1]

If you look at the defaults for these, it seems like only one short gap would be allowed, so try with increasinf -o and -e. Otherwise I support Ketil's answer.

ADD COMMENTlink written 9.6 years ago by Michael Dondrup47k

I just saw, your comment that you use -e1000 already, but still that allows for only one gap.

ADD REPLYlink written 9.6 years ago by Michael Dondrup47k

Hello Michael,

Many thanks for your help! But it doesn't even report/find the ones which are supposed to have only one gap :)

Best wishes!

ADD REPLYlink written 9.6 years ago by Gail20
0
gravatar for Ketil
9.6 years ago by
Ketil4.0k
Norway
Ketil4.0k wrote:

I think the Augustus genome annotation pipeline suggests using BLAT to map RNAseq data exactly for this reason, BLAT is fairly good at mapping RNA to DNA. This will of course require you to rework the rest of your analysis, but converting BLAT output (PSL format) to SAM shouldn't be very hard.

If your data are too large to handle comfortably with BLAT, perhaps you can first map with BWA, and then remap just the reads that don't map in their (almost) entirety using BLAT.

ADD COMMENTlink written 9.6 years ago by Ketil4.0k

Dear Ketil,

many thanks! sound like a great idea!

ADD REPLYlink written 9.6 years ago by Gail20

samtools has a tool samtools/misc/psl2sam.pl for this

ADD REPLYlink modified 11 months ago by RamRS30k • written 5.3 years ago by Maximilian Haeussler1.3k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2037 users visited in the last hour