producing a combined .sam file using *_1.sai and *_2.sai
3.9 years ago
br.tania ▴ 40

Hi everyone!

I am aligning paired-end whole-genome sequencing reads, using the following commands:

bwa index ref.fa

bwa aln ref.fa sample_1.fa > sample_1_aln.sai

bwa aln ref.fa sample_2.fa > sample_1_aln.sai

bwa sampe ref.fa sample_1_aln.sai sample_2_aln.sai sample_1.fa sample_2.fa > sample_aln.sam

The last command gives the following error:

[E::bwa_sai2sam_pe_core] Unmatched SAI magic. Please re-run `aln' with the same version of bwa.

I looked around, and several other people seem to have come across this error, but I was unable to find a solution.

I am doing test runs and using only one chromosome as the reference sequence.

I would really appreciate it if someone could help out.

I also tried the following on the same data:

bwa mem -M ref.fa sample_1.fa sample_2.fa > sample_aln.sam

The output I get is the following:

[M::process] read 66668 sequences (10000200 bp)...

[M::process] read 66668 sequences (10000200 bp)...

[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 210, 3, 1)

[M::mem_pestat] skip orientation FF as there are not enough pairs

[M::mem_pestat] analyzing insert size distribution for orientation FR...

[M::mem_pestat] (25, 50, 75) percentile: (235, 295, 323)

[M::mem_pestat] low and high boundaries for computing mean and std.dev: (59, 499)

[M::mem_pestat] mean and std.dev: (276.51, 65.35)

[M::mem_pestat] low and high boundaries for proper pairs: (1, 587)

[M::mem_pestat] skip orientation RF as there are not enough pairs

[M::mem_pestat] skip orientation RR as there are not enough pairs

[mem_sam_pe] paired reads have different names: "sample.1.1", "sample.1.2"
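
The last message suggests that bwa mem met two mates whose names differ only in a trailing ".1"/".2". Below is a minimal Python sketch for removing such a suffix from FASTQ headers before alignment. The function names are hypothetical, and it assumes that every read ID in both files ends with a mate suffix (as in "sample.1.1" / "sample.1.2"); it should not be run on files whose IDs lack one, because it would then cut off part of the real ID.

```python
import re

# Assumption: the mate suffix is ".1"/".2" or "/1"/"/2" at the very end of the read ID.
_MATE_SUFFIX = re.compile(r"[./][12]$")


def strip_mate_suffix(header):
    """Remove a trailing mate suffix from the read ID of a FASTQ '@' or '+' line."""
    if not header:
        return header
    prefix = header[0]                     # "@" or "+"
    fields = header[1:].split(None, 1)     # read ID, then optional description
    if not fields:                         # a bare "+" separator line
        return header
    read_id = _MATE_SUFFIX.sub("", fields[0])
    rest = " " + fields[1] if len(fields) > 1 else ""
    return prefix + read_id + rest


def normalize_fastq(lines):
    """Yield FASTQ lines with the mate suffix stripped from lines 1 and 3 of each record."""
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 in (0, 2):                # header and separator lines only
            line = strip_mate_suffix(line)
        yield line


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python strip_suffix.py < in.fq > out.fq
    for out_line in normalize_fastq(sys.stdin):
        print(out_line)
```

Each input file would be rewritten this way, and the rewritten files passed to bwa mem in place of the originals.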

What am I doing wrong?

Thanks.

BWA sampe error • 2.6k views

Looks like there's something wrong with your input files. Could you paste a small section from the beginning and end of your .fa and .sai files?


@jrj.healey:

I am pasting parts of the .fa and .sai files.

sample_1.fa: beginning

@SRR1767736.1.1 ST-E00144:17:H04D6ALXX:1:1101:11221:1309 length=151
TCTTAAGAAATAAATGATAAATGTTTGAGATGATGGATACTCCAGTTACTTAGATTGAATCATTATACATTGTATGTGTGTATCAAAATTTCACATGTGGGCTTGAGATGGTAGTTCTGGCCTGTAATCTCAGCACTTTGGAAGAATAAGG
+SRR1767736.1.1 ST-E00144:17:H04D6ALXX:1:1101:11221:1309 length=151
AAA<-AFJJFJJ<J77AA7<AFA--J-F---<A<FJA<A-J-<JJ<JJJJJJ-FAJAF-FFFJJJJJAFJAJJJJ-FFA<FJF--FAF-<A--7A<AF7FA---A-FAA-FJFJ---<-----A<FJFA-7FJ-<-JAFJ7-F<----<<A
@SRR1767736.2.1 ST-E00144:17:H04D6ALXX:1:1101:11749:1309 length=151
AAAAATCTTCTTCTGCTAAACATAAGTAATGCCTGCTATAATTTCATCTGATATTTCTTCTAAAACACCCACATATTACATGTTCCCTTTAACGAAAAACAAAAGAAACAGATAACAAATAATCACCAGTAATACCACTACTCTATTAAAA
+SRR1767736.2.1 ST-E00144:17:H04D6ALXX:1:1101:11749:1309 length=151
AA<<AJJJJF<J-JJ7JFFAFJJJ<FJJJJJ7JJJJFJJ7FFJ<JAJ-JAFFFJFJ-7AFJFJFJ-JA-AJA<FFF-J<JJJFJ-<-FFFFA-AJ7FF----F--FFF-J-F-A-<JJJ7<FAAJ--FA-<FJJFFJ<FA-A-<--<7<JA


sample_1.fa: end

@SRR1767736.46491804.1 ST-E00144:17:H04D6ALXX:1:2224:29816:73318 length=151
TGTGGCAAAAGGGCTCCTGGATGTCCCTGTAGAAAGGGCAACAGCACCTTCCACTAAAATAACAGGTCTCAGAGCGGCAACACCCTCGGAACAACTAGGCAGGGAAACCCGGGTCTTCAGATGGTAGACAAATGCAGAGGGAAAGATGTGN
+SRR1767736.46491804.1 ST-E00144:17:H04D6ALXX:1:2224:29816:73318 length=151
AF----A<<AFAA<FFF--7<--AF-<AA-J<FJJ---AJFJFJJAJJ--J-J-7FAFJAAJF<JF--FFFJJF<-JF<FAFJF7FFF7JJF<JFAF--FAFFF7JJAFFJF-JJJJJ<JF-<JJJFFFJ-<-<FFFFA<AJF<AF7<AJ#


sample_1.sai: beginning

\5\5H#I#<????f  ?f      q6q6    66?|?|?|?|????  q6q6?
?
?
?

????????@??
?=?=
?B?B?B?B?t?tztzt?????_?_?_?_?_?_?_?_?t?tztzt??
??
?,?,?,?,?n?n....?????%?% ?%?%?F?F$F$FC?C?@?@??z?zs?s?(?(?)?)?'?'?*?*?,?,???g?ge?e?
<@<@?:?:   ?:?:????    tt
vv    Y?  Y?      \?  ]?      Z?  Z?


sample_1.sai: end

?4                                  ???n?n  ?n?n    ?4
?4
?n?n    ?n?n    ????    BB
?\?\?E?E?E?E@,?,???????W?W?V?V?kZkZ????
??????g?g?   U?
U?
T?
T?

V?
V?
????????    **  W?W?{|  +?,?
-?-?
`

I hope this helps. Sorry, I am new to this.


Your formatting is all over the place; it's pretty unintelligible.

Can you edit your original post to add the files there (it will be a bit clearer than in the comments) and use the code format button, please? Highlight the bit you want displayed as code, including your input data, and press the button above that says '101010'.


I just re-formatted it. Thanks for your patience.


I don't know if this is directly related to your problem, but have you done any quality control or analysis on the reads yet? From a quick look by eye, you seem to have quite a lot of low-confidence base calls.

I'm no expert with FASTQ format, so someone else might have to weigh in here.
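
To put a number on that impression, here is a small Python sketch that decodes a FASTQ quality string into Phred scores. It assumes the common Phred+33 encoding (the pasted quality characters run from '#' to 'J', which fits that encoding); the function names and the cutoff of 20 are only illustrative choices, not anything taken from bwa.

```python
def phred33_scores(quality):
    """Decode a FASTQ quality string into Phred scores, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality]


def fraction_below(quality, threshold=20):
    """Return the fraction of bases whose Phred score is below `threshold`."""
    scores = phred33_scores(quality)
    if not scores:
        return 0.0
    return sum(score < threshold for score in scores) / len(scores)


if __name__ == "__main__":
    # The first 20 quality characters of the first read pasted above.
    example = "AAA<-AFJJFJJ<J77AA7<"
    print(phred33_scores(example))
    print(fraction_below(example))
```

Under this encoding '-' decodes to 12 and '#' to 2, while 'J' decodes to 41, so the many '-' characters in the pasted reads are the low-confidence calls being referred to.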


I haven't done any quality control. I assumed I was given good-quality FASTQ files. I also have access to the .sra files if needed. I'm not sure whether low-confidence base calls would cause the errors I mentioned.

Can someone with more experience chip in?

Thanks.