Question: producing a combined .sam file using *_1.sai and *_2.sai
Asked 2.3 years ago by br.tania:

Hi everyone!

I am aligning paired-end whole-genome sequencing reads using the following commands:

bwa index ref.fa

bwa aln ref.fa sample_1.fa > sample_1_aln.sai

bwa aln ref.fa sample_2.fa > sample_1_aln.sai

bwa sampe ref.fa sample_1_aln.sai sample_2_aln.sai sample_1.fa sample_2.fa > sample_aln.sam

The last command gives the following error:

[E::bwa_sai2sam_pe_core] Unmatched SAI magic. Please re-run `aln' with the same version of bwa.

I looked around, and several other people seem to have come across this error, but I was unable to find a solution.
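One thing worth checking in the commands as posted: both `bwa aln` calls redirect to `sample_1_aln.sai`, so this run never writes `sample_2_aln.sai`. If a stale or unrelated file of that name already exists, `bwa sampe` would read it instead, which may be what triggers the message (it could equally be a typo in the post). A minimal sketch that derives both .sai names from one prefix so they cannot collide; `align_pair` is a hypothetical helper name, and the file names follow the question:

```shell
# Sketch: paired-end bwa aln/sampe with one distinct .sai file per mate.
# align_pair is a hypothetical helper, not part of bwa.
align_pair() {
    ref=$1; r1=$2; r2=$3; prefix=$4

    # Derive both .sai names from the same prefix so they always differ.
    sai1="${prefix}_1_aln.sai"
    sai2="${prefix}_2_aln.sai"

    # Align each mate file separately, stopping at the first failure.
    bwa aln "$ref" "$r1" > "$sai1" || return 1
    bwa aln "$ref" "$r2" > "$sai2" || return 1

    # Combine the two alignments into a single SAM file.
    bwa sampe "$ref" "$sai1" "$sai2" "$r1" "$r2" > "${prefix}_aln.sam"
}

# Usage (after `bwa index ref.fa`):
#   align_pair ref.fa sample_1.fa sample_2.fa sample
```

All three steps should be run with the same bwa binary, as the error message itself suggests.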

I am doing test runs and using only one chromosome as the reference sequence.

I would really appreciate it if someone could help out.

I also tried the following on the same data:

bwa mem -M ref.fa sample_1.fa sample_2.fa > sample_aln.sam

The output I get is the following:

[M::bwa_idx_load_from_disk] read 0 ALT contigs

[M::process] read 66668 sequences (10000200 bp)...

[M::process] read 66668 sequences (10000200 bp)...

[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 210, 3, 1)

[M::mem_pestat] skip orientation FF as there are not enough pairs

[M::mem_pestat] analyzing insert size distribution for orientation FR...

[M::mem_pestat] (25, 50, 75) percentile: (235, 295, 323)

[M::mem_pestat] low and high boundaries for computing mean and std.dev: (59, 499)

[M::mem_pestat] mean and std.dev: (276.51, 65.35)

[M::mem_pestat] low and high boundaries for proper pairs: (1, 587)

[M::mem_pestat] skip orientation RF as there are not enough pairs

[M::mem_pestat] skip orientation RR as there are not enough pairs

[mem_sam_pe] paired reads have different names: "sample.1.1", "sample.1.2"
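The last message says the two mates of a pair carry different names. The read names in the files end in `.1` and `.2`, which is typical of `fastq-dump` run with `--readids` (an assumption here; the thread does not say how the files were produced). As far as I know, bwa tolerates a trailing `/1` or `/2` but not this dotted form. A minimal sketch that strips the suffix, assuming plain four-line FASTQ records; `strip_mate_suffix` is a hypothetical helper name and the output file names are only illustrative:

```shell
# Sketch: remove a trailing ".1" or ".2" from FASTQ read names.
# Assumes four-line records (no wrapped sequence or quality lines).
# strip_mate_suffix is a hypothetical helper: FASTQ on stdin, FASTQ on stdout.
strip_mate_suffix() {
    awk '
        NR % 4 == 1 {
            # Split the header into the read name and the rest of the line.
            name = $1
            rest = substr($0, length($1) + 1)
            sub(/\.[12]$/, "", name)
            print name rest
            next
        }
        # The "+" line may repeat the old name; a bare "+" is valid FASTQ.
        NR % 4 == 3 { print "+"; next }
        { print }
    '
}

# Usage (illustrative file names):
#   strip_mate_suffix < sample_1.fa > sample_1.renamed.fq
#   strip_mate_suffix < sample_2.fa > sample_2.renamed.fq
```

Re-dumping the reads without `--readids` should give matching names as well, if that option was indeed used.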

What am I doing wrong?

Thanks.


Looks like there's something wrong with your input files. Can you paste perhaps a small section of the beginning and end of your .fa and .sai files?

written 2.3 years ago by jrj.healey

@jrj.healey:

I am pasting parts of a .fa file and a .sai file.

sample_1.fa: beginning

@SRR1767736.1.1 ST-E00144:17:H04D6ALXX:1:1101:11221:1309 length=151
TCTTAAGAAATAAATGATAAATGTTTGAGATGATGGATACTCCAGTTACTTAGATTGAATCATTATACATTGTATGTGTGTATCAAAATTTCACATGTGGGCTTGAGATGGTAGTTCTGGCCTGTAATCTCAGCACTTTGGAAGAATAAGG
+SRR1767736.1.1 ST-E00144:17:H04D6ALXX:1:1101:11221:1309 length=151
AAA<-AFJJFJJ<J77AA7<AFA--J-F---<A<FJA<A-J-<JJ<JJJJJJ-FAJAF-FFFJJJJJAFJAJJJJ-FFA<FJF--FAF-<A--7A<AF7FA---A-FAA-FJFJ---<-----A<FJFA-7FJ-<-JAFJ7-F<----<<A
@SRR1767736.2.1 ST-E00144:17:H04D6ALXX:1:1101:11749:1309 length=151
AAAAATCTTCTTCTGCTAAACATAAGTAATGCCTGCTATAATTTCATCTGATATTTCTTCTAAAACACCCACATATTACATGTTCCCTTTAACGAAAAACAAAAGAAACAGATAACAAATAATCACCAGTAATACCACTACTCTATTAAAA
+SRR1767736.2.1 ST-E00144:17:H04D6ALXX:1:1101:11749:1309 length=151
AA<<AJJJJF<J-JJ7JFFAFJJJ<FJJJJJ7JJJJFJJ7FFJ<JAJ-JAFFFJFJ-7AFJFJFJ-JA-AJA<FFF-J<JJJFJ-<-FFFFA-AJ7FF----F--FFF-J-F-A-<JJJ7<FAAJ--FA-<FJJFFJ<FA-A-<--<7<JA

sample_1.fa: end

@SRR1767736.46491804.1 ST-E00144:17:H04D6ALXX:1:2224:29816:73318 length=151
TGTGGCAAAAGGGCTCCTGGATGTCCCTGTAGAAAGGGCAACAGCACCTTCCACTAAAATAACAGGTCTCAGAGCGGCAACACCCTCGGAACAACTAGGCAGGGAAACCCGGGTCTTCAGATGGTAGACAAATGCAGAGGGAAAGATGTGN
+SRR1767736.46491804.1 ST-E00144:17:H04D6ALXX:1:2224:29816:73318 length=151
AF----A<<AFAA<FFF--7<--AF-<AA-J<FJJ---AJFJFJJAJJ--J-J-7FAFJAAJF<JF--FFFJJF<-JF<FAFJF7FFF7JJF<JFAF--FAFFF7JJAFFJF-JJJJJ<JF-<JJJFFFJ-<-<FFFFA<AJF<AF7<AJ#

sample_1.sai: beginning

\5\5H#I#<????f  ?f      q6q6    66?|?|?|?|????  q6q6?
?
    ?
?

????????@??
           ?=?=
               ?B?B?B?B?t?tztzt?????_?_?_?_?_?_?_?_?t?tztzt??
                                                             ??
                                                               ?,?,?,?,?n?n....?????%?% ?%?%?F?F$F$FC?C?@?@??z?zs?s?(?(?)?)?'?'?*?*?,?,???g?ge?e?
                                                                                                                                                 <@<@?:?:   ?:?:????    tt
          vv    Y?  Y?      \?  ]?      Z?  Z?

sample_1.sai: end

?4                                  ???n?n  ?n?n    ?4
?4
?n?n    ?n?n    ????    `B`B
                            ?\?\?E?E?E?E@,?,???????W?W?V?V?kZkZ????
                                                                   ??????g?g?   U?
                                                                                  U?
                                                                                        T?
                                                                                          T?

                                                                                            V?
                                                                                              V?
                                                                                                    ????????    **  W?W?{|  +?,?
                                                                                                                                            -?-?

I hope this helps - sorry I am new to this.

modified 2.3 years ago • written 2.3 years ago by br.tania

Your formatting is all over the place; it's pretty unintelligible.

Can you edit your original post to add the files there (a bit clearer than the comments) and use the code format button, please? (Highlight the bit you want displayed as code, including your input data, and press the button above that says '101010'.)

modified 2.3 years ago • written 2.3 years ago by jrj.healey

I just re-formatted it. Thanks for your patience.

written 2.3 years ago by br.tania

I don't know if this is directly related to your problem, but have you done any quality control or analysis on the reads yet? From a quick look by eye, you seem to have quite a lot of low-confidence base calls.

I'm no expert with FASTQ format, so someone else might have to weigh in here.

written 2.3 years ago by jrj.healey
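To put a rough number on the "by eye" impression, here is a minimal sketch that averages the base qualities over a FASTQ file. It assumes Phred+33 encoding (usual for recent Illumina data, but not confirmed anywhere in this thread) and four-line records; `mean_phred` is a hypothetical helper name:

```shell
# Sketch: mean Phred quality over all bases in a FASTQ stream.
# Assumes Phred+33 encoding and four-line records.
# mean_phred is a hypothetical helper: FASTQ on stdin, one number on stdout.
mean_phred() {
    awk '
        BEGIN {
            # awk has no ord(), so build a character-to-code lookup table.
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
        }
        # Every fourth line is a quality string.
        NR % 4 == 0 {
            for (i = 1; i <= length($0); i++) {
                sum += ord[substr($0, i, 1)] - 33
                n++
            }
        }
        END { if (n) printf "%.2f\n", sum / n }
    '
}

# Usage:
#   mean_phred < sample_1.fa
```

A dedicated QC tool such as FastQC gives a much fuller picture (per-position quality, adapter content), so this is only a quick sanity check.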

I haven't done any quality control. I assumed I was given good-quality FASTQ files, and I also have access to the .sra files if needed. I'm not sure whether low-confidence base calls would cause the errors I mentioned.

Can someone with more experience chip in?

Thanks.

written 2.3 years ago by br.tania