Question: Errors in sam file after bwa SAMPE alignment.
0
gravatar for kirannbishwa01
3.1 years ago by
United States
kirannbishwa01950 wrote:

I aligned my fasta files to the reference genome using bwa sampe.

I have paired end reads so first used:

bwa aln -t 2 -l 20 lyrata_genome.fa Trm_MA605_R1.fastq > Trm_MA605_R1.sai
bwa aln -t 2 -l 20 lyrata_genome.fa Trm_MA605_R2.fastq > Trm_MA605_R2.sai

Then I did the sampe alignment using:

bwa sampe -n 1 -N 2 lyrata_genome.fa Trm_MA605_R1.sai Trm_MA605_R2.sai Trm_MA605_R1.fastq Trm_MA605_R2.fastq > SAMPEaligned_MA605readsBWA.sam

 

I tried to sort the alignment and convert it to bam using picard. But, getting the error message (part of the total error message):

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. MAPQ should be 0 for unmapped read.; File SAMPEaligned_MA605readsBWA.sam; Line 673702
Line: HWI-ST588:81:C080WACXX:7:1102:6980:187157    69    4    23328260    6    84M    6    963719    0    GGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGGTTTAGGCATAAA    +2AAEEEIIFFIFIEI<CEDCEDDDDDIIIBDDEIDDBDDEIIDCCDIIII=ADDDDD?@DAAAAA>?AAAA>??AAA35(54>    RG:Z:MA605_C080WACXX_7    XT:A:U    NM:i:3    SM:i:6    AM:i:6    X0:i:1    X1:i:55    XM:i:3    XO:i:0    XG:i:0    MD:Z:28C49A1A3
    at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:439)
    at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:341)
    at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:248)

 

I tried to validate the sam file using picard again:

java -jar /home/everestial007/picard-tools-1.139/picard.jar ValidateSamFile I=SAMPEaligned_MA605readsBWA.sam Mode=VERBOSE MAX_OUTPUT=15000 O=validatebamMA605forGATK

 

But, gettin the following error message in the output file.

ERROR: Record 46, Read name HWI-ST588:81:C080WACXX:7:1101:3935:2321, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 155, Read name HWI-ST588:81:C080WACXX:7:1101:9306:2386, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 352, Read name HWI-ST588:81:C080WACXX:7:1101:20820:2318, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 382, Read name HWI-ST588:81:C080WACXX:7:1101:2880:2595, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 562, Read name HWI-ST588:81:C080WACXX:7:1101:10717:2694, Mate negative strand flag does not match read negative strand flag of mate
ERROR: Record 603, Read name HWI-ST588:81:C080WACXX:7:1101:12062:2750, Mate negative strand flag does not match read negative strand flag of mate

 

I followed the bwa manual to letter but getting this kind of errors. I can not figure why is my alignment not working.

 

Thanks in advance.

 

 

 

bwa bam sam picard genome • 2.0k views
ADD COMMENTlink modified 3.1 years ago by Devon Ryan88k • written 3.1 years ago by kirannbishwa01950
1
gravatar for Devon Ryan
3.1 years ago by
Devon Ryan88k
Freiburg, Germany
Devon Ryan88k wrote:

Try adding VALIDATION_STRINGENCY=LENIENT to your picard commands. Picard takes a very strict view of what a proper SAM file should be (including enforcing rules that aren't in the specification).

ADD COMMENTlink modified 3.1 years ago • written 3.1 years ago by Devon Ryan88k

Thanks for the info. But, should not we be considering the error message. I don't understand what "Mate negative strand flag does not match read negative strand flag of mate" means, and if I should be concerned that my alignments are working properly or not. Btw, sam flagstat reports that 86 % of the reads are aligned - so it is something cocerning.

Thanks,

ADD REPLYlink written 3.1 years ago by kirannbishwa01950

It's possible that your mates are out of sync (i.e., the nth read in Trm_MA605_R1.fastq doesn't match the nth read in Trm_MA605_R2.fastq), so you might check that (you can use reformat from BBMap to do this).

ADD REPLYlink written 3.1 years ago by Devon Ryan88k

Thanks for the info. I a little hesistant in trying to learn a new tool again (just takes too much of my time to set it up to getting it right), but will do it as a last resort. Do you think picard can do it? Also, if if have 86% alignment, should I be worried about it? - just wanting to weigh the pros and cons of the extra step, but I will appreciate any suggestions.

Thanks,

ADD REPLYlink written 3.1 years ago by kirannbishwa01950
1

If the overwhelming majority are marked as "proper pairs" then I wouldn't bother with further processing, the results are likely fine.

ADD REPLYlink written 3.1 years ago by Devon Ryan88k
1

Thanks for enlighting. Here is my samtools flagstat report,

22193026 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
19848288 + 0 mapped (89.43% : N/A)
22193026 + 0 paired in sequencing
11096513 + 0 read1
11096513 + 0 read2
18519235 + 0 properly paired (83.45% : N/A)
19450790 + 0 with itself and mate mapped
397498 + 0 singletons (1.79% : N/A)
366420 + 0 with mate mapped to a different chr
122636 + 0 with mate mapped to a different chr (mapQ>=5)

Since, 83.45% are properly paired so I think I am good to go, but just want to make sure. Also, I used picard's "CleanSam" to remove reads that mapped out of reference sequence.

ADD REPLYlink written 3.1 years ago by kirannbishwa01950
1

That seems reasonable, you should be good to go.

ADD REPLYlink written 3.1 years ago by Devon Ryan88k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1597 users visited in the last hour