Question: Seeing Unexpected Characters (^D,^Q) In The Qual Field Of A Sam File
6.2 years ago by
edwardhust0
1037 Luoyu Rd., Wuhan, Hubei 430074 P.R. China
edwardhust0 wrote:

I used bwa to map raw paired-end reads to the reference, which produced two .sai files. I then used bwa sampe to convert the two .sai files into one SAM file, which took a long time. I finally got my result, but there are unexpected garbage characters in the QUAL field.

My command follows:

nohup bwa sampe -f /home/liucj/projects/TWINS/WGC007813/WGC007813.sai.sam \
-r "@RG\tID:WGC007813\tLB:WGC007813\tSM:WGC007813\tPL:ILLUMINA" \
/home/liucj/data/ReferAll/index/ucsc.hg19 \
/home/liucj/projects/TWINS/WGC007813/WGC007813_1.fq.sai \
/home/liucj/projects/TWINS/WGC007813/WGC007813_2.fq.sai \
/home/liucj/data/SAMPLES/TWINS/WGC007813/WGC007813_1.fq \
/home/liucj/data/SAMPLES/TWINS/WGC007813/WGC007813_2.fq \
/home/liucj/projects/TWINS/WGC007813/no.saitosam.out &

Here is part of the SAM file: http://postimg.org/image/yr3qt1ph9/


sam bwa • 2.1k views
modified 6.2 years ago by Istvan Albert ♦♦ 82k • written 6.2 years ago by edwardhust0

Can you post what version of bwa you're using and the commands used to generate the sai files? While I assume you don't have a bunch of special characters in the original fastq files, you might just double check to ensure they're not corrupt.

written 6.2 years ago by Devon Ryan93k

Good point: look at the qualities in the fastq file.

Everything shown above is really bad quality. Perhaps you have a different encoding, and once it passes through bwa it gets interpreted in a way that shifts these qualities beyond the normal scale. ^D is control character 4.
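To make the "shift" concrete, here is a minimal Python sketch of the arithmetic. It assumes the aligner, when told the input is Phred+64, converts to Phred+33 by subtracting 31 (64 - 33) from every quality byte; that is the natural conversion, but I have not verified that bwa does exactly this internally. The function names are made up for illustration.

```python
# Sketch: what happens when Phred+33 qualities are wrongly treated as Phred+64.
# Assumption: the conversion subtracts 31 (= 64 - 33) from every quality byte.

OFFSET_SHIFT = 64 - 33  # 31


def misconvert(qual):
    """Return the byte values produced by wrongly shifting a Phred+33 string."""
    return [ord(c) - OFFSET_SHIFT for c in qual]


def caret(code):
    """Render a byte the way editors usually display control characters."""
    if 0 <= code < 32:
        return "^" + chr(code + 64)  # e.g. 4 -> ^D, 17 -> ^Q
    if code == 127:
        return "^?"
    return chr(code)


if __name__ == "__main__":
    # '#' is Q2 and '0' is Q15 in Phred+33 (Illumina 1.8+) data.
    for ch in "#0I":
        code = misconvert(ch)[0]
        print(repr(ch), "->", code, caret(code))
```

Under that assumption, '#' (ASCII 35) ends up as byte 4, which displays as ^D, and '0' (ASCII 48) ends up as byte 17, which displays as ^Q, matching the characters in the question title.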

modified 6.2 years ago • written 6.2 years ago by Istvan Albert ♦♦ 82k

Thanks for your reply, I found the problem. The fastq files were not corrupted. While generating the SA coordinates (the .sai files), I set the quality score format to Illumina 1.3+, but the score format of my fq files is actually Illumina 1.8+. A stupid mistake! Anyway, thanks a lot!
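For anyone who wants to check their own files before aligning: Illumina 1.8+ uses Phred+33 and Illumina 1.3+ uses Phred+64, and the two can often be told apart from the range of quality characters. The sketch below is a rough heuristic only. The function names, the 10000-read sample size, and the cutoffs (59 and 74) are my own choices for illustration, and it assumes plain 4-line FASTQ records.

```python
import itertools


def quality_range(fastq_lines, max_reads=10000):
    """Return (min, max) ASCII codes seen in the quality lines.

    Assumes uncompressed FASTQ with exactly 4 lines per record.
    """
    lo, hi = 255, 0
    # Every 4th line, starting at index 3, is a quality string.
    qual_lines = itertools.islice(fastq_lines, 3, None, 4)
    for line in itertools.islice(qual_lines, max_reads):
        codes = [ord(c) for c in line.rstrip("\r\n")]
        if codes:
            lo = min(lo, min(codes))
            hi = max(hi, max(codes))
    return lo, hi


def guess_offset(lo, hi):
    """Guess the Phred offset (33 or 64), or None if the range is ambiguous."""
    if lo < 59:
        # Characters below ';' do not occur in +64 encoded data.
        return 33
    if hi > 74:
        # Above 'J' (Q41 in Phred+33) is unusual for Illumina 1.8+ reads.
        return 64
    return None


if __name__ == "__main__":
    record = ["@read1\n", "ACGT\n", "+\n", "#0II\n"]
    print(guess_offset(*quality_range(record)))
```

With a real file you would pass an open file handle instead of the list, for example `quality_range(open("reads.fq"))`, where `reads.fq` is a placeholder name. A return value of None means the sampled reads do not settle the question.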

written 6.2 years ago by edwardhust0

I have only seen this when the fastq files are corrupted. Check the md5sums you have for the fastq files and the ones the sequencing center provided.
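If the `md5sum` command line tool is not at hand, the same check can be sketched in Python with the standard `hashlib` module. The function names and the file path in the usage note are placeholders, not anything from the original post.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so large FASTQ files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's MD5 with the checksum supplied by the provider."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Usage would look like `matches_expected("sample_1.fq", "<checksum from the sequencing center>")`. A result of False suggests the file changed or was truncated in transfer.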

written 6.2 years ago by Zev.Kronenberg11k

I checked the fastq files; they were not corrupted. Thanks a lot!

written 6.2 years ago by edwardhust0
6.2 years ago by
Istvan Albert ♦♦ 82k
University Park, USA
Istvan Albert ♦♦ 82k wrote:

Just to close this question with an answer, since others may run into the same thing:

It turns out the aligner was run with the incorrect fastq encoding, as described in the comments.
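After rerunning with the right encoding, one way to confirm the output is clean is to scan the QUAL column for anything outside the printable range. The SAM format restricts QUAL to the characters '!' through '~' (or a single '*' when qualities are absent). This is a small sketch with a made-up function name, and it only inspects the 11th tab-separated field of each alignment line.

```python
import re

# QUAL must be '*' or one or more characters in the range '!'..'~'.
VALID_QUAL = re.compile(r"\*|[!-~]+")


def bad_qual_records(sam_lines):
    """Yield (line_number, read_name, qual) for records with an invalid QUAL."""
    for number, line in enumerate(sam_lines, start=1):
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qual = fields[10]
        if not VALID_QUAL.fullmatch(qual):
            yield number, fields[0], qual
```

For a real file, something like `next(bad_qual_records(open("out.sam")), None)` (with `out.sam` as a placeholder) returns the first offending record, or None if every QUAL string is valid.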

written 6.2 years ago by Istvan Albert ♦♦ 82k