ERROR SAMTOOLS GENERATING TRUNCATED FILE: SEQ and QUAL are of different length
4.1 years ago
meilinfer ▴ 10

I would like to convert a bowtie2-generated .sam file to .bam with samtools (v1.9), but I get a truncated file and an error saying that SEQ and QUAL are of different lengths. On the line reported in the error, SEQ and QUAL are indeed of different lengths, but I don't know how to fix this. Why am I getting this error?

$ samtools view -b hiPSC1.sam > hiPSC1.bam

[E::sam_parse1] SEQ and QUAL are of different length

[W::sam_read1] Parse error at line 8272508

[main_samview] truncated file

Here's how my .sam file looks:

@SQ SN:chrY LN:57227415

SRR2097503.1.3  83  chrM    6842    1   51M =   6745    -148    TATCCCCACCGGCGTCAAAGTATTTAGCTGACTCGCCACACTCCACGGAAG >>?8/:AA6A@?0DB9DB<?B@A@<DDC?)1EEC=CA?A+D?DDD@;D??? AS:i:0  XS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:51 YS:i:0  YT:Z:CP

And here is the line giving me the error:

$ sed -n 8272508p hiPSC1.sam

SRR2097503.1.4850276    99  chrM    14650   34  51M =   14804   205 CAACAGAAACAAAGCATACATCATTATTCTCGCACGGACTACAACCACGAC CCCFFFFFHHHHHIJJJJJJJJJJJJJJJJJJJJJIGGDJIJJIGJHHHHHFFFFFCCC AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:51 YS:i:0  YT:Z:C
sequencing assembly genome alignment • 8.2k views

Please go back and check your original data file to see if the problem exists there as well.
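One way to run that check on the input FASTQ files, as a minimal sketch: it assumes the standard four-line FASTQ layout (no wrapped sequences), and the helper name `check_fastq_lengths` is made up for illustration. It prints the record number, read name, and both lengths for every record whose sequence and quality strings differ.

```shell
# Hypothetical helper: report FASTQ records whose sequence and quality
# strings differ in length. Assumes unwrapped, four-line FASTQ records.
check_fastq_lengths() {
    awk '
        NR % 4 == 1 { name = $0 }              # header line (@read_id ...)
        NR % 4 == 2 { seq_len = length($0) }   # sequence line
        NR % 4 == 0 {                          # quality line
            if (length($0) != seq_len)
                printf "record %d\t%s\tSEQ=%d\tQUAL=%d\n", NR / 4, name, seq_len, length($0)
        }
    ' "$1"
}
```

Usage would be `check_fastq_lengths hiPSC1_1P.fastq` (and the same for the `_2P` file). No output means every record is consistent, which would point to the alignment or the file write rather than the raw data.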


The SAM file is corrupted. Maybe there was an issue during alignment, maybe the process ran out of memory at some point. I would realign everything from scratch. If only this one read is affected, you can also remove it and see whether that fixes the conversion. I would still realign:

bowtie2 (...option) | samtools view -b -o out.bam

which will produce a BAM file right away without intermediate sam files. This is called a Unix pipe.


Thanks, ATpoint. I used bowtie2 to align, with a prebuilt index I downloaded from Bowtie's webpage. I'll try realigning and piping into samtools.

bowtie2 --very-sensitive --no-discordant --no-mixed -X 2000 -x grch38_1kgmaj -1 hiPSC1_1P.fastq -2 hiPSC1_2P.fastq -p 4 > hiPSC1.sam

You could use a pipe as suggested by @ATPoint or use the correct option to create the SAM file instead of the redirect you used.

bowtie2 --very-sensitive --no-discordant --no-mixed -X 2000 -x grch38_1kgmaj -1 hiPSC1_1P.fastq -2 hiPSC1_2P.fastq -p 4 -S hiPSC1.sam

I did not use -S!! Thanks, I'll give it a go and align again.

4.1 years ago
ATpoint 82k

Your command does not contain a pipe. That would be:

bowtie2 --very-sensitive --no-discordant --no-mixed -X 2000 -x grch38_1kgmaj -1 hiPSC1_1P.fastq -2 hiPSC1_2P.fastq -p 4 | samtools view -b -o hiPSC1.bam

It worked!! I checked the file with:

samtools quickcheck -v

and it was okay, and I was even able to sort the file. Thanks!


Cool, glad to hear.

4.1 years ago

It's a problem with your SAM, not with samtools. It looks like a problem in your upstream process and you should not trust this data.

You can always filter out the bad lines with awk:

awk -F '\t' '$0 ~ /^@/ || length($10)==length($11)' input.sam
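Before dropping records, it may help to see how many are affected. The following is a minimal sketch in the same spirit as the filter above; the function name `report_bad_sam_lines` is hypothetical, and it assumes tab-separated SAM and treats a QUAL of `*` (which the SAM format allows when qualities are not stored) as valid.

```shell
# Hypothetical helper: list SAM records whose SEQ (field 10) and
# QUAL (field 11) lengths differ, then print a total.
report_bad_sam_lines() {
    awk -F '\t' '
        /^@/ { next }                                   # skip header lines
        $11 != "*" && length($10) != length($11) {      # QUAL "*" means not stored
            printf "line %d\t%s\tSEQ=%d\tQUAL=%d\n", NR, $1, length($10), length($11)
            bad++
        }
        END { printf "%d mismatched record(s)\n", bad + 0 }
    ' "$1"
}
```

Running `report_bad_sam_lines input.sam` prints one line per bad record plus a count. A single hit suggests an isolated write glitch; many hits would be a stronger reason to realign rather than filter.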