Question: How to rectify an error while sorting a SAM file
sunnykevin9710 wrote:

Hi, I'm converting a SAM file to a sorted BAM, and samtools reports that the SAM file is truncated. But when I view the file with samtools, it is not shown as truncated. Is there any other way to convert it to BAM? SRR1.sam = 47 GB

Sorting bam

samtools view -u SRR1.sam | samtools sort -@ 5 -T /tmp/SRR1.samsort -o SRR1.sort.bam 
[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.
[bam_sort_core] merging from 5 files and 5 in-memory blocks...
alignment assembly genome • 433 views
ADD COMMENTlink modified 9 months ago • written 9 months ago by sunnykevin9710
samtools view -bS SRR1.sam > SRR1.bam
[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.
SRR1.sam => 47 GB
SRR1.bam => 1.9 GB
samtools quickcheck -v -v -v -v SRR1.bam && echo 'all ok' 
verbosity set to 4
checking SRR1.bam
opened SRR1.bam
SRR1.bam is sequence data
SRR1.bam has 455 targets in header.
SRR1.bam has good EOF block.
all ok
ADD REPLYlink modified 9 months ago by bioexplorer4.2k • written 9 months ago by sunnykevin9710

sunnykevin97 : Please use ADD REPLY/ADD COMMENT when responding to existing answers/comments to keep threads logically organized.

SUBMIT ANSWER is for new answers to the original question.

ADD REPLYlink written 9 months ago by genomax69k

How did you obtain this sam file?

ADD REPLYlink written 9 months ago by WouterDeCoster40k
India
bioexplorer4.2k wrote:

First, just convert the SAM file to BAM:

samtools view -bS your_file.sam > your_file.bam

Using samtools v 1.4.1

samtools quickcheck -v -v -v -v   your_file.bam && echo 'all ok'

Then paste the output for your file.

ADD COMMENTlink modified 9 months ago • written 9 months ago by bioexplorer4.2k

@Vijay: Samtools is now at v.1.9. You should upgrade.

ADD REPLYlink written 9 months ago by genomax69k

I tried with samtools v1.9.

SRR.sam => 47 GB

samtools view -bS SRR.sam > SRR.bam 
[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.

SRR.bam => 1.9 GB
samtools quickcheck -v -v -v -v SRR.bam  && echo 'all ok' 
verbosity set to 4
checking SRR.bam
opened SRR.bam
SRR.bam is sequence data
SRR.bam has 455 targets in header.
SRR.bam has good EOF block.
all ok
ADD REPLYlink modified 9 months ago by bioexplorer4.2k • written 9 months ago by sunnykevin9710

There is something wrong with your file at that specified line.

Can you pull out lines around that error location: sed -n '10363595,10363597p;10363598q' your_file?

ADD REPLYlink written 9 months ago by genomax69k
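Looking at one line only shows the first problem samtools hit. A rough way to list every suspicious line in the file is sketched below. It assumes that a valid alignment line has at least 11 tab-separated fields and that SEQ (field 10) and QUAL (field 11) have equal length unless one of them is `*`. The function name `find_bad_sam_lines` is made up for this illustration and is not part of samtools.

```shell
# Hypothetical helper (not a samtools command): print the line number of
# every non-header line that does not look like a complete SAM record.
find_bad_sam_lines() {
    awk -F'\t' '
        /^@/ { next }                      # skip header lines
        NF < 11 {                          # fewer than the 11 mandatory fields
            print NR ": only " NF " field(s)"
            next
        }
        $10 != "*" && $11 != "*" && length($10) != length($11) {
            print NR ": SEQ/QUAL length mismatch"
        }
    ' "$1"
}

# Usage (file name taken from the thread):
#   find_bad_sam_lines SRR.sam
```

This only checks the field count and the SEQ/QUAL lengths, so a record can pass here and still be rejected by samtools for other reasons.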

sed -n '10363595,10363597p;10363598q' SRR.sam

SRR.5228479 99  chr6    65895862    3   250M    =   65896126    513 TGGCCAATATGAAGGAGCTACTGGCTTCATGAAATTCCAAAATCCAGTGGGTCAGCAATTAAATCTTAAAGCTCCAAAATGACCTCCTTTGATTCCATGTCTCACACTTAGGCATGCTGATACAAGGGGTGGGCTCCCAAGGACTTGGGCAGCTCTGCCTCTGTGGCTCTTCAGGGTACATACCCCATGACTGCTTTCACAGGCTGGCATTGAGTGTCTGCTGTTTTTCCAAGTGCACAGTGCAAGCTGT  DDDDDIIIIIIIIIIGIIIIIIHIIIHHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIHHIIIIIIIIIIGIIIIIIHIIIGHIIIIIIIIIIIIIIIIIIIIIIHHHIIIIIHHIIIIIIHIIIIHGIIIIIIIIIIIIIIIIIIIGIGIIGGGHHHHHGHEHHIIHFHIHHIIEHHGHIIIIHIHIICHEEC7FEHIHIGH?BAFEEHIHCHFGIHIIHHII?GIIH.  PQ:i:21 SM:i:3  UQ:i:0  MQ:i:0  XQ:i:0  NM:i:0
SRR.5228479 147 chr6    65896126    3   249M    =   65895862    -513    TTCTGGTGTATGGAGGATGGTGGCCATCTTCTCACAGCTCCACTAGGCACTGCCCCAATGGGCACTCAGTGTTGGGGCTCCAACCACACATTTCTCCTCTGTATTGCCCTAGTAGAGGTTTTCCATGAWarning - out of bounds at position (28,-1)
Warning - out of bounds at position (27,-1)
ADD REPLYlink modified 9 months ago by genomax69k • written 9 months ago by sunnykevin9710

You could try to delete that problem record by doing: sed -i '10363596d' your.sam and see if that fixes the problem. Make a backup of the file (if you want to be careful) since the command will make the change in place. Otherwise you may need to re-do alignment.

ADD REPLYlink modified 9 months ago • written 9 months ago by genomax69k

I tried it, but samtools still reports a truncated SAM file.

sed -i '10363596d' SRR.sam

samtools view -u SRR.sam | samtools sort -@ 10 -T test/SRR.sort -o SRR.bam

[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.
[bam_sort_core] merging from 0 files and 10 in-memory blocks...

Generated BAM file: 1.8 GB

ADD REPLYlink written 9 months ago by sunnykevin9710
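A likely reason the error stays at the same line number: the sed output earlier in the thread shows a second "Warning - out of bounds" line directly after the broken record, so once line 10363596 is deleted that stray line moves up into position 10363596. Instead of deleting lines one at a time, a filter can write a cleaned copy and leave the original untouched. This is only a sketch under the same assumptions as any quick check (at least 11 tab-separated fields, SEQ and QUAL of equal length unless one is `*`), and `drop_bad_sam_lines` is a made-up name, not a samtools command.

```shell
# Hypothetical helper (not a samtools command): copy a SAM file to stdout,
# keeping header lines and complete-looking records, dropping everything else.
drop_bad_sam_lines() {
    awk -F'\t' '
        /^@/ { print; next }               # keep header lines as they are
        NF >= 11 && ($10 == "*" || $11 == "*" || length($10) == length($11))
    ' "$1"
}

# Usage (file names taken from the thread; the original SAM is not modified):
#   drop_bad_sam_lines SRR.sam | samtools view -u - | samtools sort -@ 10 -T test/SRR.sort -o SRR.sort.bam
```

Dropping the broken record leaves its mate (the flag 99 read of SRR.5228479) without a partner, and if the SAM file was cut off while it was being written, any reads that came after the damage are missing entirely. The filter cannot bring those back, so redoing the alignment may still be needed.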

You may want to cut your losses and redo the alignment. Even if you manage to get past this error there may be something else wrong.

ADD REPLYlink written 9 months ago by genomax69k