Question: GSNAP aligner generating a truncated SAM file?
23 months ago by
pinninti1991reddy wrote:

Hi, I tried to align more than 10 human genome datasets against hg38. GSNAP completes the alignment, but the output appears to be a truncated SAM file. Can anyone suggest how to sort this out?

Command:

./gsnap -d hg38 -D /data/Likith/gmap-2018-05-30/share/hg38 -t 10 /data/shayantan/cancer_samples/sample1/fixed1_normal.fq  /data/shayantan/cancer_samples/sample1/fixed2_normal.fq  > S1.gsnap.sam

Input file sizes:

R1 & R2 - 97 GB, 97 GB
hg38 - 3.1 GB

Output file:

S1.gsnap.sam - 410 GB

Viewing the sam file:

$ samtools view S1.testgsnap.sam | head
[W::sam_read1] Parse error at line 1
[main_samview] truncated file
modified 23 months ago by h.mon • written 23 months ago by pinninti1991reddy

Try just head S1.testgsnap.sam.

Also, are you sure you want all of the datasets merged together like this? Usually you want them as separate files.

written 23 months ago by Devon Ryan

Hi, I'm able to view the SAM file with head; that is not a problem. But when I convert the SAM to a sorted BAM, samtools reports a truncated SAM file. Is there any other way I can convert it to a sorted BAM?

$ head S1.gsnap.sam 
>ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAAGATCGGAAG  2 unpaired  CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJIIJJJJJJJJJJJJJJJJIEHHHFFFEEDCEEDDDDDCDDDDDCDDDDCDCCCCCDBDB<   HSQ-700848:338:D168LACXX:1:2307:7299:80233
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAAccctaaccc   1..92   +chr5:11491..11582  start:0..end:9,matches:92,sub:0 segs:1,align_score:9,mapq:3 method:ext
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAAccctaaccc   1..92   -chr22:50807994..50807903   start:0..end:9,matches:92,sub:0 segs:1,align_score:9,mapq:3method:ext
<TTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTAGATCGGAAG  1 unpaired  BC@FFFDDFHHHFGGGIJFHGHIJEHHIJJFHHIJJDFDGHIDHHIJJBFHHIJFHIIJHHHEHFFDFCEEE>B=BDD:AABDDABABDDCCBCDDDD?B>   HSQ-700848:338:D168LACXX:1:2307:7299:80233
TTAGGGTTAGGGTTAGGGTTAGGGTTAa-------------------------------------------------------------------------   1..27   -chr18:10187..10161 start:0..del:1,matches:27,sub:0 segs:3,align_score:8,mapq:40    method:ext-gmap
,---------------------------GGGTTAGGGTTAa-------------------------------------------------------------  28..39  -chr18:10159..10148 del:1..del:1,matches:12,sub:0
,---------------------------------------GGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTAGggttaggg  40..93  -chr18:10146..10093 del:1..end:8,matches:54,sub:0

>CCCTAACCCTGACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAGATCGGAAGAG  1 unpaired  @@@BDEFFFH?DFADGGGGGEGHIGCEDDGHHGGIGIIGIFEGGGIIIIIGHIIIHIGI@ECEHHC@BEC@EEA=A=ABAAACCB<AACAC>@>8<?<9A<   HSQ-700848:338:D168LACXX:1:2111:5061:50207

Then with samtools:

samtools view -u S1.gsnap.sam | samtools sort -@ 20 -T /tmp/S1.gsnap.sam.sort -o  S1.gsnap.sam.sort.bam
[W::sam_read1] Parse error at line 1
[main_samview] truncated file.
modified 23 months ago by Devon Ryan • written 23 months ago by pinninti1991reddy

Ah, it's not actually writing a SAM file! Maybe there's an option to change that?

written 23 months ago by Devon Ryan
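
A quick way to confirm this kind of mismatch before handing a file to samtools is to look at its first line: a SAM file starts either with an @ header line (@HD, @SQ, ...) or with an alignment record of at least 11 tab-separated fields, whereas the output shown above starts with a line beginning with >. The helper below is only a rough sketch; the function name looks_like_sam is hypothetical and is not part of samtools or GSNAP.

```shell
# looks_like_sam FILE (hypothetical helper, not a samtools/GSNAP command):
# succeeds if the first line of FILE is a SAM header line (@HD, @SQ, ...)
# or an alignment record with at least the 11 mandatory tab-separated fields.
looks_like_sam() {
    head -n 1 "$1" | awk -F '\t' '
        /^@[A-Z][A-Z](\t|$)/ { ok = 1 }   # header line such as @HD or @SQ
        NF >= 11             { ok = 1 }   # alignment record
        END                  { exit(ok ? 0 : 1) }'
}
```

For example, looks_like_sam S1.gsnap.sam || echo "not SAM" would print "not SAM" for a file whose first line looks like the head output above. It inspects only the first line, so it cannot detect a file that is valid at the start and genuinely truncated later.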
23 months ago by
h.mon (Brazil) wrote:

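You have to request SAM output explicitly with -A sam or --format=sam. By default GSNAP writes its own native alignment format, which is why samtools cannot parse the file.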

  -A, --format=STRING            Another format type, other than default.
                                 Currently implemented: sam, m8 (BLAST tabular format)
written 23 months ago by h.mon
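
Putting the answer together with the commands from the question, the alignment and sorting steps could be combined roughly as follows. This is a sketch, not a tested pipeline: the wrapper name gsnap_to_sorted_bam and the GSNAP, GSNAP_DB_DIR and THREADS variables are hypothetical, and their defaults are simply copied from the original command. Piping into samtools sort is a design choice made here to avoid keeping a several-hundred-GB intermediate SAM file on disk.

```shell
# Hypothetical wrapper (not part of GSNAP or samtools).
# Usage: gsnap_to_sorted_bam READS_1.fq READS_2.fq OUT.bam
# GSNAP, GSNAP_DB_DIR and THREADS are illustrative variables; the defaults
# are the values used in the question.
gsnap_to_sorted_bam() {
    # -A sam makes GSNAP emit SAM instead of its native format;
    # "-" tells samtools sort to read that SAM stream from stdin.
    "${GSNAP:-./gsnap}" -d hg38 \
        -D "${GSNAP_DB_DIR:-/data/Likith/gmap-2018-05-30/share/hg38}" \
        -t "${THREADS:-10}" -A sam "$1" "$2" \
      | samtools sort -@ "${THREADS:-10}" -T "${3%.bam}.tmp" -o "$3" -
}
```

Note that the exit status of the pipeline is that of samtools sort, so a GSNAP failure partway through may go unnoticed; in bash, set -o pipefail is one way to surface it. Running each sample through its own call also keeps the datasets in separate output files, as suggested earlier in the thread.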

I'm able to view the output. Thanks

written 23 months ago by pinninti1991reddy