Question: SAM to BAM conversion problem after HISAT2.
13 days ago, fmerkal (USA/Lowell/University of Massachusetts Lowell) wrote:

Hello,

I recently used HISAT2 to align my paired-end reads to a reference genome. The command I used was: hisat2 -q -p 4 -x path_to_index -1 reads_1.fastq -2 reads_2.fastq -S out.sam. This produced a SAM file of ~14 GB whose headers matched the ones I looked up online for this type of file.

I am having trouble converting my SAM output file to a BAM file using the following command: samtools view -bS out.sam > out.bam. I should mention that the samtools version I am using is 1.4.1 and the syntax should be OK. The resulting out.bam file is only 45 bytes, and when I try to check it with samtools flagstat out.bam, I get the following errors:

[W::sam_read1] parse error at line 1
[bam_flagstat_core] Truncated file? Continue anyway.

I have tried looking into this problem but couldn't find an answer that related specifically to my issue. Any help would be greatly appreciated. Cheers, Fjodor

Tags: rna-seq software error
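Editor's sketch: a 45-byte out.bam is far too small to hold alignments from a ~14 GB SAM file, so a quick first check is whether the file is BAM-like at all. BAM files are BGZF-compressed, which means they begin with the gzip magic bytes 1f 8b. The helper below is hypothetical (the name looks_like_bgzf is not part of samtools) and only inspects the file size and the first two bytes; it does not validate the BAM contents.

```shell
#!/bin/sh
# Hypothetical helper (not a samtools command): report a file's size and
# whether it starts with the gzip/BGZF magic bytes 1f 8b.
looks_like_bgzf() {
    file=$1
    # Size in bytes (tr strips padding that some wc implementations add)
    size=$(wc -c < "$file" | tr -d ' ')
    # First two bytes as a hex string, e.g. "1f8b" for gzip/BGZF data
    magic=$(head -c 2 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo "$file: $size bytes, gzip/BGZF magic present"
    else
        echo "$file: $size bytes, NOT gzip/BGZF (first bytes: $magic)"
        return 1
    fi
}
```

Usage would be looks_like_bgzf out.bam. If the first bytes turn out to be plain text, it can help to look at the file directly (for example with cat out.bam), since a tiny output file may contain an error message rather than alignment data.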

What is the output of head out.sam and samtools view out.bam | head? Did you use something like nohup during alignment?

modified 13 days ago • written 13 days ago by ATpoint

Hi, I did not use nohup. I am running hisat2 on a cluster and I received output describing the alignment rate (overall alignment ~92%). I did not see other files, like the various .log files you would get if you ran TopHat. The output I get with head out.sam is listed below.

@HD     VN:1.0  SO:unsorted
@SQ     SN:MT   LN:16775
@SQ     SN:W    LN:6813114
@SQ     SN:Z    LN:82529921
@SQ     SN:1    LN:197608386
@SQ     SN:2    LN:149682049
@SQ     SN:3    LN:110838418
@SQ     SN:4    LN:91315245
@SQ     SN:5    LN:59809098
@SQ     SN:6    LN:36374701

When I run samtools view out.bam | head, I get

[W::sam_read1] parse error at line 1
[main_samview] truncated file.
modified 13 days ago • written 13 days ago by fmerkal
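Editor's sketch: head out.sam shows only the first ten lines, and here all of them are header lines, so it does not confirm that alignment records follow. One way to check, assuming a standard SAM file in which header lines start with @ and alignment lines do not, is to count both kinds in a single pass. The function name sam_line_counts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count header lines (starting with '@') and
# alignment records (everything else) in one pass over a SAM file.
sam_line_counts() {
    sam=$1
    awk '
        /^@/  { header++ }    # @HD, @SQ, @PG, ... header lines
        !/^@/ { records++ }   # one line per alignment record
        END   { printf "header=%d records=%d\n", header, records }
    ' "$sam"
}
```

Usage would be sam_line_counts out.sam. A records count of zero would point at the alignment step rather than at the SAM to BAM conversion.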

Note that above you ran

samtools view out.bam

on a file that is empty, so it will, of course, raise an error. See what

samtools view out.sam

prints

I think your BAM file is empty, for whatever reason. You would need to run the conversion again and see why it failed. I think it ran out of "something" and truncated the BAM file, but the SAM file should be fine.

modified 13 days ago • written 13 days ago by Istvan Albert
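Editor's sketch of "run the conversion again and see why it failed": the wrapper below writes the BAM with -o instead of shell redirection, saves anything samtools prints on stderr to a log file, and reports the exit status. The function name convert_sam_to_bam and the log file argument are hypothetical. It assumes the installed samtools provides the quickcheck subcommand, which checks that the file has a valid header and an end-of-file block.

```shell
#!/bin/sh
# Hypothetical wrapper: convert SAM to BAM, keep the error messages,
# and make a failed conversion visible through the exit status.
convert_sam_to_bam() {
    sam=$1
    bam=$2
    log=$3
    # -b: write BAM; -o: write to the named file instead of stdout
    if samtools view -b -o "$bam" "$sam" 2> "$log"; then
        # quickcheck exits non-zero if the header or EOF block is missing
        samtools quickcheck "$bam" && echo "OK: $bam"
    else
        status=$?
        echo "samtools view failed (exit $status); see $log" >&2
        return "$status"
    fi
}
```

Usage would be convert_sam_to_bam out.sam out.bam view.log. On a cluster it is also worth checking the scheduler's own job log and the free space and quota of the output directory, since a job killed from outside may leave nothing in view.log.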
13 days ago, fmerkal (USA/Lowell/University of Massachusetts Lowell) wrote:

Hello, I think I might have solved the problem. I used samtools sort to sort the SAM file and write it out in BAM format with the command samtools sort -o out_sorted.bam out.sam. This seems to have created a BAM file of reasonable size (~2.9 GB, relative to the ~14 GB SAM file) that I can view with samtools view out_sorted.bam.

This post was particularly helpful - htseq-count: error when reading beginning of SAM/BAM file. I still don't understand why it wasn't working when I used samtools view -bS out.sam > out.bam, but it seems fine for now. Thank you for all your help :)

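Editor's sketch of the workaround described above, extended with two checks: the sort command is the one from the post, followed by samtools index and samtools flagstat on the sorted file. The function name sort_and_check is hypothetical, and each step only runs if the previous one succeeded.

```shell
#!/bin/sh
# Hypothetical wrapper around the sort-based workaround:
# sort SAM into BAM, index the result, then print alignment statistics.
sort_and_check() {
    sam=$1
    sorted=$2
    samtools sort -o "$sorted" "$sam" &&   # coordinate-sorted BAM output
    samtools index "$sorted" &&            # writes the .bai index next to it
    samtools flagstat "$sorted"            # summary counts of mapped reads etc.
}
```

Usage would be sort_and_check out.sam out_sorted.bam. The flagstat summary can then be compared with the overall alignment rate (~92%) that HISAT2 reported.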