Error when trying to use samtools to convert a SAM file to a BAM file
6.5 years ago

I am attempting to convert a SAM file that I obtained as output from Bowtie into a BAM file so that I can use it in MACS2. (This is an input file.)

Here is the command line I use when attempting to do this:

samtools view -bS /Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_HCT116.sam > Input_HCT116.bam


However, I continually get this error (I re-aligned the fastq files twice to make sure this wasn't a bowtie error, and the downstream sam to bam conversion error was the same):

[W::sam_read1] parse error at line 28
[main_samview] truncated file.


Here is what the data looks like:

Carlos$ samtools view -H /Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_HCT116.sam
@HD    VN:1.0    SO:unsorted
@SQ    SN:chr1    LN:249250621
@SQ    SN:chr2    LN:243199373
@SQ    SN:chr3    LN:198022430
@SQ    SN:chr4    LN:191154276
@SQ    SN:chr5    LN:180915260
@SQ    SN:chr6    LN:171115067
@SQ    SN:chr7    LN:159138663
@SQ    SN:chr8    LN:146364022
@SQ    SN:chr9    LN:141213431
@SQ    SN:chr10    LN:135534747
@SQ    SN:chr11    LN:135006516
@SQ    SN:chr12    LN:133851895
@SQ    SN:chr13    LN:115169878
@SQ    SN:chr14    LN:107349540
@SQ    SN:chr15    LN:102531392
@SQ    SN:chr16    LN:90354753
@SQ    SN:chr17    LN:81195210
@SQ    SN:chr18    LN:78077248
@SQ    SN:chr19    LN:59128983
@SQ    SN:chr20    LN:63025520
@SQ    SN:chr21    LN:48129895
@SQ    SN:chr22    LN:51304566
@SQ    SN:chrX    LN:155270560
@SQ    SN:chrY    LN:59373566
@SQ    SN:chrM    LN:16571
@PG    ID:Bowtie    VN:1.1.2    CL:"bowtie --wrapper basic-0 -t /Users/Carlos/Downloads/hg19.ebwt/hg19 -S -m 1 -p 4 -q /Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_S1_L001_R1_001.fastq,/Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_S1_L002_R1_001.fastq,/Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_S1_L003_R1_001.fastq,/Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_S1_L004_R1_001.fastq,

I've attempted to solve this, and looked online but could find nothing that helps me understand what the error is and how to troubleshoot it. Can anyone help?

6.5 years ago

Which version of SAMtools are you using? The syntax for the latest version is:

samtools view -b -o OUTPUT.bam INPUT.sam

I get the same error when running that command. This is my version:

carloss-imac:scripts Carlos$ samtools --version
samtools 1.2
Using htslib 1.2.1
Copyright (C) 2015 Genome Research Ltd.

How many lines does your SAM file have? (command: wc -l INPUT.sam)

And what is the problematic line 28? (command: sed -n 28p INPUT.sam)

Sorry, I didn't have access to the data during the weekend. Here are the results:

19286734 /Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/Input_HCT116.sam

NS500579:38:HC3MLBGXX:1:11101:2698:1069    4    *    0    0    *    *    0    AATGCNGTCATCACAGGAAACATTCTGAGAATGCTTCTGTCTAGGTTTG    AA6AA#EEEEEEEEEEEEEEEEEEEEEAEE/EEEEEEEEEEEEEAEEEE    XM:i:1

Is it possible that the workstation I am doing this on simply can't handle the Bowtie alignment, so the run is failing? I'm running an iMac with 8 GB of RAM.

I ask because when I do head -50, I end up with something like this:

@HD    VN:1.0    SO:unsorted
@SQ    SN:chr1    LN:249250621
@SQ    SN:chr2    LN:243199373
@SQ    SN:chr3    LN:198022430
@SQ    SN:chr4    LN:191154276
@SQ    SN:chr5    LN:180915260
@SQ    SN:chr6    LN:171115067
@SQ    SN:chr7    LN:159138663
@SQ    SN:chr8    LN:146364022
@SQ    SN:chr9    LN:141213431
@SQ    SN:chr10    LN:135534747
@SQ    SN:chr11    LN:135006516
@SQ    SN:chr12    LN:133851895
@SQ    SN:chr13    LN:115169878
@SQ    SN:chr14    LN:107349540
@SQ    SN:chr15    LN:102531392
@SQ    SN:chr16    LN:90354753
@SQ    SN:chr17    LN:81195210
@SQ    SN:chr18    LN:78077248
@SQ    SN:chr19    LN:59128983
@SQ    SN:chr20    LN:63025520
@SQ    SN:chr21    LN:48129895
@SQ    SN:chr22    LN:51304566
@SQ    SN:chrX    LN:155270560
@SQ    SN:chrY    LN:59373566
@SQ    SN:chrM    LN:16571
@PG    ID:Bowtie    VN:1.1.2    CL:"bowtie --wrapper basic-0 -t /Users/Carlos/Downloads/hg19.ebwt/hg19 -S -m 1 -p 4 -q /Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/SRR2229188 _1__1.fastq,/Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/SRR2229189_1.fastq,/Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/SRR2229190_1.fastq,/Users/Carlos/Desktop/Enhancer.Project/HCT116_Enhancers/SRR2229191_1.fastq,"
SRR2229188    4    *    0    0    *    *    0    0    AATGCNGTCATCACAGGAAACATTCTGAGAATGCTTCTGTCTAGGTTTGN    AA6AA#EEEEEEEEEEEEEEEEEEEEEAEE/EEEEEEEEEEEEEAEEEE#    XM:i:1
SRR2229188    4    *    0    0    *    *    0    0    TTAGANCTACATTCTCTAATATTTATAAATGATATCACATTGTGCCTGNN    AA6AA#EEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE<EEEEEEE6EE##    XM:i:1
SRR2229188    4    *    0    0    *    *    0    0    GTGCTNTGCTTTTAGATATGCATACACATAAACATCTCAATGCTTTACAN    AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE#    XM:i:1
SRR2229188    4    *    0    0    *    *    0    0    CCGTCNCTACTAAAAAAAAATACAAACAATTAGCCAGGCATGGTGGCAGG    AAAAA#AAEAEEEEEEEEEEEEEEEEEEEEEEEAEEEEEAEEE//EA/AE    XM:i:1
X?<?XT<?gg??+?SRR2229188    4    *    0    0    *    *    0    0    CCCCANCGCGGCCCTGAGCTTCCCGCGCCCCCACCGCTGCCCTGAGCTTC    AAAAA#EEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEE    XM:i:1
X?<?XT<?gg??+?SRR2229188    4    *    0    0    *    *    0    0    GGATANAGAGTCAAGACGTATCAGTGTGCTGTATTCAGGAAACCCATCTC    AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEE    XM:i:1
SRR2229188    4    *    0    0    *    *    0    0    AAACANAAAAAAATTAGTACTAAGCTTGAATGTACTTCCCACAGAAGGCN    AAAAA#EEEEEEEEEE6EEEEEEE/EEAEEAEE//EEEAAE/E/E/EE/#    XM:i:0
SRR2229188    4    *    0    0    *    *    0    0    TACTCNTAAAACTAGGCGGCTATGGTATAATACGCCTCACACTCATTCTC    /AAAA#EAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEE    XM:i:1
SRR2229188    4    *    0    0    *    *    0    0    CAAGGNCAAGTCTTTTTTTTTTTTTTTTTTTTTAATATGAGGGCAAGCAC    6AAAA#EEEAEEEAEEEEEEEAEE/EAEE/<A//////////////////    XM:i:0
SRR2229188    4    *    0    0    *    *    0    0    TCTATNTGTAGTATCTGGAAGTGGACATTTGGAGGGCTTTGTAGCCTATG    AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE    XM:i:1
?#?
?J?#??n
8??? p??=?'??M*? p??p
??M*??M*<??=?,?=?0+??(?>?/,??n
80?T??0p?v
=?00?? p??p
/??0???a
=?,w
=?X?<?X<?vV???0+??(?>?/,??n
80?T??0p?v
=?00?? p??p
/??0???a
=?,w
=?SRR2229188    4    *    0    0    *    *    0    0    GTATCNTCTGGCAAGCATAGGGGACTGCAGTCGACAATGCTGCTGANNNN    6AAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEE####    XM:i:1
?#?
?J?#??n
8??? p??=?'??M*? p??p
??M*??M*<??=?,?=?X?<?X<?vV????#?
?J?#??n
8??? p??=?'??M*? p??p
??M*??M*<??=?,?=?X?<?X<?vV???0+??(?>?/,??n
80?T??0p?v
=?00?? p??p
/??0???a
=?,w


And I somehow feel that all those ?'s should not be there.
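
A rough way to quantify that suspicion: SAM output from an aligner like this is expected to be plain text, so bytes other than printable characters, tabs, and newlines point to corruption. The sketch below builds a tiny made-up file (`demo_sam` and its contents are hypothetical stand-ins; substitute the real SAM path) and counts the stray bytes with `tr` and `wc`.

```shell
# Hypothetical stand-in for the real SAM file: one clean line, one corrupted line.
demo_sam=$(mktemp)
printf 'read1\t4\t*\t0\n' > "$demo_sam"
printf '\001\002read2\t4\t*\t0\n' >> "$demo_sam"

# Delete every printable or whitespace byte; whatever remains is suspect.
# LC_ALL=C makes tr treat the input as raw bytes rather than multibyte text.
bad_bytes=$(LC_ALL=C tr -d '[:print:][:space:]' < "$demo_sam" | wc -c)
bad_bytes=$((bad_bytes))   # strip the padding some wc implementations add

echo "suspect bytes: $bad_bytes"
```

A count of zero on the real file would suggest the `?` characters are a display issue; a non-zero count suggests the file itself is damaged.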

This SAM is different from the one shown above. It's difficult to help you troubleshoot if the information you provide is not consistent.

This file is full of errors. It appears that Bowtie failed. The specs suggest that memory should be sufficient, but you may want to repeat the run using only one core. Also, the syntax does not match the version of Bowtie that you're using (1.1.2). Check the manual. Finally, you may also need to rebuild the reference index if it was generated with an earlier version (I'm not sure on this last point). There are also instructions for building the index on low-memory machines.

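The problematic line 28 in the earlier file does not conform to the SAM format specification. It's missing a required field: either field 8 (PNEXT) or field 9 (TLEN), so the read sequence ends up in the position where an integer is expected.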
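
As a rough illustration of how such lines could be found across the whole file: counting columns alone is not enough, because an optional tag such as `XM:i:1` can bring a short record back up to 11 columns. A minimal sketch with `awk`, assuming tab-separated fields and run here on a made-up file (`demo_sam` and its records are hypothetical), flags every non-header line that has fewer than 11 fields or a non-integer value in field 8 or 9:

```shell
demo_sam=$(mktemp)
# Header, one well-formed record, and one record with field 9 missing
# (the optional tag still brings it to 11 columns).
printf '@HD\tVN:1.0\tSO:unsorted\n' > "$demo_sam"
printf 'r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tXM:i:1\n' >> "$demo_sam"
printf 'r2\t4\t*\t0\t0\t*\t*\t0\tACGT\tIIII\tXM:i:1\n' >> "$demo_sam"

# Print the line number of each suspicious alignment record.
bad_lines=$(awk -F'\t' '
    !/^@/ && (NF < 11 || $8 !~ /^[0-9]+$/ || $9 !~ /^-?[0-9]+$/) { print NR }
' "$demo_sam")
echo "$bad_lines"
```

On the demo file this reports line 3, the record whose ninth column holds the sequence instead of an integer.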

Sorry, I am aware that the files are different: I re-ran Bowtie and then the SAM-to-BAM conversion. Both files had these same errors, so both must be failing at the Bowtie step. Let me try your suggestions. The syntax may well be the problem, since I'm simply following a protocol from our prior bioinformatician; I'm basically just a wet-lab research tech, and it's been an uphill battle to say the least. The index is pre-built; I simply downloaded it. I will post another comment when it's done. Thank you for the help so far!

From the Bowtie 2 website:

#### Are Bowtie 2 and Bowtie 1 genome indexes compatible?

"No. Bowtie 2 indexes are formatted differently. Bowtie 1 indexes do not work with Bowtie 2 and Bowtie 2 indexes do not work with Bowtie 1."

Unless you are positive that the index was generated with the same version that you're using, you should rebuild it. That also allows you to take advantage of the low memory option.

I'm positive that the index was built for Bowtie, since I used Bowtie's prebuilt options. I've run Bowtie before and have never run into these problems, so I have no idea what's wrong. I went ahead and redid everything using the different syntax, a freshly downloaded prebuilt index, and one processor, but I get the same error on the same line.

1. Please post the command that you used in your most recent attempt to run Bowtie.
3. If you've successfully run Bowtie before on the identical computer using the identical index and identical command line (except for a different input), then the input files are the problem. If your previous Bowtie experience was not with the identical computer/index/command line, then any of these variables might be the culprit.
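
If the input files are the suspect, a quick structural check of each FASTQ file can help rule out truncation or corruption before re-running Bowtie. The following is only a sketch, assuming four-line FASTQ records (no wrapped sequences); `demo_fq` and its reads are made up for illustration, and the real FASTQ paths would be substituted.

```shell
demo_fq=$(mktemp)
# Record 1 is well-formed; record 2 has a quality string shorter than its sequence.
printf '@r1\nACGT\n+\nIIII\n' > "$demo_fq"
printf '@r2\nACGTA\n+\nIII\n' >> "$demo_fq"

# Check each line by its position within the four-line record.
fq_report=$(awk '
    NR % 4 == 1 && !/^@/  { print "line " NR ": header does not start with @" }
    NR % 4 == 2           { seq_len = length($0) }
    NR % 4 == 3 && !/^\+/ { print "line " NR ": separator does not start with +" }
    NR % 4 == 0 && length($0) != seq_len {
        print "line " NR ": quality length differs from sequence length"
    }
    END { if (NR % 4 != 0) print "file ends mid-record (" NR " lines)" }
' "$demo_fq")
echo "$fq_report"
```

An empty report does not prove the file is good, but any output here is a strong hint that the aligner is being fed damaged input.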
6.5 years ago
Ian 5.8k

MACS is not limited to SAM files. It will also accept "BAM", or "BAMPE" if the data is paired-end.

If samtools index runs to completion on the BAM file, then the file should be OK. You could also run samtools flagstat to make sure the total number of reads is what you expect.
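
The expected total can be estimated from the text files themselves as a rough cross-check. This sketch assumes single-end reads, four-line FASTQ records, and an aligner that writes one SAM record per input read; under those assumptions the number of non-header SAM lines should equal the FASTQ line count divided by four. The demo files and their contents are hypothetical.

```shell
demo_fq=$(mktemp)
demo_sam=$(mktemp)
# Two reads in the FASTQ, and a SAM file with one header line plus two records.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTTT\n+\nIIII\n' > "$demo_fq"
printf '@HD\tVN:1.0\tSO:unsorted\n' > "$demo_sam"
printf 'r1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' >> "$demo_sam"
printf 'r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII\n' >> "$demo_sam"

# Reads in the FASTQ: total lines divided by four.
fastq_reads=$(( $(wc -l < "$demo_fq") / 4 ))
# Alignment records in the SAM: every line that is not a header (@...) line.
sam_records=$(grep -vc '^@' "$demo_sam")

echo "FASTQ reads: $fastq_reads, SAM records: $sam_records"
```

A large mismatch between the two numbers, or between either of them and the flagstat total, points to the step where records were lost.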

I am aware that MACS will accept SAM or BAM files. The problem is that all my other samples already have BAM files, and only my input data refuses to convert from SAM to BAM. The resulting BAM file is only 687 bytes, so I doubt that it's usable when the SAM file is over 5 GB.