Hi all,
I am currently analyzing a pop-seq dataset and have run into a problem when converting my .sam file into a .bam file.
Here is what I did so far:
1) I indexed the genome using Bowtie2 v. 2.4.2
bowtie2-build GCF_000001215.4_Release_6_plus_ISO1_MT_genomic.fna philip_indexed_genome
2) I then mapped the reads using Bowtie2 v. 2.4.2
bowtie2 -x philip_indexed_genome -1 b2_HCHKKDSXY_L1_1.fq -2 b2_HCHKKDSXY_L1_2.fq -S b2_bowtie2_to_indexed_genome.sam
When looking at the output, the mapping seems to have been successful:
302546831 reads; of these:
  302546831 (100.00%) were paired; of these:
    31467015 (10.40%) aligned concordantly 0 times
    217839984 (72.00%) aligned concordantly exactly 1 time
    53239832 (17.60%) aligned concordantly >1 times
    31467015 pairs aligned concordantly 0 times; of these:
      10785696 (34.28%) aligned discordantly 1 time
    20681319 pairs aligned 0 times concordantly or discordantly; of these:
      41362638 mates make up the pairs; of these:
        26239879 (63.44%) aligned 0 times
        7729764 (18.69%) aligned exactly 1 time
        7392995 (17.87%) aligned >1 times
95.66% overall alignment rate
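As a sanity check on the summary, the overall alignment rate can be recomputed from the counts above. This is only a sketch: the variable names are mine, and it assumes the rate is counted per mate, with each concordantly or discordantly aligned pair contributing two mates.

```shell
# Counts copied from the Bowtie2 summary above (variable names are my own).
pairs=302546831
concordant_once=217839984
concordant_multi=53239832
discordant_once=10785696
mates_once=7729764
mates_multi=7392995

# Every pair consists of two mates.
total_mates=$((pairs * 2))

# Aligned pairs contribute two mates each; the remaining mates count singly.
aligned_mates=$(( (concordant_once + concordant_multi + discordant_once) * 2 + mates_once + mates_multi ))

# Percentage with two decimals (LC_ALL=C keeps the decimal point a dot).
rate=$(LC_ALL=C awk -v a="$aligned_mates" -v t="$total_mates" 'BEGIN { printf "%.2f", 100 * a / t }')

echo "aligned mates: $aligned_mates of $total_mates ($rate%)"
```

The result matches the 95.66% that Bowtie2 reported, so the summary itself looks internally consistent.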
3) However, when I tried to convert the resulting .sam file into a .bam file with samtools 1.9 using
samtools view -Sb b2_bowtie2_to_indexed_genome.sam > b2_bowtie2_to_indexed_genome.bam
I get the following message:
[W::sam_read1] Parse error at line 545655373
[main_samview] truncated file.
I tried to look at that line using
sed -n 545655372p b2_bowtie2_to_indexed_genome.sam > output_from_sam2
but the output file is empty. I also counted the number of lines in the .sam file using
wc -l b2_bowtie2_to_indexed_genome.sam
and got
198853022
which is less than the line flagged as problematic.
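For comparison, here is a rough estimate of how many lines the .sam file should contain, plus a small check of the file's last line. Both are sketches under assumptions: I assume Bowtie2 in its default mode writes one record per mate (unaligned mates included), and `check_sam_tail` is a hypothetical helper of my own, not part of samtools.

```shell
# Expected number of alignment records: one per mate, header lines not counted.
pairs=302546831
expected_records=$((pairs * 2))

# Line count reported by wc -l, and the line named in the samtools warning.
reported_lines=198853022
parse_error_line=545655373

if [ "$reported_lines" -lt "$expected_records" ]; then
    echo "wc -l shows fewer lines than the $expected_records expected records"
fi
if [ "$parse_error_line" -lt "$expected_records" ]; then
    echo "the parse error occurs before the expected end of the file"
fi

# Hypothetical helper: prints "complete" if the last line of a SAM file has
# the 11 mandatory tab-separated fields and the file ends with a newline,
# otherwise prints "truncated".
check_sam_tail() {
    sam="$1"
    # Number of tab-separated fields on the final line.
    nfields=$(tail -n 1 "$sam" | awk -F'\t' '{ print NF }')
    # Command substitution strips a trailing newline, so an empty value
    # here means the file's last byte is a newline.
    last_byte=$(tail -c 1 "$sam")
    if [ "${nfields:-0}" -ge 11 ] && [ -z "$last_byte" ]; then
        echo "complete"
    else
        echo "truncated"
    fi
}
```

Running `check_sam_tail b2_bowtie2_to_indexed_genome.sam` would show whether the file ends in the middle of a record. Both numbers above are well below the expected record count, which might mean the Bowtie2 output was cut short, but that is only a guess on my part.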
Does anybody have an idea of what's going on? Thanks a lot, Philip
This solved the problem. Thanks!
Thanks for the reply. I'll try that!