BAM file header after splitting paired end reads
8.2 years ago
ilyco ▴ 60

Hi,

I am processing BAM files with pysam as part of the HiFive tool set for analysing Hi-C data. HiFive requires two BAM files as input, one containing the forward reads and the other the reverse reads from an alignment of paired-end reads.

In order to split the alignment BAM file into forward and reverse reads, I used:

samtools view -f 0x40 file_chr1.bam > file_chr1_1.bam
samtools view -f 0x80 file_chr1.bam > file_chr1_2.bam

I then discovered that the new files have no headers, so I added the -h option to the commands:

samtools view -h -f 0x40 file_chr1.bam > file_chr1_1.bam
samtools view -h -f 0x80 file_chr1.bam > file_chr1_2.bam
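
For reference, the -f values in these commands select reads by SAM FLAG bit: 0x40 marks the first read of a pair and 0x80 the second. The following is a minimal Python sketch of that bit test. The constant and function names are hypothetical and are not part of samtools or pysam.

```python
# SAM FLAG bits used in the commands above (values from the SAM specification).
FLAG_READ1 = 0x40  # first segment in the template (forward read file)
FLAG_READ2 = 0x80  # last segment in the template (reverse read file)


def passes_f_filter(flag, required):
    """Mimic `samtools view -f`: keep a read only if every bit in `required` is set."""
    return (flag & required) == required


# Example FLAG values: 99 is a typical first-in-pair read, 147 its mate.
for flag in (99, 147):
    if passes_f_filter(flag, FLAG_READ1):
        print(flag, "goes to file_chr1_1.bam")
    elif passes_f_filter(flag, FLAG_READ2):
        print(flag, "goes to file_chr1_2.bam")
```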

Now I am getting the following error:

for i in range(len(input.header['SQ'])):
  File "pysam/calignmentfile.pyx", line 1486, in pysam.calignmentfile.AlignmentFile.header.__get__ (pysam/calignmentfile.c:16469)
ValueError: malformatted header: no ':' in field

Any ideas on how to fix this?

Thank you very much

paired-end hifive SAM samtools BAM • 2.9k views

Can you show me the header of the BAM file?

samtools view -H bamFile.bam
8.2 years ago

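You have created SAM files, not BAM files: samtools view writes SAM text by default, whatever extension you give the output file. Use -b to force binary (BAM) output.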
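
One way to confirm which format a file really is, independent of its extension, is to look at its first bytes. The sketch below uses only the Python standard library and relies on two facts from the SAM/BAM specification: a BAM file is BGZF (gzip-compatible) compressed and its decompressed content starts with the magic bytes "BAM\1", while a SAM file written with -h starts with an "@" header line. The function name is hypothetical.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # every BGZF/gzip file starts with these two bytes
BAM_MAGIC = b"BAM\x01"    # first four bytes of a BAM file after decompression


def sniff_alignment_format(path):
    """Return "BAM", "SAM" or "unknown" based on the first bytes of the file."""
    with open(path, "rb") as fh:
        head = fh.read(2)
    if head == GZIP_MAGIC:
        try:
            with gzip.open(path, "rb") as gz:
                if gz.read(4) == BAM_MAGIC:
                    return "BAM"
        except (OSError, EOFError, zlib.error):
            pass  # compressed, but not readable as BAM
        return "unknown"
    if head[:1] == b"@":
        return "SAM"  # text file beginning with a header line
    # A headerless SAM file starts with a read name, so it cannot be
    # told apart from arbitrary text here.
    return "unknown"
```

Calling `sniff_alignment_format("file_chr1_1.bam")` on a file produced without -b would be expected to return "SAM" if -h was used.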


Thank you very much. I used -b and made sure the output is a BAM file. Even so, pysam still would not read the file and reported "malformatted header".

I then used samtools reheader to remove the chromosome fragment entries (e.g. chr_Un) from the header.

HiFive still does not recognise the read pairs and reports "data is not valid", but at least pysam now reads the BAM files.
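
The thread does not establish which header line triggered the "no ':' in field" error. As a hedged sketch of how one might look for it: in the SAM specification every header line except @CO is a two-letter record type followed by tab-separated TAG:VALUE fields, so the text printed by samtools view -H can be scanned for fields that break that pattern. The function name below is hypothetical, and the check is deliberately looser than the full specification.

```python
import re

# Two-character tag, a colon, then a non-empty value.
_FIELD_RE = re.compile(r"^[A-Za-z][A-Za-z0-9]:.+$")
# Record type such as @HD, @SQ, @RG, @PG.
_RECORD_RE = re.compile(r"^@[A-Za-z][A-Za-z]$")


def find_malformed_header_fields(header_text):
    """Return (line_number, text) pairs for header parts that are not TAG:VALUE."""
    problems = []
    for line_no, line in enumerate(header_text.splitlines(), start=1):
        if not line:
            continue  # ignore blank lines in this sketch
        fields = line.split("\t")
        if fields[0] == "@CO":
            continue  # comment lines carry free text
        if not _RECORD_RE.match(fields[0]):
            # Catches non-header lines and headers using spaces instead of tabs.
            problems.append((line_no, fields[0]))
            continue
        for field in fields[1:]:
            if not _FIELD_RE.match(field):
                problems.append((line_no, field))
    return problems
```

An empty result means every field has the TAG:VALUE shape; it does not guarantee that pysam or HiFive will accept the header.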
