pysam error when reading .bam file ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format?
7 months ago
daewowo ▴ 20

Error:

f = pysam.AlignmentFile("SRA_sorted.bam","rb")
File "pysam/libcalignmentfile.pyx", line 991, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False


Process I followed to get to the error:

I downloaded an SRA dataset from NCBI and used sam-dump from the SRA Toolkit to convert it into a SAM file.

sam-dump --output-file SRA.sam SRA


I checked the file:

samtools quickcheck SRA.sam


I then checked with Picard's ValidateSamFile (run from the GATK jar):

java -jar gatk-package-4.1.9.0-local.jar ValidateSamFile I=SRA.sam MODE=SUMMARY
Error Type  Count


Looking at the sam file with head it looks OK

1   77  *   0   0   *   *   0   0   TACAGAA...


I used the following to convert to bam file:

samtools sort SRA.sam -o SRA_sorted.bam


I confirmed that the resulting file is in binary format.

I then used the .bam file in a third party program which uses pysam. The pysam command which threw the error:

f = pysam.AlignmentFile("SRA_sorted.bam","rb")
File "pysam/libcalignmentfile.pyx", line 991, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False


I ran Picard on the BAM file, which gave the same output as for the SAM file shown above.

How can I work out exactly what the error is with pysam opening the file and fix this?

pysam bam sam • 809 views

Looking at the sam file with head it looks OK

It's not. The header is missing: there are no @SQ lines, which is what pysam means by "no sequences defined".
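One way to check this without pysam is to count the header records at the top of the SAM text. The following is only a minimal sketch: the function name `sam_header_summary` and the two in-memory example records are made up for illustration (the read sequence in the question is truncated, so a short placeholder is used here).

```python
def sam_header_summary(lines):
    """Count header record types (@HD, @SQ, ...) at the top of SAM text."""
    counts = {}
    for line in lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        tag = line.split("\t", 1)[0]
        counts[tag] = counts.get(tag, 0) + 1
    return counts


# Hypothetical in-memory examples. For a real file, pass an open handle:
#     with open("SRA.sam") as fh:
#         print(sam_header_summary(fh))
headerless = ["1\t77\t*\t0\t0\t*\t*\t0\t0\tTACAGAA\tIIIIIII\n"]
aligned = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "@SQ\tSN:chr1\tLN:248956422\n",
    "r1\t0\tchr1\t100\t60\t7M\t*\t0\t0\tTACAGAA\tIIIIIII\n",
]

print(sam_header_summary(headerless))  # {}
print(sam_header_summary(aligned))     # {'@HD': 1, '@SQ': 1}
```

An empty result, or one without an `@SQ` entry, matches the situation in the question. As the error message itself suggests, pysam can still open such a file with check_sq=False, but tools that expect aligned reads will not get anything useful from it.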


Are you sure the file you dumped is a valid SAM file? There seem to be no headers, and the single line you posted as an example looks like an unaligned read. The read ID is also 1, which is odd. If the data originally submitted was FASTQ, you should align the reads yourself to get SAM/BAM files.
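The FLAG value in the second column of the posted line (77) can be decoded to confirm that the read is unaligned. This is a small sketch using the bit definitions from the SAM specification; the table name `SAM_FLAGS`, the short labels, and the helper `decode_flag` are illustrative, not part of any library.

```python
# FLAG bits as defined in the SAM specification; the labels are informal.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


print(decode_flag(77))
# ['paired', 'unmapped', 'mate_unmapped', 'first_in_pair']
```

Both the read and its mate are flagged as unmapped, which fits the `*` in the reference name column and the 0 in the position column.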


Thanks. I used bwa to index a reference genome and then aligned the reads to it.

Now the .sam file has a header (and now I know it needs one) :-)


So things are working now?


Yes, thank you