pysam alignment error
2
0
Entering edit mode
5.0 years ago
rthapa ▴ 90

Hi,

I am trying to read bam file using a command: samfile = pysam.AlignmentFile(os.path.join(path,bam),'rb') but I keep on getting an error:

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False

I checked my bam files and they look fine. All of my bam files have headers too. Could anyone provide any suggestion?

Thanks

alignment • 5.7k views
ADD COMMENT
0
Entering edit mode

I checked my bam files and they look fine.

How did you check and how did you conclude the files are fine?

ADD REPLY
0
Entering edit mode

I used samtools flagstat to see if the alignment is okay and I got the output. It didn't throw me any error,

25906192 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
20865552 + 0 mapped (80.54% : N/A)
25906192 + 0 paired in sequencing
12953096 + 0 read1
12953096 + 0 read2
20348218 + 0 properly paired (78.55% : N/A)
20501230 + 0 with itself and mate mapped
364322 + 0 singletons (1.41% : N/A)
152966 + 0 with mate mapped to a different chr
152966 + 0 with mate mapped to a different chr (mapQ>=5)
ADD REPLY
0
Entering edit mode

Hello,

can you use samtools to have a look into the the bam file?

$ samtools view -h input.bam

Or does this give you an error as well?

fin swimmer

ADD REPLY
0
Entering edit mode

When I used samtools view -h input.bam - I get the following output,

@SQ SN:super_3314   LN:1110
@SQ SN:super_3316   LN:1100
@SQ SN:super_3319   LN:1072
@SQ SN:super_3320   LN:1071
@SQ SN:super_3322   LN:1061
@SQ SN:super_3323   LN:1039
@SQ SN:super_3326   LN:1005
@SQ SN:Chr01    LN:80884392
@SQ SN:Chr02    LN:77742459
@SQ SN:Chr03    LN:74386277
@SQ SN:Chr04    LN:68658214
@SQ SN:Chr05    LN:71854669
@SQ SN:Chr06    LN:61277060
@SQ SN:Chr07    LN:65505356
@SQ SN:Chr08    LN:62686529
@SQ SN:Chr09    LN:59416394
ADD REPLY
1
Entering edit mode

Ok,

just another guess: Are you sure you have a bam and not a sam file? What's the output of head input.bam?

fin swimmmer

ADD REPLY
0
Entering edit mode

The output of head .bam looks like this,

$ head SRR5748777.bam 
?BC?З/hBAM?X@HD VN:1.0  SO:unsorted
@PG ID:GSNAP    PN:gsnap    VN:2018-03-25   CL:gsnap.sse42 -t 4 -m 0.02 -B 5 -n 1 -Q --nofails -d data -D /work/rthapa/GO/gsnap_indexes -A sam SRR5748777_1.fastq SRR5748777_2.fastq
@SQ SN:Chr10    LN:61233695
@SQ SN:super_16 LN:1391575
@SQ SN:super_18 LN:1172180
@SQ SN:super_20 LN:888508
@SQ SN:super_22 LN:639901
@SQ SN:super_25 LN:641756
@SQ SN:super_26 LN:650379
@SQ SN:super_27 LN:526259
ADD REPLY
1
Entering edit mode

This is not a bam file. It is a sam file.

pysam.AlignmentFile(os.path.join(path,bam),'rb')

Just remove the 'rb' part. Because this opens the file in binary mode, which you don't have.

ADD REPLY
0
Entering edit mode

I used samtools view -h input.sam > output.bam to convert the sam file to bam file. I do not know why it didn't convert to bam file. As you suggested I opened the file removing 'rb',

pysam.AlignmentFile(os.path.join(path,bam))

But still getting the following error for

samfile.count(mychr, start, stop)

Traceback (most recent call last):
  File "<stdin>", line 39, in <module>
  File "pysam/libcalignmentfile.pyx", line 1431, in pysam.libcalignmentfile.AlignmentFile.count
  File "pysam/libcalignmentfile.pyx", line 1112, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
ADD REPLY
2
Entering edit mode

The fetch() method doesn't allow random access. Convert to bam and coordinate sort your sam file:

$ samtools sort -o out.bam input.bam

samtools view -h input.sam > output.bam

The standard output by samtools is sam. So this command doesn't convert to bam. You have to use the parameter -Ob to get this. Or instead of redirecting to a file (> output.bam), use the -o parameter to define the output file. samtools recognizes the file extension, and will convert.

ADD REPLY
0
Entering edit mode

thanks a lot, it finally worked.

ADD REPLY
0
Entering edit mode

I am not sure if updating the newest version of pysam solved this issue. After updating the version, it is not throwing me the error. But I am getting other error while counting in the sam file, samfile.count(mychr, start, stop).

Traceback (most recent call last):
  File "<stdin>", line 39, in <module>
  File "pysam/libcalignmentfile.pyx", line 1431, in pysam.libcalignmentfile.AlignmentFile.count
  File "pysam/libcalignmentfile.pyx", line 1112, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
ADD REPLY
0
Entering edit mode
5.0 years ago
skbrimer ▴ 740

I have not used pysam before but have you tried the suggestion the code gives samfile = pysam.AlignmentFile(os.path.join(path,bam), check_sq=False, 'rb') ? also are you using Windows to process the file the rb call is for windows, linux/mac would be just r

ADD COMMENT
0
Entering edit mode
5.0 years ago
Jianyu ▴ 580

A similar error on github: https://github.com/pysam-developers/pysam/issues/442

Maybe updating the version of your pysam can fix the problem?

ADD COMMENT

Login before adding your answer.

Traffic: 902 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6