Using samtools to view an indexed file in a private S3 bucket
4.8 years ago
Brynjar ▴ 20

I'm trying to use samtools to view an indexed CRAM file which is stored on our private s3 bucket. I have the config and credential files

$ cat ~/.aws/config
[default]
s3=
    addressing_style=path
output=json
region=us-east-1

and

$ cat ~/.aws/credentials
[default]
aws_access_key_id=*****
aws_secret_access_key=*****

I can generate a presigned url with the boto3 python library and the following works:

$ url='http://s3.[endpoint]/brynjar-test/sample.cram?AWSAccessKeyId=****************&Signature=************&Expires=1561026577'
$ samtools view "$url" | less -S
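
For context, the AWSAccessKeyId/Signature/Expires parameters in that URL are the legacy query-string (Signature Version 2) form. boto3 builds the URL for me; the following is only a standard-library sketch of how such a URL is put together, assuming path-style addressing and a plain GET. The function name, endpoint, access key and secret below are placeholders for illustration, not my real values.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote, urlencode


def presign_get_v2(endpoint, bucket, key, access_key, secret_key, expires_at):
    """Build a path-style GET URL with SigV2 query-string authentication.

    Hypothetical helper for illustration; boto3's generate_presigned_url
    is what I actually use.
    """
    # Path-style resource: /bucket/key ("/" inside the key is left unescaped).
    resource = f"/{bucket}/{quote(key)}"
    # String to sign for a GET with no Content-MD5 and no Content-Type:
    # verb, two empty header slots, expiry (epoch seconds), resource.
    string_to_sign = f"GET\n\n\n{expires_at}\n{resource}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    # urlencode percent-escapes "+", "/" and "=" in the base64 signature.
    query = urlencode(
        {"AWSAccessKeyId": access_key, "Signature": signature, "Expires": expires_at}
    )
    return f"{endpoint}{resource}?{query}"


if __name__ == "__main__":
    # Placeholder endpoint and credentials; the URL is valid for one hour.
    print(
        presign_get_v2(
            "http://s3.example.org",
            "brynjar-test",
            "sample.cram",
            "AKIDEXAMPLE",
            "not-a-real-secret",
            int(time.time()) + 3600,
        )
    )
```

The signature covers only one object path, so a URL signed for sample.cram says nothing about sample.cram.crai.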

However, I cannot use the presigned URL to view a specific region. (Could this be done if samtools allowed the index file to be specified explicitly, instead of always appending .crai to the input file name and looking for the index there?)

I also tried the following:

$ samtools view http://s3.[endpoint]/brynjar-test/sample.cram 
[E::hts_open_format] Failed to open file http://s3.[endpoint]/brynjar-test/sample.cram
samtools view: failed to open "http://s3.[endpoint]/brynjar-test/sample.cram" for reading: Permission denied

If the above worked, I assume I could specify a region, given that the .crai file is stored in the same bucket, although I can't verify this.

I tried setting the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, but that does not help. I have read through "A: Tool for random access to indexed BAM files in S3?", but that post uses a public S3 bucket.

Any idea how I can solve this?

samtools amazon-s3 • 3.1k views

You're getting a permission denied error - have you confirmed that your ID and key have the appropriate privileges for the bucket and/or whether there are any other restrictions placed on the object?

Can you see it if you use the AWS CLI (e.g., aws s3 ls ...)? Do you need to change the region for the call?


Also, make sure you keep the AWSAccessKeyId, Signature, and Expires parameters after the '?' in your samtools view command; otherwise the presigned URL will not be accessible.


The presigned URL works fine apart from not recognizing any index file.

I just noticed that samtools has added an option for specifying an index file, but it has not been officially released yet. I think I'll try cloning the branch (Feat/support passing index files).
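
As far as I can tell, this corresponds to the -X option of samtools view, where the index is passed as the argument right after the data file. Below is a rough sketch of how I would call it from Python once I have a build that includes the option. The helper names and the region string are made up for illustration, and since a presigned URL is signed per object, I am assuming the .crai needs its own presigned URL.

```python
import subprocess


def view_region_argv(data_url, index_url, region):
    """Return the argument list for `samtools view -X <data> <index> <region>`.

    Hypothetical helper; -X tells samtools view that the argument following
    the data file is its index rather than a region.
    """
    return ["samtools", "view", "-X", data_url, index_url, region]


def view_region(data_url, index_url, region):
    """Run samtools and return its stdout (SAM records) as text."""
    argv = view_region_argv(data_url, index_url, region)
    # Passing a list (no shell) means the '?' and '&' in presigned URLs
    # are handed to samtools untouched.
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

Something like view_region(cram_url, crai_url, "chr1:100000-200000") would then be the call, with both URLs presigned separately.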


Thanks for the reply.

I will have to check these privileges more carefully. I thought the object had an appropriate ACL for the credentials I am using, but perhaps not.

I can't use the AWS CLI as our research network is offline and there is probably not a mirror for it. I will have to ask our IT department.

$ aws s3 ls s3://1000genomes/
Could not connect to the endpoint URL: "https://1000genomes.s3.amazonaws.com/?list-type=2&prefix=&delimiter=%2F&encoding-type=url"