Question: Using samtools to view an indexed file in a private S3 bucket
4 weeks ago by Brynjar (deCODE Genetics, Iceland):

I'm trying to use samtools to view an indexed CRAM file which is stored on our private s3 bucket. I have the config and credential files

$ cat ~/.aws/config
[default]
s3=
    addressing_style=path
output=json
region=us-east-1

and

$ cat ~/.aws/credentials
[default]
aws_access_key_id=*****
aws_secret_access_key=*****

I can generate a presigned url with the boto3 python library and the following works:

$ url='http://s3.[endpoint]/brynjar-test/sample.cram?AWSAccessKeyId=****************&Signature=************&Expires=1561026577'
$ samtools view $url | less -S

However, I cannot use the presigned URL to view a specific region. (Could this be done if samtools allowed specifying an index file, instead of always appending .crai to the input file name and looking for the index there?)
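
For reference, the presigning step can be sketched roughly as below. A signature covers one object path, so a URL signed for sample.cram should not be expected to validate for sample.cram.crai; the sketch therefore signs both objects separately. The helper names, the one-hour expiry, and the assumption that the index sits next to the data file as `<key>.crai` are illustrative choices, not something taken from samtools or boto3.

```python
def make_s3_client(endpoint_url):
    """Create a boto3 S3 client; credentials are read from ~/.aws/credentials."""
    import boto3  # imported lazily so the helper below works with any compatible client

    return boto3.client("s3", endpoint_url=endpoint_url)


def presign_cram_and_index(client, bucket, key, expires_in=3600):
    """Return (data_url, index_url) for a CRAM file and its .crai index.

    Assumes the index is stored as '<key>.crai' in the same bucket.
    """
    urls = []
    for object_key in (key, key + ".crai"):
        urls.append(
            client.generate_presigned_url(
                "get_object",
                Params={"Bucket": bucket, "Key": object_key},
                ExpiresIn=expires_in,  # lifetime of the URL in seconds
            )
        )
    return tuple(urls)


# Example (the endpoint is a placeholder for your own S3-compatible host):
# client = make_s3_client("http://s3.example.org")
# data_url, index_url = presign_cram_and_index(client, "brynjar-test", "sample.cram")
```

Having two separately signed URLs only helps if samtools can be told where the index is, which is what the question in parentheses above is about.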

I also tried the following:

$ samtools view http://s3.[endpoint]/brynjar-test/sample.cram 
[E::hts_open_format] Failed to open file http://s3.[endpoint]/brynjar-test/sample.cram
samtools view: failed to open "http://s3.[endpoint]/brynjar-test/sample.cram" for reading: Permission denied

If the above worked, I assume I could also specify a region, given that the .crai file is stored in the same bucket, although I can't verify this.

I tried setting the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, but that does not help. I have read through the answer to "Tool for random access to indexed BAM files in S3?", but that post uses a public S3 bucket.
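
One thing that may explain the failure: as far as I know, htslib only signs requests with the AWS credentials when the file is addressed with an s3:// URL (and htslib was built with S3 support). A plain http:// URL is fetched without any authentication, so a private object returns "Permission denied" whatever the environment variables say. Below is a minimal sketch of building such an invocation. The helper name is made up, and the `HTS_S3_HOST` variable for a non-AWS endpoint is an assumption to check against the S3 plugin documentation of your htslib version.

```python
import os


def htslib_s3_invocation(bucket, key, genomic_region=None, host=None):
    """Return (argv, env) for running 'samtools view' on s3://bucket/key."""
    argv = ["samtools", "view", "s3://%s/%s" % (bucket, key)]
    if genomic_region:
        argv.append(genomic_region)  # e.g. "chr1:1000-2000"; needs the .crai index

    env = dict(os.environ)  # keeps AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY if set
    if host:
        # Assumption: this htslib build honours HTS_S3_HOST for custom endpoints.
        env["HTS_S3_HOST"] = host
    return argv, env


# Example (the host is a placeholder):
# import subprocess
# argv, env = htslib_s3_invocation("brynjar-test", "sample.cram",
#                                  "chr1:1000-2000", host="s3.example.org")
# subprocess.run(argv, env=env, check=True)
```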

Any idea how I can solve this?

Tags: amazon-s3, samtools

You're getting a permission denied error - have you confirmed that your ID and key have the appropriate privileges for the bucket and/or whether there are any other restrictions placed on the object?

Can you see it if you use the AWS CLI (e.g., aws s3 ls ...)? Do you need to change the region for the call?
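
If the AWS CLI is not at hand, the same check can be sketched with boto3, which the question already uses. The function below issues a HEAD request and reports the error code on failure. The function name is hypothetical, and the broad `except` is only there to keep the sketch short.

```python
def check_object_access(client, bucket, key):
    """Return (ok, detail) after a HEAD request on the object bucket/key."""
    try:
        client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # botocore's ClientError carries the parsed error in exc.response;
        # for HEAD requests the code is typically the HTTP status ("403", "404").
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code", type(exc).__name__)
        return False, code
    return True, "ok"


# Example: check both the data file and its index.
# import boto3
# client = boto3.client("s3", endpoint_url="http://s3.example.org")  # placeholder
# for key in ("sample.cram", "sample.cram.crai"):
#     print(key, check_object_access(client, "brynjar-test", key))
```

A "403" here would point at the ACL or bucket policy, while a "404" on the .crai would mean the index is simply not where samtools expects it.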

Reply written 4 weeks ago by Brice Sarver

Also, make sure you keep the query parameters after the '?' (AWSAccessKeyId, Signature, Expires) in your samtools view command; otherwise the presigned URL will not be accessible.

Reply written 4 weeks ago by Roman Valls Guimerà

The presigned URL works fine apart from not recognizing any index file.

I just noticed that samtools has added an option for specifying an index file, but it has not been officially released yet. I think I'll try cloning the repository to get it (Feat/support passing index files).
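
With a build that has that option, the call could be assembled along these lines. I believe the option ended up as the `-X` flag of `samtools view`, where the index is given as the argument right after the data file, but treat the flag as an assumption and check `samtools view` usage on your build. The helper names are made up for illustration.

```python
import shlex


def samtools_view_with_index(data_url, index_url, genomic_region):
    """Build the argv for 'samtools view' with an explicitly named index file."""
    # Assumption: -X makes samtools read the argument after the data file
    # as the index file, followed by the region(s).
    return ["samtools", "view", "-X", data_url, index_url, genomic_region]


def as_shell_line(argv):
    """Quote the argv for pasting into a shell ('?' and '&' in URLs need quoting)."""
    return " ".join(shlex.quote(arg) for arg in argv)


# Example with two presigned URLs (values elided):
# argv = samtools_view_with_index(data_url, index_url, "chr1:1000-2000")
# print(as_shell_line(argv))
```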

Reply written 4 weeks ago by Brynjar

Thanks for the reply.

I will have to look more closely at these privileges. I thought the object had an appropriate ACL for the credentials I am using, but perhaps not.

I can't use the AWS CLI as our research network is offline and there is probably not a mirror for it. I will have to ask our IT department.

$ aws s3 ls s3://1000genomes/
Could not connect to the endpoint URL: "https://1000genomes.s3.amazonaws.com/?list-type=2&prefix=&delimiter=%2F&encoding-type=url"

Reply written 4 weeks ago by Brynjar