Question: Trying To Pull Regions From An Online Bam File Without Downloading
7.5 years ago by
Chapel Hill, NC
Wjeck480 wrote:

I am trying to pull regions from a bunch of BAM files on an online server. I'd like to pull the reads mapping to a certain 1kb or so chunk and download them for analysis. They are far too massive to download them all, and it's impractical even to wget them one at a time and pull the regions out using samtools (tried it, and it worked, but it took forever). Since I'll have to do this for a number of regions that I won't know in advance, I need a better way.

I noticed that samtools is capable of running 'samtools view' off of a web address. Sadly, this data is protected behind an https server, which samtools doesn't know how to handle. I notice that IGV is able to read the BAM files off the net by asking for my login and querying only the specific regions that I bring up to view, but I don't have a way of automating that process across hundreds of files.

Does anyone have any ideas of how to run something like samtools view on specific regions over an https connection?

bam samtools igv • 3.0k views
modified 9 months ago by Biostar ♦♦ 20 • written 7.5 years ago by Wjeck480

Did you try putting your password in the URL? e.g.: ""

written 7.5 years ago by Pierre Lindenbaum125k

This does work for me in my case, thanks Pierre

written 2.7 years ago by cmdcolin1.3k

Doesn't seem to work for me. I am not sure that samtools recognizes https as a web address. The response I get is:

open: No such file or directory
[main_samview] fail to open "https://uname:*******@website/my.bam" for reading.

(with my website, password etc, of course)

modified 4 weeks ago by RamRS25k • written 7.5 years ago by Wjeck480
7.5 years ago by
Jorge Amigo11k
Santiago de Compostela, Spain
Jorge Amigo11k wrote:

samtools and other direct-BAM-access programs like IGV can open local files as well as remote, publicly served files. The supported remote protocols include ftp and http, but unfortunately not encrypted transfer protocols such as scp, ssh, or https. It is not a matter of authentication, which is solved for http and ftp, but of handling data encryption, which is far more complicated. The only ways to work with all the BAM files you are interested in are either asking the server managers to serve them over http, or downloading them all and working with them locally.

modified 7.5 years ago • written 7.5 years ago by Jorge Amigo11k

Just a note, contrary to this post IGV does seem to be able to handle https and the associated encryption, but samtools does not.

written 7.5 years ago by Wjeck480

good to know that. thanks for the information.

written 7.4 years ago by Jorge Amigo11k
5.8 years ago by
Glasgow, UK
umer.zeeshan.ijaz1.8k wrote:

This can possibly be done through curl. In the following command, the --negotiate option enables SPNEGO in curl. The -u option is required, but the user name is ignored. The -b and -c options are used to send and store HTTP cookies, respectively. The -s option silences curl's status output. (Try typing the command as it is, since the BAM file exists.)

curl --negotiate -u : -b ~/cookienumnumnum.txt -c ~/cookienumnumnum.txt -s | samtools view -h - | head

will give

@SQ    SN:chr17    LN:78774742
chr17_15_201_1:0:0_1:0:0_209a6    163    chr17    15    60    50M    =    152    187    GTTCCTGCATAGATAATTGCATGACAATTGCCTTGTCCCTCCTGAATGTG    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:23    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:40G9
chr17_45_242_0:0:0_1:0:0_50f5f    99    chr17    45    60    50M    =    193    198    CCTTGTCCCTGCTGAATGTGCTCTGGGGTCTCTGGGGTCTCACCCACGAC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:0    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:0    XO:i:0    XG:i:0    MD:Z:50
chr17_123_290_3:0:0_1:0:0_27b3b    163    chr17    123    60    50M    =    241    168    ATAACAAACATATGTCCAGCGAATACCTGCATCCCTAGAAGTGAAGCGAC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:3    SM:i:25    AM:i:25    X0:i:1    X1:i:0    XM:i:3    XO:i:0    XG:i:0    MD:Z:0T10C35C2
chr17_15_201_1:0:0_1:0:0_209a6    83    chr17    152    60    50M    =    15    -187    CATCCCTAGAAGTGAAGCCACCGCCCAAAGACACGCCCATATCCAGCTTA    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:23    AM:i:23    X0:i:1    X1:i:1    XM:i:1    XO:i:0    XG:i:0    MD:Z:40G9
chr17_164_380_1:0:0_0:0:0_aa5e4    99    chr17    164    60    50M    =    331    217    TGAAGCCACCGCCCAATGACACGCCCATGTCCAGCTTAACCTGCATCCCT    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:16A33
chr17_45_242_0:0:0_1:0:0_50f5f    147    chr17    193    60    50M    =    45    -198    TCCAGCTTAACCTGCATCCCTAGAAGGGAAGGCACCGCCCAAAGACACGC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:26T23
chr17_204_401_0:0:0_0:0:0_43ea1    99    chr17    204    60    50M    =    352    198    CTGCATCCCTAGAAGTGAAGGCACCGCCCAAAGACACGCCCATGTCCAGC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:0    SM:i:23    AM:i:23    X0:i:1    X1:i:1    XM:i:0    XO:i:0    XG:i:0    MD:Z:50
chr17_224_415_1:0:0_1:0:0_a2d53    163    chr17    224    60    50M    =    366    192    GCACCGCCCAAAGACACGCCCATGTCCAGCTTATTCTCCCCAGTTCCTCT    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:37G12
chr17_123_290_3:0:0_1:0:0_27b3b    83    chr17    241    60    50M    =    123    -168    GCCCATGTCCAGCTTATTCTGCCCAGTTCCTCTCCAGATAGGCTGCATGG    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:25    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:38A11
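
Since the BAM arrives on stdin as a sequential stream, samtools cannot use an index to jump to a region, so a region of interest would have to be picked out of the SAM text afterwards. A minimal sketch of that filtering step follows, assuming tab-separated SAM records as samtools emits them. The function name filter_sam_region and the BAM_URL variable are hypothetical, introduced only for illustration, and the filter looks only at each read's start position.

```shell
# Keep SAM header lines plus reads whose leftmost mapped position
# (column 4, POS) lies within chrom:start-end. Reads SAM text on stdin
# and writes SAM text on stdout.
# Simplification: a read that starts before `start` but still overlaps
# the region is dropped, unlike an indexed region query on a local BAM.
filter_sam_region() {
    awk -F '\t' -v chrom="$1" -v start="$2" -v end="$3" '
        /^@/ { print; next }    # pass header lines through unchanged
        $3 == chrom && $4 + 0 >= start + 0 && $4 + 0 <= end + 0 { print }
    '
}

# Hypothetical usage; BAM_URL stands in for the address of the remote file:
# curl --negotiate -u : -s "$BAM_URL" | samtools view -h - | filter_sam_region chr17 100 250
```

Note that this still transfers the file from its beginning up to wherever the pipeline stops reading, so it may help with automation more than with bandwidth.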

Best Wishes,

modified 4 weeks ago by RamRS25k • written 5.8 years ago by umer.zeeshan.ijaz1.8k