Question: Downloading BAM files GEO/SRA
ilobelo10 wrote:

Hey, I need to download BAM files of breast cancer cell lines from GEO/SRA. For example I will use SRR925780.

I tried to do it in 2 ways:

  1. SRA run browser: http://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR925780. Here I need to download a separate file for each chromosome, but the download is very fast (4 GB in about 10 minutes) and the output is already a BAM file, so no other tool is needed.

  2. SRA toolkit: following their manual, I ran this command:

    sam-dump SRR925780 | samtools view -bS - > SRR925780.bam

It takes about 3 hours to download and convert 100 MB! The time difference is too big, so I am wondering what I am doing wrong with the SRA toolkit and samtools.
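One way to narrow this down is to time the download and the conversion separately. The following is a minimal sketch, not a tested recipe: the `timed` helper is a hypothetical name, and it assumes the SRA toolkit's `prefetch` is used to fetch the run first so that `sam-dump` can then read the local copy instead of streaming over the network.

```shell
# Run a command and report how long it took (whole seconds) on stderr.
# "timed" is an illustrative helper, not part of the SRA toolkit or samtools.
timed() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "$1 took $((end - start)) s (exit $status)" >&2
    return "$status"
}

# Intended usage (requires the SRA toolkit and samtools on PATH):
#   timed prefetch SRR925780
#   timed sh -c 'sam-dump SRR925780 | samtools view -b -o SRR925780.bam -'
# Older samtools releases also need -S to read SAM input, as in the command above.
```

If the first step accounts for most of the time, the bottleneck is the transfer; if the second does, it is the SAM-to-BAM conversion.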

BTW, I work with the latest SRA toolkit, but my samtools version is old; it's the only one I found that works on Windows: https://bow.codeplex.com/releases

So my questions are:

  1. Could it be that the fastest way to download BAM files is manually, via the SRA run browser?
  2. Is there a way to run a newer version of samtools on Windows?

Thanks!

sratoolkit samtools bam sra geo • 3.5k views
written 2.9 years ago by ilobelo10

You may be better off downloading the FASTQ files and doing the alignments yourself. EBI-ENA makes the FASTQ files available directly, without having to use the SRA toolkit (e.g. http://www.ebi.ac.uk/ena/data/view/SRR925780 ).

That said, if you are restricted to using Windows, then all bets are off.
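As a rough sketch of the FASTQ route: ENA also exposes a Portal API whose `filereport` endpoint lists the FASTQ download locations for a run. The helper name `ena_filereport_url` and the choice of fields are illustrative assumptions, not something from this thread, and the URL format should be checked against the current ENA documentation before relying on it.

```shell
# Build the ENA Portal API "filereport" URL for a run accession.
# ena_filereport_url is a hypothetical helper name; the fields requested
# (fastq_ftp, fastq_md5) are one reasonable choice, not the only one.
ena_filereport_url() {
    printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=fastq_ftp,fastq_md5\n' "$1"
}

# Intended usage (requires network access and curl):
#   curl -s "$(ena_filereport_url SRR925780)"
# The response is a tab-separated table; the fastq_ftp column holds the
# file locations, which can then be downloaded with curl or wget.
```

Because this needs only an HTTP client, it avoids the SRA toolkit and samtools for the download step; the alignment itself still has to be run afterwards.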

written 2.9 years ago by genomax68k

Some SRA runs are based on custom reference sequences. Is it possible to retrieve the reference FASTAs from SRA and align the reads to them to create BAMs? Otherwise you would need to retrieve the BAMs directly, right?

modified 16 days ago • written 16 days ago by pnguyen0