Question: Unsynchronized simultaneous reads on a single .bam file
2.4 years ago, kennethlim206 wrote:

Hello,

I am running samtools mpileup several times at once on the same .bam file. Could reading one .bam file from multiple processes simultaneously cause problems or corrupt my output? For example, suppose I ran these commands at the same time:

samtools mpileup -v -u --region chr1:1-1000 --output region1_mpileup_output.txt my_reads.bam
samtools mpileup -v -u --region chr2:1-1000 --output region2_mpileup_output.txt my_reads.bam
samtools mpileup -v -u --region chr3:1-1000 --output region3_mpileup_output.txt my_reads.bam

  1. I know I could list the specific regions in a .bed file and pass that to mpileup, but for my purposes it is most convenient to have the results for each region written to a separate output file.
  2. I know I could run the commands sequentially, but I would prefer to run them simultaneously to save time.

I am planning on scaling up and simultaneously reading a .bam file a few hundred times with different regions specified. Will this become a problem once I scale up?

Thanks!

Tags: mpileup, samtools, bam

Consider using samtools mpileup on the whole chromosome, then use BEDTools intersect to get the overlap between your vcf and a bedfile of locations.

(comment written 2.4 years ago by swbarnes2)
2.4 years ago, Brian Bushnell (Walnut Creek, USA) wrote:

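Simultaneous reads don't cause problems; concurrency problems only occur when there is a mix of reads and writes on the same file. 100% reads or 100% writes are fine (100% writes are fine because the data is never read, so the final state cannot be observed). In your case each command writes to its own output file, so the only shared file, the .bam, is only ever read.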
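
If you want to convince yourself, here is a minimal sketch (not specific to samtools): it starts several read-only processes on one file at the same time and checks that they all saw the same bytes. The file `demo.bin` is a throwaway stand-in for `my_reads.bam`, and the reader count of 8 is arbitrary.

```shell
#!/bin/sh
# Illustration only: several processes read one file at once and each
# one sees identical content. demo.bin is a stand-in for my_reads.bam.
tmpdir=$(mktemp -d)
head -c 1000000 /dev/urandom > "$tmpdir/demo.bin"

# Launch 8 read-only jobs in the background, one checksum file each.
for i in 1 2 3 4 5 6 7 8; do
    cksum < "$tmpdir/demo.bin" > "$tmpdir/sum_$i.txt" &
done
wait    # block until every background reader has finished

# Count distinct checksums; 1 means every reader saw the same content.
distinct=$(cat "$tmpdir"/sum_*.txt | sort -u | wc -l | tr -d ' ')
echo "distinct checksums: $distinct"

rm -rf "$tmpdir"
```

Each background job opens the file independently with its own file offset, which is why the readers do not interfere with one another.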

Some filesystems may perform better or worse with many processes reading the same file at once, so watch whether throughput drops sharply as you add more simultaneous jobs.
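
One way to keep that under control when scaling to a few hundred regions is to cap how many jobs run at once. The sketch below is only an assumption about how you might organize it: `regions.txt` (one region per line), the cap `JOBS=4`, and the output naming scheme are all hypothetical, and the mpileup flags are copied from the question. As written it only prints the commands (a dry run); the commented line at the end shows how they could be fed to `xargs -P`, an option available in GNU and BSD xargs.

```shell
#!/bin/sh
# Hypothetical settings, not from the original post.
BAM=my_reads.bam
JOBS=4   # assumed cap on simultaneous readers; tune for your filesystem

# Turn a region such as chr1:1-1000 into one mpileup command line.
# The output name replaces ':' with '_' so it is a safe file name.
make_cmd() {
    region=$1
    out="$(printf '%s' "$region" | tr ':' '_')_mpileup_output.txt"
    printf 'samtools mpileup -v -u --region %s --output %s %s\n' \
        "$region" "$out" "$BAM"
}

# Read regions from stdin and print one command per non-empty line.
print_cmds() {
    while IFS= read -r region; do
        if [ -n "$region" ]; then
            make_cmd "$region"
        fi
    done
}

# Dry run with the three regions from the question.
printf 'chr1:1-1000\nchr2:1-1000\nchr3:1-1000\n' | print_cmds

# To actually run them, at most $JOBS at a time:
#   print_cmds < regions.txt | xargs -P "$JOBS" -I{} sh -c '{}'
```

Starting with a small cap and raising it while timing the runs gives a direct read on where the filesystem stops keeping up.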

As swbarnes2 notes, it's more efficient to read the full file once.
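
A rough sketch of that single-pass approach, under some assumptions: mpileup runs once over the whole file, then `bedtools intersect` pulls out one VCF per region, so you still end up with separate output files. The function names and the file names `all_regions.vcf` and `one_region.bed` are made up for illustration, and the mpileup flags are copied from the question. Note that BED coordinates are 0-based and half-open, so `chr1:1-1000` becomes `chr1 0 1000`.

```shell
#!/bin/sh
# Convert a samtools-style region (1-based, inclusive), e.g. chr1:1-1000,
# into a tab-separated BED line (0-based start, exclusive end).
region_to_bed() {
    chrom=${1%%:*}      # text before the first ':'
    range=${1#*:}       # text after the first ':'
    start=${range%-*}   # number before the '-'
    end=${range#*-}     # number after the '-'
    printf '%s\t%d\t%d\n' "$chrom" "$((start - 1))" "$end"
}

# Hypothetical helper: read the BAM once, then split the result by region.
# Usage: single_pass my_reads.bam chr1:1-1000 chr2:1-1000 chr3:1-1000
single_pass() {
    bam=$1
    shift
    # One pass over the whole file (flags as in the question).
    samtools mpileup -v -u --output all_regions.vcf "$bam"
    for r in "$@"; do
        region_to_bed "$r" > one_region.bed
        # Keep the VCF header plus the records overlapping this region.
        bedtools intersect -header -a all_regions.vcf -b one_region.bed \
            > "$(printf '%s' "$r" | tr ':' '_').vcf"
    done
    rm -f one_region.bed
}

# Show the coordinate conversion for one region from the question.
region_to_bed chr1:1-1000
```

Whether this beats a few hundred parallel region queries depends on how much of the file the regions cover; for a handful of small regions on an indexed .bam, the per-region commands may still be faster.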
