Question: small .bam files to test GATK Variant Discovery pipeline
Asked 3.1 years ago by alongalor:

Could someone point me to where I can download the smallest possible .bam files for testing my GATK Best Practices Variant Discovery pipeline? My pipeline uses the -L option to parallelize over chromosomes. I would like to test this functionality, so I need a complete .bam file that has data from all chromosomes and will not cause GATK to crash.

Thanks a lot!


Just use any BAM that you already have on disk, make a small BED file with one interval per chromosome (e.g. positions 6000000-6100000 on each of chr1 to chr22), and use samtools view to extract that subset of the whole BAM:

samtools view -bh -L regions.bed -o out_subset.bam input.bam

That should be sufficient for testing purposes.
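
For illustration, the BED file described above could be generated with a short loop rather than typed by hand. This is only a sketch under a few assumptions: the BAM header uses "chr"-prefixed contig names (drop the prefix if yours does not), and input.bam, regions.bed and out_subset.bam are the placeholder file names from the command above. GATK generally expects an index next to a BAM when intervals are used, so the sketch also indexes the subset; that step is an addition here, not part of the original suggestion.

```shell
#!/bin/sh
# Write one 100 kb interval per autosome (chr1..chr22) to regions.bed.
# BED columns are tab-separated: contig, start, end.
# Assumption: contigs are named chr1..chr22 in the BAM header.
: > regions.bed
i=1
while [ "$i" -le 22 ]; do
    printf 'chr%s\t6000000\t6100000\n' "$i" >> regions.bed
    i=$((i + 1))
done

# Subset and index only when samtools and the placeholder input.bam exist,
# so the BED-generation part can be run on its own.
if command -v samtools >/dev/null 2>&1 && [ -f input.bam ]; then
    samtools view -b -L regions.bed -o out_subset.bam input.bam
    samtools index out_subset.bam
fi
```

Checking that the contig names in regions.bed match those printed by samtools view -H input.bam is a cheap sanity test before handing the subset to GATK.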

Reply written 3.1 years ago by ATpoint (modified 3.1 years ago)

This worked perfectly for subsetting the BAM file, but GATK then crashed on the subset with a strange error. My pipeline of course still works with the original BAM file from before the subsetting.

Reply written 3.1 years ago by alongalor