Can you predict the size of a BAM file based on the FASTQ?
3 months ago
James Reeve ▴ 130

I'm applying for computing resources on a cluster, and the application asks for an estimate of storage space. So far I only have compressed FASTQ files, which makes it hard to judge how much total space I will need. Is there a rule of thumb for estimating the total size of the BAM/SAM files from the size of the FASTQ files?

storage mapping bam • 222 views

Only within a ballpark. It will depend to a large extent on the number of secondary alignments and on whether you keep unmapped reads in the BAM file.


I guess it would depend on a lot of factors. For my purposes I won't be keeping either secondary alignments or unmapped reads. Hopefully, this helps a bit in coming up with a rough estimate of storage.


'Machine learning' could sample reads from a FASTQ file and then predict the size of the aligned BAM file, for sure.
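A simpler version of the sampling idea needs no machine learning: align a small subsample of reads, measure the size of the resulting BAM, and scale it up linearly to the full read count. The sketch below is only that arithmetic, not anything proposed verbatim in this thread. The function names `count_fastq_reads` and `estimate_bam_bytes` and the `safety_factor` parameter are hypothetical, and the linear scaling is an assumption that ignores the fixed header size and any change in compression ratio between the sample and the full data.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly 4 lines per record.

    Files ending in ".gz" are read through gzip; anything else as plain text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def estimate_bam_bytes(sample_bam_bytes, sample_reads, total_reads,
                       safety_factor=1.2):
    """Linearly extrapolate BAM size from an aligned subsample.

    sample_bam_bytes: size on disk of the BAM made from the subsample
    sample_reads:     number of reads that went into the subsample
    total_reads:      number of reads in the full FASTQ
    safety_factor:    hypothetical margin (an assumption, not a standard value)
    """
    if sample_reads <= 0:
        raise ValueError("sample_reads must be positive")
    bytes_per_read = sample_bam_bytes / sample_reads
    return round(bytes_per_read * total_reads * safety_factor)
```

For the estimate to carry over, the subsample BAM should be produced the same way as the full one (same aligner settings, same sorting, same filtering of unmapped reads and secondary alignments), since those are the factors named above as driving the size.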
