Question: Storing Fastq As Unaligned Bam
6.9 years ago by
United States
Abhi1.5k wrote:

Hey Guys

Just wondering if anyone out there is now storing raw fastq data as unaligned bam files. We are reaching a stage where any space we can save would be beneficial.

Are there any cons to storing fastq as bam? I have seen some discussion about this on SEQanswers.

Also, are there any tools that people already use to convert fastq to bam and vice versa? I know there are a few that can do bam to fastq, like Picard, but I am not sure whether fastq to bam is available.

Thanks! -Abhi

fastq bam • 6.2k views
This doesn't read like a StackExchange question to me. If you want a discussion, why not continue on SEQanswers?

This seems relevant to me: handling large NGS data files is a growing bioinformatics issue.

6.9 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum112k wrote:

I think a bgzipped fastq file will always be smaller than a BAM file, as the BAM file also contains the positions of the alignments.
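One way to check a size claim like this on your own data is simply to measure it. The snippet below is a minimal sketch, not a benchmark: it uses Python's standard gzip module as a stand-in for bgzip (bgzip writes gzip-compatible blocks, so its output is usually slightly larger), and the helper names and the synthetic reads are made up for illustration. Random reads compress worse than real ones, so only the method carries over, not the numbers.

```python
import gzip
import random


def synthetic_fastq(n_reads=1000, read_len=100, seed=0):
    """Return a small made-up FASTQ file as bytes (illustration only)."""
    rng = random.Random(seed)
    records = []
    for i in range(n_reads):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        # Phred+33 qualities drawn uniformly from Q20..Q40
        qual = "".join(chr(33 + rng.randint(20, 40)) for _ in range(read_len))
        records.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(records).encode("ascii")


def gzip_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


if __name__ == "__main__":
    data = synthetic_fastq()
    print(f"plain: {len(data)} bytes, gzip ratio: {gzip_ratio(data):.2f}")
```

For real files, comparing the on-disk sizes of the gzipped fastq and the unaligned BAM made from the same reads gives the answer directly.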

See also:

Compression of genomic sequences in FASTQ format


Efficient storage of high throughput sequencing data using reference-based compression

6.9 years ago by
toni2.1k wrote:

Hi Abhi,

Yes, we do this in our team. You can use Picard's 'FastqToSam' utility.

Compared to 2 fastq files (plain, not gzipped as suggested by Pierre), an unaligned BAM file saves 60% to 65% of storage space. It is also practical because you can store useful information (sample, library, run, any useful comments ...) in the header if you want to.
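For anyone who wants to script the round trip, here is a rough sketch of how the Picard calls could be assembled from Python. It assumes the classic KEY=VALUE argument style and a picard.jar in the working directory; the file names, sample name and helper function names are hypothetical, so check the options against your Picard version before relying on it.

```python
import shlex


def fastq_to_sam_cmd(fastq1, output_bam, sample_name, fastq2=None,
                     library=None, read_group=None, picard_jar="picard.jar"):
    """Build the argument list for Picard FastqToSam (fastq -> unaligned BAM)."""
    cmd = ["java", "-jar", picard_jar, "FastqToSam",
           f"FASTQ={fastq1}",
           f"OUTPUT={output_bam}",
           f"SAMPLE_NAME={sample_name}"]
    if fastq2 is not None:
        cmd.append(f"FASTQ2={fastq2}")          # second file of a read pair
    if library is not None:
        cmd.append(f"LIBRARY_NAME={library}")   # stored in the BAM header
    if read_group is not None:
        cmd.append(f"READ_GROUP_NAME={read_group}")
    return cmd


def sam_to_fastq_cmd(input_bam, fastq1, fastq2=None, picard_jar="picard.jar"):
    """Build the argument list for Picard SamToFastq (BAM -> fastq)."""
    cmd = ["java", "-jar", picard_jar, "SamToFastq",
           f"INPUT={input_bam}",
           f"FASTQ={fastq1}"]
    if fastq2 is not None:
        cmd.append(f"SECOND_END_FASTQ={fastq2}")
    return cmd


if __name__ == "__main__":
    # Hypothetical file and sample names, printed rather than executed.
    cmd = fastq_to_sam_cmd("run1_R1.fastq", "run1.unaligned.bam", "sampleA",
                           fastq2="run1_R2.fastq", library="libA")
    print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would run the conversion. The sample, library and read group values end up in the @RG line of the header, which is where the extra information mentioned above is kept.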

