Question: BAM to FASTQ compression ratio
win (India) wrote 2.7 years ago:

Hi all,

I downloaded the Denisova BAM file from here: ftp://cdna.eva.mpg.de/denisova/alignments/T_hg19_1000g.bam

This file is aligned to the 1000 Genomes reference genome, and I would like to realign it to GRCh38, so I ran the following utility for BAM to FASTQ conversion: http://gsl.hudsonalpha.org/information/software/bam2fastq

After conversion, the FASTQ file is about 180 GB, whereas the BAM file is 80 GB.

My question is: does this seem correct, and is there a way to estimate the size of the FASTQ from a BAM file?

If there is a faster converter, could you please share it? Thanks in advance.

bam fastq • 2.3k views
Devon Ryan (Freiburg, Germany) wrote 2.7 years ago:

BAM files are compressed and the FASTQ file that bam2fastq outputs isn't, so yes, that size difference is not unreasonable. Given the notice at the top of the bam2fastq page, I wouldn't be surprised if Picard is faster. It might also allow writing compressed files (no clue, I've never used its SamToFastq command).
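On estimating the uncompressed FASTQ size: a rough back-of-the-envelope sketch is to multiply the number of reads by the bytes per four-line FASTQ record. The function name and the example numbers below are hypothetical and are not measured from the Denisova BAM. A read count could come from something like `samtools view -c`, keeping in mind that secondary or supplementary alignments would inflate it.

```python
def estimate_fastq_bytes(n_reads, mean_read_len, mean_name_len):
    """Rough uncompressed FASTQ size in bytes.

    Each record has four lines:
      @name\n  -> 1 + name length + 1
      SEQ\n    -> read length + 1
      +\n      -> 2
      QUAL\n   -> read length + 1
    """
    per_record = (
        (1 + mean_name_len + 1)
        + (mean_read_len + 1)
        + 2
        + (mean_read_len + 1)
    )
    return n_reads * per_record


# Hypothetical inputs, for illustration only.
size = estimate_fastq_bytes(n_reads=1_000_000, mean_read_len=100, mean_name_len=30)
print(f"{size / 1e9:.2f} GB uncompressed")
```

This ignores reads whose names gain a `/1` or `/2` suffix on export and assumes the `+` line is left bare, so treat the result as an order-of-magnitude check rather than an exact figure.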


FYI: SamToFastq

(...)

COMPRESSION_LEVEL=Integer     Compression level for all compressed files created (e.g. BAM and GELI).  Default value:5.

 

written 2.7 years ago by Pierre Lindenbaum
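To get a feel for what a compression level trades off, here is a small sketch using Python's standard `gzip` module on synthetic FASTQ-like text. This is only an analogy: it is not Picard's implementation, the `toy_fastq` helper is made up for the example, and random toy reads compress differently from real sequencing data.

```python
import gzip
import random

random.seed(0)  # reproducible toy data


def toy_fastq(n_reads, read_len=100):
    """Build synthetic FASTQ text (hypothetical helper, not real data)."""
    records = []
    for i in range(n_reads):
        seq = "".join(random.choice("ACGT") for _ in range(read_len))
        qual = "".join(random.choice("FGHI") for _ in range(read_len))
        records.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(records).encode("ascii")


raw = toy_fastq(2000)

# Compare a fast, a middle, and a slow compression level.
for level in (1, 5, 9):
    packed = gzip.compress(raw, compresslevel=level)
    print(level, len(raw), len(packed), f"{len(raw) / len(packed):.2f}x")
```

Higher levels generally spend more CPU time for a somewhat smaller file, which is why writing compressed FASTQ directly can shrink the output considerably at the cost of a slower conversion.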

You saved me from looking at the documentation :)

written 2.7 years ago by Devon Ryan
tszn1984 (United States) wrote 2.7 years ago:

Use a pipe ('|') to avoid writing and re-reading huge intermediate files.

>bam2fastx -q in.bam | bowtie -S /dev/stdin | samtools view -Sbh - >out.bam

modified 2.7 years ago by Devon Ryan • written 2.7 years ago by tszn1984

I edited your post to make it a bit more correct. Note that this won't work if paired-end reads are being used (and don't forget orphaned reads...).

modified 2.7 years ago • written 2.7 years ago by Devon Ryan
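To illustrate the paired-end caveat: before realigning, mates have to be matched up by read name, and any read whose mate is missing (an orphan) has to be handled separately. Below is a minimal sketch of that bookkeeping. The function name and the `(name, mate, seq, qual)` tuple layout are assumptions made for the example, and it holds everything in memory, so it is only suitable for small inputs; a real conversion tool would do this in a streaming fashion.

```python
from collections import defaultdict


def split_pairs_and_orphans(reads):
    """Partition reads into complete pairs and orphans.

    reads: iterable of (name, mate, seq, qual) tuples, where mate is 1 or 2.
    Returns (pairs, orphans): pairs is a list of (read1, read2) tuples,
    orphans is a list of reads whose mate was never seen.
    """
    by_name = defaultdict(dict)
    for read in reads:
        name, mate = read[0], read[1]
        by_name[name][mate] = read

    pairs, orphans = [], []
    for mates in by_name.values():
        if 1 in mates and 2 in mates:
            pairs.append((mates[1], mates[2]))
        else:
            orphans.extend(mates.values())
    return pairs, orphans


# Toy input: r1 has both mates, r2 is missing its second mate.
reads = [
    ("r1", 1, "ACGT", "IIII"),
    ("r2", 1, "GGCC", "IIII"),
    ("r1", 2, "TTGA", "IIII"),
]
pairs, orphans = split_pairs_and_orphans(reads)
print(len(pairs), "pair(s),", len(orphans), "orphan(s)")
```

Pairs would then go to the aligner as two synchronized inputs, while orphans are aligned as single-end reads or dropped, depending on the analysis.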