filtered.bcf file much larger than raw.bcf file
6.2 years ago
PKSWEE11 • 0

After running samtools mpileup (SAMtools v1.6) I get files of around 12 GB. Next, I use bcftools filter (bcftools v1.6) to exclude variants with read depth less than 2 or greater than 100. The filtered files are around 85 GB! Why are the filtered files so much larger? After all, I am removing records, right? Here's the bcftools filter command:

bcftools filter --threads 20 raw.bcf -e 'DP<2 || DP>100' > filtered.bcf

Any thoughts would be appreciated. Thanks

sequencing alignment • 1.6k views
6.2 years ago

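You're generating a VCF file (uncompressed plain text), not a compressed binary BCF. The default output type is v, so the redirect writes VCF text into filtered.bcf regardless of the file extension.

See the options: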

-o, --output <file>           write output to a file [standard output]
-O, --output-type <b|u|z|v>   b: compressed BCF, u: uncompressed BCF, z: compressed VCF, v: uncompressed VCF [v]
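
Putting those two options together, the command from the question could be rewritten roughly as below. This is only a sketch: filter_to_bcf is a hypothetical helper name introduced for illustration, and the file names are the ones from the question. The filter expression and thread count are unchanged.

```shell
# Sketch: request compressed BCF explicitly instead of relying on the default (-O v).
# filter_to_bcf is a hypothetical helper; the paths passed to it are placeholders.
filter_to_bcf() {
    in_bcf="$1"
    out_bcf="$2"
    # -O b selects compressed BCF output; -o names the output file,
    # so no shell redirect is needed.
    bcftools filter --threads 20 -e 'DP<2 || DP>100' -O b -o "$out_bcf" "$in_bcf"
}

# Example call with the file names from the question:
# filter_to_bcf raw.bcf filtered.bcf
```

An existing plain-text output should not need to be refiltered. Something like bcftools view -O b -o filtered.fixed.bcf filtered.bcf (the output name is a placeholder) should convert it to compressed BCF.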
Thanks! That was very helpful
