How To Scale Down "Depth Of Coverage"
11.0 years ago
HG ★ 1.2k

Hi everyone, my sample's depth of coverage is around 300x. I now want to reduce it to 250x, 200x, 150x, and 100x. Can anyone suggest tools or packages for this?

Thank you in advance.

coverage • 4.5k views

Hi HG. What format are your data in? At what step of your analysis do you want to reduce your coverage? Give us more details so that we can better help you.


Hi Eric, thanks for the reply. My data set: Illumina 250 bp paired-end reads, whole-genome sequencing of E. coli, expected genome size 5.00 Mb. I have already assembled the raw data, which is at around 300x coverage. Now I want to see how the quality of the assembly, mainly the N50 value, changes when the coverage is reduced. For more information, I want to follow the approach of GAGE-B: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3702249/ In that paper they assembled their data at different coverages. I just want to see the same effect on my own data set.

11.0 years ago
chefer ▴ 350

If your reads are in SAM/BAM format, you can also use the Picard DownsampleSam tool. You provide the probability of a read being retained during sampling.
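
The retention probability for each target is the target coverage divided by the current coverage. Here is a minimal sketch of that arithmetic for the 300x data set in the question. `keep_fraction` is a hypothetical helper, `input.bam` and `picard.jar` are placeholder paths, and the `I=`/`O=`/`P=` argument style assumes a Picard release that still accepts the legacy syntax. The loop only prints the commands so they can be checked before running.

```shell
# Fraction of reads to keep so that current coverage drops to target coverage.
# $1 = target coverage, $2 = current coverage
keep_fraction() {
    awk -v t="$1" -v c="$2" 'BEGIN { printf "%.4f\n", t / c }'
}

current=300   # approximate starting coverage, taken from the question

for target in 250 200 150 100; do
    p=$(keep_fraction "$target" "$current")
    # input.bam and picard.jar are placeholders; adjust to your own paths.
    echo "java -jar picard.jar DownsampleSam I=input.bam O=down_${target}x.bam P=${p}"
done
```

Because the sampling is random, the resulting coverage will only be approximately the target value.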

11.0 years ago
aawitney ▴ 20

You can use seqtk:

https://github.com/lh3/seqtk

Something like this selects 1,000,000 reads (you will need to calculate how many reads are needed for 250x, 200x, and so on):

seqtk sample -s100 my.fastq.gz 1000000 | gzip > my.1.fastq.gz
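
The read count for a given coverage follows from coverage = reads x read length / genome size. Here is a rough sketch using the numbers given in the question (5 Mb genome, 250 bp reads). `reads_for_coverage` is a hypothetical helper, and the calculation assumes every read is the full 250 bp; if the reads were trimmed, use the mean read length instead.

```shell
# Number of reads needed for a target coverage:
#   reads = coverage * genome_size / read_length
# $1 = target coverage, $2 = genome size (bp), $3 = read length (bp)
reads_for_coverage() {
    echo $(( $1 * $2 / $3 ))
}

genome_size=5000000   # ~5 Mb expected genome size, from the question
read_length=250       # 250 bp reads, assumed untrimmed

for target in 250 200 150 100; do
    total=$(reads_for_coverage "$target" "$genome_size" "$read_length")
    pairs=$(( total / 2 ))   # paired-end: sample this many reads from EACH mate file
    echo "${target}x: ${total} reads in total, ${pairs} per mate file"
done
```

For paired-end data, run seqtk sample once per mate file with the same `-s` seed so that the pairing is preserved. Alternatively, seqtk sample also accepts a fraction below 1 instead of a read count, for example 250/300 for 250x.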


Thank you so much for your suggestion

11.0 years ago

GATK also allows downsampling. In fact, it always does so by default. You can use the PrintReads walker if downsampling is your only goal.

As Mick has always stated: "you can't always get what you want, but if you try the '-ds' option, well you just might find, you get what you need"
