Subsampling reads
2.9 years ago
amy ▴ 20

Hi! I hope you can help me, I'm relatively new to bioinformatics.

I plan to use an already assembled metagenome for binning, but the microbial species/population that I need to recover from the bins is not well represented in the assembled metagenome (only ~3%). Can I use the assembled metagenome to subsample the reads, for example at varying percentages of read coverage? If so, what protocols, software, or tools would you recommend? Thank you so much.

metagenome metagenomics microbial • 1.3k views

"the microbial species/population that I need to generate from the bins is not well represented in the assembled metagenome data (only at ~3%)"

How is this related to your request for sub-sampling?

As a general suggestion, you can use reformat.sh (from the BBMap/BBTools suite) for sub-sampling. See also the related thread: Extracting randomly subset of fastq reads from a huge file
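To make the idea of random sub-sampling concrete, here is a minimal Python sketch of fraction-based sampling of FASTQ records. It is not how reformat.sh is implemented; the function names `read_fastq` and `subsample_fastq` and the `fraction` and `seed` parameters are hypothetical, and the sketch assumes plain four-line FASTQ records.

```python
import random
from itertools import islice


def read_fastq(lines):
    """Yield FASTQ records as 4-tuples: (header, sequence, plus, quality)."""
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_fastq(lines, fraction, seed=0):
    """Keep each record independently with probability `fraction`.

    One random draw is made per record, so the same seed always keeps
    the same record positions.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [rec for rec in read_fastq(lines) if rng.random() < fraction]
```

For paired-end data, calling `subsample_fastq` on both files with the same `fraction` and `seed` keeps the same record positions in each, provided the two files list the mates in the same order.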


Ideally, the goal is to recover a cyanobacterial genome from the metagenome data. However, cyanobacteria make up only about 3% of the sample, with Proteobacteria being the most abundant group. I've read some articles in which the authors randomly sub-sampled reads from a large metagenome dataset at different percentages of read coverage. Thank you for your response!
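A quick back-of-the-envelope calculation shows what sub-sampling does to a population at ~3%. The sketch below assumes abundance is measured as the fraction of reads belonging to the genome; the function name `expected_coverage` and every number in the example (read count, read length, genome size) are hypothetical placeholders, not values from this dataset.

```python
def expected_coverage(n_reads, read_length, abundance, genome_size,
                      sample_fraction=1.0):
    """Expected mean depth of one genome.

    Bases from the sampled reads that belong to the genome, divided by
    the genome size. `abundance` is the fraction of reads from that genome.
    """
    target_bases = n_reads * sample_fraction * read_length * abundance
    return target_bases / genome_size


# Hypothetical example: 20 million 150 bp reads, 3% abundance, 5 Mb genome.
full = expected_coverage(20_000_000, 150, 0.03, 5_000_000)
tenth = expected_coverage(20_000_000, 150, 0.03, 5_000_000, sample_fraction=0.1)
print(f"all reads: {full:.1f}x, 10% sub-sample: {tenth:.1f}x")
```

Under these assumed numbers the target genome drops from roughly 18x to roughly 1.8x when only 10% of the reads are kept, because random sub-sampling scales every genome's coverage by the same factor.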

2.9 years ago
Mensur Dlakic ★ 27k

If you want to recruit the relevant reads only and re-assemble, mirabait from the MIRA package can do it.
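The general idea behind recruiting reads with bait sequences can be sketched as k-mer matching: keep any read that shares enough k-mers with the baits (for example, contigs from a cyanobacterial bin). This is only an illustration of the concept, not mirabait's actual implementation or interface; `build_bait_index`, `recruit_reads`, `k`, and `min_hits` are hypothetical names.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bait_index(baits, k):
    """Collect k-mers from both strands of every bait sequence."""
    index = set()
    for bait in baits:
        index |= kmers(bait, k)
        index |= kmers(reverse_complement(bait), k)
    return index


def recruit_reads(reads, bait_index, k, min_hits=1):
    """Keep (name, sequence) reads with at least `min_hits` distinct bait k-mers."""
    kept = []
    for name, seq in reads:
        hits = sum(1 for kmer in kmers(seq, k) if kmer in bait_index)
        if hits >= min_hits:
            kept.append((name, seq))
    return kept
```

The recruited reads could then be re-assembled on their own. A larger `k` or `min_hits` makes recruitment stricter, at the cost of missing reads from regions not covered by the baits.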

Random sub-sampling generally doesn't help with low-abundance bins, because it reduces the coverage of every genome by the same proportion. You may want to try the normalize-by-median.py approach (digital normalization) from the khmer package instead. That said, given the low abundance, you may not have enough coverage depth to assemble anything better than what you already have.
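The median-based normalization idea can be sketched in a few lines: a read is kept only if the median count of its k-mers, among the reads kept so far, is below a cutoff. This simplified sketch uses an exact in-memory `Counter` rather than khmer's probabilistic counting, ignores reverse complements, and the names `normalize_by_median`, `k`, and `cutoff` are hypothetical stand-ins rather than the script's real options.

```python
from collections import Counter
from statistics import median


def normalize_by_median(reads, k, cutoff):
    """Discard reads whose k-mers are already well covered.

    `reads` is an iterable of (name, sequence) pairs. A read is kept when the
    median count of its k-mers (counted over previously kept reads) is below
    `cutoff`; only kept reads update the counts.
    """
    counts = Counter()
    kept = []
    for name, seq in reads:
        upper = seq.upper()
        read_kmers = [upper[i:i + k] for i in range(len(upper) - k + 1)]
        if not read_kmers:
            continue  # read is shorter than k
        if median(counts[kmer] for kmer in read_kmers) < cutoff:
            kept.append((name, seq))
            counts.update(read_kmers)
    return kept
```

Unlike random sub-sampling, this preferentially thins reads from high-coverage genomes while retaining reads from low-coverage ones. It cannot create coverage that was never sequenced, which is why the caveat about insufficient depth still applies.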


Thank you Sir! I will try this strategy.
