Rarefaction curve at different sequencing depths
5.2 years ago
bioinfo ▴ 830

I have a table of counts for different taxa across samples from metagenomics data.

Sample   seq_depth   16S counts   taxa1   taxa2   taxa3   taxa4
1        55000       230          30      39      40      12
2        72000       300          23      53      20      64
3        137000      540          20      135     12      94
4        84000       250          48      37      102     74


I want to create a rarefaction curve based on that. What's the best way to do that? I have looked at QIIME's multiple_rarefactions.py, but the script doesn't take column B (seq_depth) into account when creating the rarefaction curve (we can ignore column C, 16S counts, here). When we use an OTU table as input in QIIME, do we provide sequencing depth information in a column at all, or does it just take the sample ID and taxa columns?

In addition, I want to create a new taxa counts table at an even sequencing depth (e.g. 50000) or at an even 16S count (e.g. 230) for all 4 samples. Is there a strategy for doing that?

rarefaction qiime metagenomics 16S • 5.8k views
5.2 years ago

The point of the rarefaction curve is to estimate species richness as a function of sampling effort (sequencing depth). A higher sequencing depth only makes the curve extend further; over the range that both samples cover, it is directly comparable to the curve from a shallower sample.
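To illustrate, here is a minimal sketch of a rarefaction curve computed by repeated subsampling without replacement. It is not what QIIME does internally. The function name `rarefaction_curve`, the chosen depths, and the use of only the four taxa columns shown in the question are assumptions for illustration. "Depth" here means the number of taxon-assigned reads drawn, not the seq_depth column.

```python
import random

def rarefaction_curve(counts, depths, n_iter=100, seed=0):
    """Mean number of observed taxa at each subsampling depth (hypothetical helper)."""
    rng = random.Random(seed)
    # Expand the count vector into one entry per read, labelled by its taxon.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            break  # a sample cannot be subsampled beyond its own read total
        # Draw `depth` reads without replacement and count the distinct taxa seen.
        richness = [len(set(rng.sample(reads, depth))) for _ in range(n_iter)]
        curve.append((depth, sum(richness) / n_iter))
    return curve

# Taxa counts for samples 1 and 3, taken from the table in the question.
sample1 = {"taxa1": 30, "taxa2": 39, "taxa3": 40, "taxa4": 12}
sample3 = {"taxa1": 20, "taxa2": 135, "taxa3": 12, "taxa4": 94}

depths = [1, 5, 10, 50, 100, 121, 200, 261]
for name, counts in (("sample 1", sample1), ("sample 3", sample3)):
    print(name, rarefaction_curve(counts, depths))
```

The curve for sample 1 stops at 121 reads (its total over the four taxa shown) while the curve for sample 3 continues to 261, but both are evaluated at the same depths up to 121 and can be compared there.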

If you really want both curves to span the same range, you can cut off the values beyond the shallower sample's depth, though it makes little sense to do so: the whole point of rarefaction is to assess how well the sampling covers the actual richness, and little is gained by discarding that information.
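If an even-depth counts table is still wanted, as asked in the question, one common approach is to subsample each sample's reads without replacement down to a shared depth. The following is only a sketch under assumptions: the helper name `rarefy` is hypothetical, only the four taxa columns shown in the question are used, and the shared depth is taken as the smallest per-sample total rather than the 50000 or 230 mentioned in the question.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample a taxon count vector to `depth` reads without replacement (hypothetical helper)."""
    rng = random.Random(seed)
    # One list entry per read, labelled by its taxon.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    drawn = Counter(rng.sample(reads, depth))
    # Keep every taxon in the output, including those that drop to zero.
    return {taxon: drawn.get(taxon, 0) for taxon in counts}

# Taxa counts per sample, taken from the table in the question.
table = {
    "1": {"taxa1": 30, "taxa2": 39, "taxa3": 40, "taxa4": 12},
    "2": {"taxa1": 23, "taxa2": 53, "taxa3": 20, "taxa4": 64},
    "3": {"taxa1": 20, "taxa2": 135, "taxa3": 12, "taxa4": 94},
    "4": {"taxa1": 48, "taxa2": 37, "taxa3": 102, "taxa4": 74},
}

# Assumption: use the smallest per-sample total as the common depth.
even_depth = min(sum(c.values()) for c in table.values())
rarefied = {
    sample: rarefy(counts, even_depth, seed=i)
    for i, (sample, counts) in enumerate(table.items())
}
for sample, counts in rarefied.items():
    print(sample, counts)
```

Every sample in the resulting table sums to the same depth. The sample that already sits at that depth is returned unchanged, and the deeper samples lose reads, which is the information loss described above.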