Best way to generate rarefaction curves from 16S/18S data: re-cluster each subsample?
7.7 years ago
ben83 ▴ 50

We have N=a few million 18S reads from some environment. The reads have been clustered into OTUs, and the OTUs annotated against a reference database.

To generate a rarefaction curve, my understanding is that one randomly samples n reads where n ranges from 0 to N with some interval, and counts the number of OTUs observed at each such sub-sampling.

In standard practice, as implemented by suites such as QIIME and mothur, which of the two approaches I can think of is used?

  1. Treat the original assignments of reads to OTUs as truth, and when resampling n reads, just count the number of "original" OTUs observed in this sub-sample.
  2. Re-cluster the sub-sampled reads, and then count the number of "new" OTUs in the sub-sample.

My sense from reading through the QIIME documentation is that #1 is what is done, but I'm not positive. I'm also not sure why #2 wouldn't be the better way to go (though of course it would be computationally more expensive).
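
To make option #1 concrete, here is a minimal Python sketch of what I have in mind. It is not taken from QIIME or mothur; the function name `rarefaction_curve`, the dict input format, and the parameters `step`, `n_reps`, and `seed` are all my own assumptions for illustration. The original read-to-OTU assignments are kept fixed, reads are subsampled without replacement, and the distinct OTUs in each subsample are counted.

```python
import random


def rarefaction_curve(otu_counts, step, n_reps=10, seed=0):
    """Mean number of OTUs observed at each subsampling depth (option #1).

    otu_counts: dict mapping OTU id -> number of reads assigned to that OTU.
    This input format is an assumption made for the sketch.
    """
    rng = random.Random(seed)
    # Expand the table to one OTU label per read; assignments stay fixed.
    reads = [otu for otu, count in otu_counts.items() for _ in range(count)]
    total = len(reads)

    # Depths 0, step, 2*step, ..., always ending at the full depth N.
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)

    curve = []
    for n in depths:
        # Sample n reads without replacement, count distinct OTU labels,
        # and average the count over n_reps independent subsamples.
        richness = [len(set(rng.sample(reads, n))) for _ in range(n_reps)]
        curve.append((n, sum(richness) / n_reps))
    return curve


if __name__ == "__main__":
    toy_table = {"OTU_A": 5, "OTU_B": 3, "OTU_C": 2}
    for depth, mean_otus in rarefaction_curve(toy_table, step=5):
        print(depth, mean_otus)
```

No clustering happens inside the loop, which is why this option is cheap: each point on the curve costs only a random draw and a set construction.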


mothur ecology qiime rarefaction • 4.0k views
7.7 years ago

You are right: the QIIME scripts for rarefaction plots use #1. They subsample the existing OTU table rather than re-clustering the reads.

#2 would also be a valid way to go, but it takes much longer and you would have to implement it yourself, since there are no scripts for it. Go ahead if you have the time.
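
If you do want to try #2, the outer loop could look roughly like the sketch below. This is only an outline under stated assumptions: `recluster_rarefaction` and `toy_cluster` are hypothetical names, and `toy_cluster` merely groups identical sequences as a stand-in so the example runs. In practice you would pass in a function that wraps your real OTU clustering step.

```python
import random


def recluster_rarefaction(reads, cluster_fn, depths, seed=0):
    """Option #2: re-cluster each subsample and count the resulting OTUs.

    reads: list of read sequences (strings).
    cluster_fn: callable taking a list of sequences and returning a list of
        clusters; a hypothetical hook for whatever clustering tool you use.
    depths: iterable of subsample sizes, each between 0 and len(reads).
    """
    rng = random.Random(seed)
    curve = []
    for n in depths:
        subsample = rng.sample(reads, n)   # n reads without replacement
        clusters = cluster_fn(subsample)   # the expensive step, run per depth
        curve.append((n, len(clusters)))
    return curve


def toy_cluster(seqs):
    """Placeholder clustering: group identical sequences only.

    This is NOT real OTU clustering (no similarity threshold); it exists
    solely to make the sketch executable.
    """
    groups = {}
    for s in seqs:
        groups.setdefault(s, []).append(s)
    return list(groups.values())


if __name__ == "__main__":
    reads = ["AAAA"] * 4 + ["CCCC"] * 3 + ["GGGG"]
    print(recluster_rarefaction(reads, toy_cluster, depths=[0, 4, 8]))
```

The clustering call sits inside the loop, so the total cost scales with the number of depths (and with replicates, if you add them), which is where the extra run time comes from.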


Thanks for this answer.

