Question

Best way to generate rarefaction curves from 16S/18S data: is re-clustering needed?

8.4 years ago

ben83 ▴ 50

We have N=a few million 18S reads from some environment. The reads have been clustered into OTUs, and the OTUs annotated against a reference database.

To generate a rarefaction curve, my understanding is that one randomly samples n reads, where n ranges from 0 to N in steps of some fixed interval, and counts the number of OTUs observed in each subsample.

In standard practice, as implemented by suites such as QIIME and mothur, which of the following two approaches is used?

1. Treat the original assignments of reads to OTUs as truth, and when resampling n reads, just count the number of "original" OTUs observed in this sub-sample.
2. Re-cluster the sub-sampled reads, and then count the number of "new" OTUs in the sub-sample.

My sense from reading through the qiime documentation is that #1 is what is done, but I'm not positive. I'm also not quite sure why #2 wouldn't be the way to go (though of course it would be computationally more expensive).

Thanks

mothur ecology qiime rarefaction • 4.3k views

updated 21 months ago by Ram 43k • written 8.4 years ago by ben83 ▴ 50

Ram · Answer 1 · 2015-12-30

8.4 years ago

marina.v.yurieva ▴ 570

You are right: the QIIME scripts for rarefaction plots use #1. See http://qiime.org/scripts/multiple_rarefactions.html

#2 would also be a reasonable way to go, but it takes much longer and you would have to implement it yourself (there are no scripts for it). Go ahead if you have the time.
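
For illustration, here is a minimal Python sketch of approach #1, assuming the original read-to-OTU assignments are available as a plain list with one OTU label per read. The function name `rarefaction_curve`, its parameters, and the toy data are hypothetical and chosen only for this example; this is not QIIME's or mothur's actual implementation.

```python
import random

def rarefaction_curve(read_otus, step, n_reps=10, seed=0):
    """Mean number of distinct OTUs observed at each subsampling depth.

    read_otus: list with one OTU label per read (original assignments,
               treated as fixed, i.e. approach #1).
    step:      positive interval between subsampling depths.
    n_reps:    number of random subsamples drawn at each depth.
    """
    rng = random.Random(seed)
    total = len(read_otus)
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)  # always include the full data set
    curve = []
    for n in depths:
        # Draw n reads without replacement and count the distinct
        # original OTU labels among them; no re-clustering is done.
        counts = [len(set(rng.sample(read_otus, n))) for _ in range(n_reps)]
        curve.append((n, sum(counts) / n_reps))
    return curve

# Toy data (hypothetical): 6 reads of OTU "A", 3 of "B", 1 of "C".
reads = ["A"] * 6 + ["B"] * 3 + ["C"]
for depth, mean_otus in rarefaction_curve(reads, step=2, n_reps=100):
    print(depth, round(mean_otus, 2))
```

Approach #2 would replace the `len(set(...))` step with a call to whatever clustering tool was used originally, run on each subsample, which is where the extra computational cost comes from.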

updated 4.4 years ago by Ram 43k • written 8.4 years ago by marina.v.yurieva ▴ 570

Thanks for this answer.

8.4 years ago by ben83 ▴ 50