Rarefaction/Saturation Curve Based On NGS Data
13.4 years ago
Biogenomics ▴ 60

Hi all,

This is most likely a simple question, but I'm looking for a tool (software or a Python/Perl/R script) that would produce a rarefaction curve from an assembly file (ACE format would be easiest), in order to assess the number of reads needed to recover all observed contigs (cf. species diversity indices). This would most likely be done by sampling reads from the ACE file and aligning them to the assembled contigs. I am interested in comparing such rarefaction curves for data produced from normalized and non-normalized libraries.

Alternatively, what approach would you use to automate (or semi-automate) such a task?

Thanks,

Greg


Hello Greg, were you able to find a tool or script for this analysis? If so, could you let us know?

13.2 years ago

As you allude to, your problem is closely related to species richness estimation: contigs play the role of species and reads the role of sampled individuals. You could pose your problem in those terms and use the rarefaction functions in a metagenomics suite such as mothur. Another option would be to borrow functions from the mothur source code and adapt them to your problem.
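To make the species-richness framing concrete, here is a minimal Python sketch of a rarefaction curve built by repeated subsampling. It is not mothur code. It assumes you have already extracted a hypothetical list `read_to_contig` holding one contig ID per read (for example from the read/contig assignments in the ACE file), and it reuses the existing assignments instead of re-aligning or re-assembling each subsample, which is a simplification. The function name and its parameters (`step`, `n_iter`, `seed`) are illustrative.

```python
import random


def rarefaction_curve(read_to_contig, step=1000, n_iter=10, seed=0):
    """Mean number of distinct contigs seen at increasing read depths.

    read_to_contig: list with one contig ID per read (hypothetical input).
    step:           spacing between sampled depths, in reads.
    n_iter:         number of random read orderings to average over.
    Returns a list of (n_reads, mean_distinct_contigs) tuples.
    """
    rng = random.Random(seed)
    n_total = len(read_to_contig)
    # Depths at which the curve is evaluated; always include the full data set.
    depths = list(range(step, n_total, step)) + [n_total]
    totals = [0.0] * len(depths)

    for _ in range(n_iter):
        # A random permutation is equivalent to sampling without replacement
        # at every depth simultaneously.
        order = list(read_to_contig)
        rng.shuffle(order)
        seen = set()
        d = 0  # index of the next depth to record
        for i, contig in enumerate(order, start=1):
            seen.add(contig)
            if d < len(depths) and i == depths[d]:
                totals[d] += len(seen)
                d += 1

    return [(depth, total / n_iter) for depth, total in zip(depths, totals)]


if __name__ == "__main__":
    # Toy data: three contigs with uneven coverage.
    reads = ["contig1"] * 50 + ["contig2"] * 30 + ["contig3"] * 20
    for depth, mean_contigs in rarefaction_curve(reads, step=10, n_iter=100):
        print(depth, round(mean_contigs, 2))
```

Running this once on the normalized library and once on the non-normalized one, then plotting both sets of points, should give the comparison asked about in the question. A curve that flattens early suggests that additional reads add few new contigs.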


Mothur really works! I like it.


Hi jarrentinha, I am new to this kind of analysis. If possible, could you let us know how mothur can be used to plot a saturation curve of the number of genes against the number of reads?

9.3 years ago

As this topic has been raised again, I would recommend reading Colwell et al. on the subject. I would also like to ask: what is your input format? If you have a simple frequency table, say

150 genes have 1 read

100 genes have 2 reads

...

1 gene has 6534 reads

...

1 gene has 20000 reads

I could share some code to build those rarefaction curves (and I think there are also plenty of ecology-related software packages). Or you can adapt the code from here: https://github.com/mikessh/vdjtools/blob/master/src/main/groovy/com/antigenomics/vdjtools/diversity/ChaoEstimator.groovy
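For a frequency table like the one above, a rarefaction curve can also be computed analytically, without random subsampling. The Python sketch below is not taken from the linked ChaoEstimator code; it uses the standard hypergeometric expectation for sampling reads without replacement, plus the classic Chao1 lower bound on total richness. It assumes the table is given as a dict `{reads_per_gene: number_of_genes}`, and all function names are illustrative.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k), for 0 <= k <= n."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def expected_genes(freq_table, m):
    """Expected number of genes observed in a subsample of m reads.

    freq_table: dict mapping reads-per-gene -> number of genes with that count.
    A gene with k reads is missed with probability C(N - k, m) / C(N, m),
    where N is the total number of reads.
    """
    n_reads = sum(k * f for k, f in freq_table.items())
    s_obs = sum(freq_table.values())
    if not 0 <= m <= n_reads:
        raise ValueError("m must be between 0 and the total number of reads")
    log_total = log_choose(n_reads, m)
    missed = 0.0
    for k, f in freq_table.items():
        if n_reads - k >= m:  # otherwise the gene cannot be missed
            missed += f * exp(log_choose(n_reads - k, m) - log_total)
    return s_obs - missed


def chao1(freq_table):
    """Chao1 estimate of total richness from singleton and doubleton counts."""
    s_obs = sum(freq_table.values())
    f1 = freq_table.get(1, 0)
    f2 = freq_table.get(2, 0)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2


if __name__ == "__main__":
    # Only the rows spelled out in the post; the elided rows are not filled in.
    table = {1: 150, 2: 100, 6534: 1, 20000: 1}
    total_reads = sum(k * f for k, f in table.items())
    for m in range(0, total_reads + 1, total_reads // 10):
        print(m, round(expected_genes(table, m), 1))
    print("Chao1:", chao1(table))
```

The log-gamma form avoids overflow in the binomial coefficients when the read count is large. If the observed gene count is well below the Chao1 value, the library is probably not yet saturated.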
