Question

Rescaling after relative abundance filtering of gut microbiome data

7 hours ago

yesquokkan • 0

Hello, I am currently working with shotgun metagenomic sequencing data from the human gut microbiome, and I aim to construct a co-occurrence network.

(1) To build a robust network, I plan to trim the relative abundance table by removing low-abundance or potentially noisy taxa. However, the cut-off thresholds used in different studies vary, and I assume the optimal cutoff may also depend on the number of taxa and samples in my dataset. What factors should I consider when deciding or adjusting this cutoff?

(2) I often see criteria such as "relative abundance >= x% in at least y% of samples." Does this mean that a taxon is retained if its relative abundance is at least x% in at least y% of samples, rather than referring to its mean relative abundance? In other words, a taxon with relative abundance < x% in a sample is treated as absent from that sample, and the y% prevalence threshold is then applied to its presence across samples.

If that's the case, I've noticed that some studies instead use the mean relative abundance as the cutoff. Which criterion is more commonly used in practice?
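To make the two readings concrete, here is a minimal sketch of how I understand them. The toy table, the taxon names, the thresholds, and the helper names `prevalence_filter` and `mean_filter` are all made up for illustration and are not taken from any particular study or package.

```python
# Toy relative abundance table (fractions, not percent): taxon -> values in 4 samples.
# All values are invented for illustration.
table = {
    "taxon_A": [0.50, 0.40, 0.60, 0.55],
    "taxon_B": [0.00, 0.00, 0.30, 0.00],  # abundant in a single sample only
    "taxon_C": [0.02, 0.03, 0.00, 0.02],  # low abundance but present in most samples
}


def prevalence_filter(table, min_abund, min_prev):
    """Keep taxa whose abundance is >= min_abund in at least min_prev of samples."""
    kept = []
    for taxon, values in table.items():
        # A sample counts as "present" only if it reaches the abundance threshold.
        prevalence = sum(v >= min_abund for v in values) / len(values)
        if prevalence >= min_prev:
            kept.append(taxon)
    return kept


def mean_filter(table, min_mean):
    """Keep taxa whose mean abundance across all samples is >= min_mean."""
    return [t for t, values in table.items() if sum(values) / len(values) >= min_mean]


# Hypothetical thresholds: 1% abundance in at least 50% of samples, or 5% mean abundance.
by_prevalence = prevalence_filter(table, min_abund=0.01, min_prev=0.5)
by_mean = mean_filter(table, min_mean=0.05)

print("prevalence criterion keeps:", by_prevalence)
print("mean criterion keeps:", by_mean)
```

With these toy numbers the two criteria disagree: the prevalence criterion keeps the consistently present `taxon_C` and drops `taxon_B`, while the mean criterion does the opposite.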

(3) After filtering, the total sum of relative abundances will no longer equal 1, since some taxa have been removed. In this case, should I re-scale the remaining relative abundances so that they sum to 1 before calculating correlations between taxa?
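For clarity, the re-scaling I have in mind is a per-sample renormalization of the retained taxa, roughly as in the sketch below. The function name `renormalize` and the example values are hypothetical, and whether this step is appropriate before computing correlations is exactly what I am unsure about.

```python
def renormalize(table):
    """Rescale each sample (column) so that the retained taxa sum to 1.

    `table` maps taxon -> list of relative abundances, one entry per sample.
    Samples whose retained taxa sum to 0 are left as all zeros.
    """
    taxa = list(table)
    n_samples = len(next(iter(table.values())))
    # Per-sample total of the taxa that survived filtering.
    totals = [sum(table[t][j] for t in taxa) for j in range(n_samples)]
    return {
        t: [table[t][j] / totals[j] if totals[j] > 0 else 0.0 for j in range(n_samples)]
        for t in taxa
    }


# Invented example: after filtering, sample 1 sums to 0.8 and sample 2 to 0.6.
filtered = {"taxon_A": [0.6, 0.3], "taxon_B": [0.2, 0.3]}
rescaled = renormalize(filtered)

for taxon, values in rescaled.items():
    print(taxon, [round(v, 3) for v in values])
```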

(4) What is the typical number of bacterial species detected in the human gut microbiome (shotgun metagenomic studies, with taxonomy profiled using MetaPhlAn4)?

Thank you in advance.

microbiome gut network • 56 views
