Large human gut microbiome studies for machine learning
2.5 years ago
genseq • 0

Machine learning, especially neural networks, is well suited to large datasets and tends to perform poorly on small ones. For 16S human gut microbiome (HGM) studies, it is therefore important to find a relatively large dataset with a long 16S sequencing region and extensive metadata.

As of now, the only large studies I know of are HMP, AGP, UK Twins and TEDDY.

However, each of them has drawbacks: sample count (HMP), participant age range (TEDDY), availability of metadata (UK Twins), length of the sequenced region (AGP), etc.

Are there other large HGM datasets suitable for studying with neural nets?

gut HGM human microbiota microbiome • 712 views

Nothing else? :(

2.5 years ago
Mensur Dlakic ★ 27k

I think your question is too broad in scope, yet lacks detail. ANY dataset can be studied by ANY machine learning method; it is a matter of which one will be most effective. Neural networks (NNs) are superior for images, especially large image datasets, but not necessarily for every large dataset. With purely numerical datasets, or mixed numerical and categorical datasets, I have had better results with gradient boosting methods than with NNs. I make it a point to try multiple methods on every machine learning problem, because it is difficult to know in advance which one will work best.

Whether NNs are best for your purpose depends on the data representation you have. It is quite possible that a smaller dataset with a better data representation will work better with NNs than a larger dataset. That said, how many datasets do you need for your purpose? Many people work with a single dataset, and you seem to have a handful already. If you insist on finding more datasets, you may have to settle for fewer data points. For those, simpler methods such as generalized linear models may be the best choice.
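To make the "try multiple methods" advice concrete, here is a minimal sketch of how several models could be scored on identical cross-validation folds. Everything in it is an assumption for illustration: the data are synthetic Gaussian features rather than real 16S profiles, and `MajorityClass` and `NearestCentroid` are toy stand-ins written here, not the gradient boosting, NN or GLM implementations you would actually compare. The helper names (`k_fold_indices`, `cross_val_accuracy`, `make_toy_data`) are hypothetical.

```python
import random
import statistics


class MajorityClass:
    """Baseline stand-in: always predict the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Simple stand-in model: assign each sample to the closest class mean."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            # Column-wise mean of all training rows with this label.
            self.centroids_[label] = [statistics.mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: dist2(x, self.centroids_[c]))
            for x in X
        ]


def k_fold_indices(n, k, seed=0):
    """Shuffle sample indices once and split them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(make_model, X, y, folds):
    """Mean held-out accuracy of a freshly built model over the given folds."""
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = make_model().fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return statistics.mean(scores)


def make_toy_data(n_per_class=50, n_features=5, seed=0):
    """Synthetic two-class data (assumption: NOT real microbiome profiles)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_toy_data()
folds = k_fold_indices(len(X), k=5)  # the same folds are reused for every model
results = {
    "majority_class": cross_val_accuracy(MajorityClass, X, y, folds),
    "nearest_centroid": cross_val_accuracy(NearestCentroid, X, y, folds),
}
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean accuracy {acc:.2f}")
```

Reusing one set of folds for every candidate keeps the comparison fair, and any object with `fit` and `predict` methods can be added to `results` in the same way. With a real dataset, library implementations of the methods mentioned above would replace the toy classes.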
