5.8 years ago
bitpir ▴ 240

Here's the problem I'm trying to solve: given a pool of gene IDs and their read counts, where some genes are unique to one species and others are shared between species, what is the most likely combination of species present in the sample, and how can their relative abundances be calculated?

It seems like this problem requires some form of probabilistic inference, but I'm not sure what tools are out there. I have also heard that mixed-integer programming might help. If anyone knows of any tools or resources for solving this kind of problem, please let me know!

I have a feeling that there are some important background details missing here. I mean, the generic answer to your problem is "use expectation-maximization and forget the integer aspect", but whether that's (A) really getting you what you think you need or (B) an efficient way to get you what you actually need remains to be seen.
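To make the expectation-maximization suggestion concrete, here is a minimal sketch in plain Python. It is not taken from any particular tool, and it rests on a strong simplifying assumption: reads from a species are spread uniformly over that species' genes (no correction for gene length, copy number, or expression). The names `em_abundance`, `gene_counts`, and `gene_species` are hypothetical, chosen for illustration.

```python
def em_abundance(gene_counts, gene_species, n_iter=200, tol=1e-9):
    """Estimate relative species abundances from per-gene read counts.

    gene_counts:  dict gene_id -> read count
    gene_species: dict gene_id -> set of species carrying that gene
    Assumption (illustrative only): a read from species s is equally
    likely to come from any of the genes that s carries.
    """
    species = sorted({s for members in gene_species.values() for s in members})
    total = sum(gene_counts.values())
    if total == 0:
        raise ValueError("no reads to assign")

    # Number of genes carried by each species, used for P(gene | species).
    n_genes = {s: sum(1 for m in gene_species.values() if s in m) for s in species}

    # Start from a uniform abundance estimate.
    theta = {s: 1.0 / len(species) for s in species}

    for _ in range(n_iter):
        assigned = {s: 0.0 for s in species}
        # E-step: split each gene's reads among the species that carry it,
        # in proportion to theta[s] * P(gene | s).
        for gene, count in gene_counts.items():
            weights = {s: theta[s] / n_genes[s] for s in gene_species[gene]}
            norm = sum(weights.values())
            if count == 0 or norm == 0:
                continue
            for s, w in weights.items():
                assigned[s] += count * w / norm
        # M-step: new abundance = fraction of all reads assigned to each species.
        new_theta = {s: assigned[s] / total for s in species}
        delta = max(abs(new_theta[s] - theta[s]) for s in species)
        theta = new_theta
        if delta < tol:
            break
    return theta


# Toy data (made up for illustration): two unique genes, one shared gene,
# and one gene from a species that has no reads at all.
gene_counts = {"gA": 30, "gB": 10, "gAB": 40, "gC": 0}
gene_species = {
    "gA": {"sp1"},
    "gB": {"sp2"},
    "gAB": {"sp1", "sp2"},
    "gC": {"sp3"},
}
theta = em_abundance(gene_counts, gene_species)
for name in sorted(theta):
    print(name, round(theta[name], 4))
```

On this toy input the shared reads end up split 3:1 in favour of sp1, so the estimates settle near 0.75 for sp1, 0.25 for sp2, and 0 for sp3. Deciding which species are actually "present" would then be a separate step, for example thresholding the estimated abundances, and that threshold is a judgment call this sketch does not make.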


