6.2 years ago
sieva1
I would love to hear from the community which tool is the best or most widely used in bioinformatics for distinguishing between coding and non-coding RNA.
In the FEELnc paper, the authors use a dataset on which they get around 92% accuracy on a balanced 50/50 binary classification task. That seems to be about the usual performance for these algorithms, unless I am wrong. Would the community be much interested in an algorithm that reaches, say, 94-95%, or is that of little consequence given how these tools are currently used?
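For concreteness, here is a minimal sketch of what an accuracy figure on a balanced 50/50 test set summarizes. The labels and predictions are made-up toy data, not results from FEELnc or any other tool, and the function names are hypothetical. On a balanced set, accuracy is simply the mean of sensitivity and specificity, so a 92% vs. 94% comparison can hide quite different error profiles.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = coding, 0 = non-coding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(y_true, y_pred):
    """Return accuracy, sensitivity and specificity as a dict."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of coding transcripts recovered
        "specificity": tn / (tn + fp),  # fraction of non-coding transcripts recovered
    }


# Toy, invented data: 100 coding and 100 non-coding transcripts (balanced 50/50).
y_true = [1] * 100 + [0] * 100
# Hypothetical classifier: 90 of 100 coding correct, 94 of 100 non-coding correct.
y_pred = [1] * 90 + [0] * 10 + [0] * 94 + [1] * 6

metrics = summarize(y_true, y_pred)
print(metrics)
```

In this toy case the accuracy is the average of the two per-class rates, which is why reporting sensitivity and specificity alongside accuracy tends to be more informative than accuracy alone.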
Well, that really depends on your use case. Most papers that use these tools extensively are doing de novo lncRNA annotation, in which case a few extra percentage points probably won't mean much to them. However, micropeptides are beginning to emerge as products of transcripts traditionally thought of as non-coding, so I think tools that help distinguish those from conventional non-coding transcripts may be more useful than optimizing an algorithm for 1-2% greater accuracy.
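To illustrate why micropeptide-encoding transcripts are awkward for this kind of classifier, here is a rough sketch of a longest-ORF scan. It assumes a deliberately naive heuristic (forward strand only, ATG start, first in-frame stop) and a 100-codon cutoff, which is a common convention for "small" ORFs rather than anything taken from FEELnc. The function names and the example sequence are invented for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq):
    """Length in codons (start included, stop excluded) of the longest
    ATG...stop ORF in the three forward-strand reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    best = 0
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def looks_coding_by_orf_length(seq, min_codons=100):
    """Naive rule (assumed cutoff): call a transcript coding only if its
    longest ORF has at least `min_codons` codons."""
    return longest_orf_codons(seq) >= min_codons


# Invented transcript carrying an 11-codon ORF (ATG + 10 codons, then a stop).
small_orf_transcript = "CC" + "ATG" + "GCC" * 10 + "TAA" + "GGGG"
print(longest_orf_codons(small_orf_transcript))
print(looks_coding_by_orf_length(small_orf_transcript))
```

A rule like this would label any transcript whose only ORF is short as non-coding, which is the kind of case the micropeptide point above is about. Real tools combine ORF features with sequence composition and other evidence, so this is only meant to show the failure mode.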
Thanks for the insight. Coming from a CS background, I am working on a neural network approach to lncRNA detection. Ideally, I would like to find co-authors for a paper in this area, but that is probably a topic for another post.