Mykrobe predictor Supplementary Training/Validation Dataset with empty cells
7.6 years ago
Penny Liu ▴ 30

Why are so many drugs listed with no phenotype (empty cells), and why are even the Sequence Type (ST) and Clonal Complex (CC) missing for some samples?

Overview of data sets used for training/validation of species-identification and resistance prediction. https://docs.google.com/spreadsheets/d/1BqOyIPfKWXZtUcvYpm6ZvKaDKb8Vbmkjh3hO4Bxf0_Y/edit?usp=sharing

Supplementary information (Text files) was pulled down from the following link. http://www.nature.com/ncomms/2015/151221/ncomms10063/full/ncomms10063.html#/supplementary-information

phenotype drug resistant variant missing-data • 1.5k views

Your question lacks some context, but the article or its supplement should explain why data are missing. More generally, missing data are very common in data analysis, machine learning, and biological experiments. Usually the measurement simply was not obtained: because of measurement error, because there was not enough sample material, or because the outcome was unclear. Analyses and predictions just have to deal with this.

Sensitive/Resistant should be easy to measure, but maybe the assay didn't give definitive results for all combinations. You would have to dig deeper into the paper. Whatever you find, the explanation will concern the biological assay rather than the computational analysis, so there is not much we can help with here.

Edit: Sorry, the article is about a prediction method. You still need to read it completely to possibly find the answer to your question. However, the data set you are showing is possibly training data derived from bioassays, so my comment above is still valid.
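
To get a feel for how much is actually missing before reading the paper, one option is to tally the empty cells per column. The following is only a minimal sketch: it assumes the spreadsheet has been exported to CSV, and the column names (`ST`, `CC`, `penicillin`, `methicillin`) and the toy rows are hypothetical placeholders, not values from the paper's supplementary data.

```python
import csv
import io


def missing_fraction(rows, columns):
    """Return {column: fraction of rows whose cell is empty or whitespace}."""
    rows = list(rows)
    counts = {c: 0 for c in columns}
    for row in rows:
        for c in columns:
            # Treat absent keys, None, and blank strings as missing.
            if not (row.get(c) or "").strip():
                counts[c] += 1
    n = len(rows)
    return {c: (counts[c] / n if n else 0.0) for c in columns}


# Toy table for illustration only; NOT data from the paper.
# Column names are hypothetical.
TOY_CSV = """sample,ST,CC,penicillin,methicillin
s1,22,CC22,R,S
s2,,,R,
s3,8,CC8,,
s4,,,S,R
"""

reader = csv.DictReader(io.StringIO(TOY_CSV))
fractions = missing_fraction(reader, ["ST", "CC", "penicillin", "methicillin"])
for col, frac in fractions.items():
    print(f"{col}: {frac:.0%} missing")
```

For a real export, the `io.StringIO(TOY_CSV)` argument would be replaced by an opened file handle. Columns that are mostly empty are the ones to check against the paper's methods section.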


Thanks so much for your detailed answer. I'll take an in-depth look at which specific features are in the training and test sets.
