20 months ago
Sneha
Hi!
I have carried out SMILES enumeration as data augmentation for an ML model. I started with 300 SMILES and augmented them 10-fold, giving 3000 SMILES (reference: Bjerrum, Esben Jannik. "SMILES enumeration as data augmentation for neural network modeling of molecules." arXiv preprint arXiv:1703.07076 (2017)). I want to use the augmented data to train a model that predicts IC50 values. For each enumerated SMILES, should I use the IC50 value of its parent SMILES as the label? Kindly guide me. Thanks in advance!
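To make the setup concrete, here is a minimal sketch of the labelling I am asking about, on the assumption that every enumerated SMILES simply inherits its parent's IC50. The molecules, the IC50 numbers, and the helper name `build_augmented_rows` are all hypothetical and only for illustration. The enumerated strings are written out by hand here rather than generated by a toolkit.

```python
# Hypothetical toy data: parent SMILES -> IC50 in nM (values are made up).
parent_ic50 = {
    "CCO": 1200.0,
    "Cc1ccccc1": 85.0,
}

# Hypothetical enumerated variants: the same molecule written with a
# different atom ordering (hand-written here, not produced by a toolkit).
enumerated = {
    "CCO": ["CCO", "OCC", "C(O)C"],
    "Cc1ccccc1": ["Cc1ccccc1", "c1ccccc1C", "c1ccc(C)cc1"],
}


def build_augmented_rows(parent_ic50, enumerated):
    """Give each enumerated SMILES the IC50 of its parent (assumed labelling)."""
    rows = []
    for parent, variants in enumerated.items():
        for smiles in variants:
            rows.append(
                {
                    "parent": parent,  # kept so variants can be traced back
                    "smiles": smiles,
                    "ic50": parent_ic50[parent],
                }
            )
    return rows


rows = build_augmented_rows(parent_ic50, enumerated)
for row in rows:
    print(row["parent"], row["smiles"], row["ic50"])
```

In my real data this would turn the 300 parent rows into 3000 rows, with each group of 10 sharing one IC50 value.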