How to split a custom dataset into train, dev and test sets for fine-tuning DNABERT
13 months ago

I have a dataset that consists only of 6-mers. I want to assign labels to it and then split it into train, dev and test sets for the fine-tuning step of the DNABERT pipeline. Can anyone tell me how to do this? An explanation of the logic alone would also help. Thanks!

The k-mers file looks like this: TTTTCT TGTTTT ATTGCC ACTAGT CTCTAG TCAGTG TGTTAA TCTTAT AACCAG AACTCA ATCATA CACTAA TTCTTT CACACG TGGTGT TTATTA CCCTGA CAAAGT TTTCAG ATCCTC AGTTTT ACATTC AACTCA GGACTT GTTCTT ACCTTT CTTTTC CAATGT TACTTG
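To make the splitting logic concrete, here is a minimal sketch in Python using only the standard library. It assumes the labels already exist as (sequence, label) pairs; how they are assigned depends on the experiment and is not shown. The function names `stratified_split` and `write_tsv`, the 80/10/10 proportions, and the two-column `sequence`/`label` TSV layout are assumptions for illustration, so the layout should be checked against the sample data shipped with the DNABERT version in use.

```python
import csv
import random
from collections import defaultdict


def stratified_split(records, dev_frac=0.1, test_frac=0.1, seed=42):
    """Split (sequence, label) pairs into train/dev/test lists.

    The split is done separately for each label so that every subset
    keeps roughly the same label proportions as the full dataset.
    """
    by_label = defaultdict(list)
    for seq, label in records:
        by_label[label].append((seq, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, dev, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_dev = int(len(items) * dev_frac)
        n_test = int(len(items) * test_frac)
        dev.extend(items[:n_dev])
        test.extend(items[n_dev:n_dev + n_test])
        train.extend(items[n_dev + n_test:])

    # Shuffle again so examples are not grouped by label within a subset.
    for subset in (train, dev, test):
        rng.shuffle(subset)
    return train, dev, test


def write_tsv(path, rows):
    """Write rows as a tab-separated file with an assumed header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["sequence", "label"])  # assumed column names
        writer.writerows(rows)
```

A typical use would be to call `stratified_split(records)` once and then `write_tsv("train.tsv", train)`, `write_tsv("dev.tsv", dev)` and `write_tsv("test.tsv", test)`. Note that the example file contains repeated k-mers (AACTCA appears twice); removing duplicates before splitting avoids the same sequence ending up in both the training and evaluation sets.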

NLP • 442 views