Question: How to prepare genomic data (GenBank, FASTA) for machine learning
17 months ago by wrote:

I want to prepare a genomic data set for neural networks and other machine learning methods. I have a set of genes and their sequences. How do I prepare a data set from them, and how do I annotate the various regions of each sequence for that purpose? With numeric tabular data, where each column is an attribute, building a data set for a neural network is easy, but I have no idea how to do this for genomic data.

R rna-seq gene genome • 614 views
modified 15 months ago by Biostar ♦♦ 20 • written 17 months ago by

Your question is far too broad. At a minimum, you should know:

  1. What do you want to predict?
  2. Which features are related to that prediction?
  3. How can you quantify these features?

The answers depend entirely on your specific case.
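To illustrate point 3, here is a minimal sketch of two common ways to turn a DNA sequence into numbers: k-mer counts (a fixed-length vector regardless of sequence length) and one-hot encoding (one row per base). This is only an assumption about what might suit your problem, not a recommendation for a particular task. The function names `read_fasta`, `kmer_counts` and `one_hot` are hypothetical helpers written for this example, and the parser handles only plain FASTA text with the alphabet A, C, G, T.

```python
from collections import Counter
from itertools import product


def read_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dict."""
    records, current = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the whole header line (minus '>') as the record key.
            current = line[1:].strip()
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def kmer_counts(seq, k=2):
    """Return counts of every possible k-mer over ACGT, in a fixed order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing other symbols (e.g. N) are simply ignored.
    return [counts[kmer] for kmer in kmers]


def one_hot(seq):
    """Encode each base as a 4-element row (A, C, G, T); unknowns are all zeros."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq:
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    fasta_text = ">gene1\nACGT\nAC\n>gene2\nGGGG\n"
    genes = read_fasta(fasta_text.splitlines())
    for name, seq in genes.items():
        print(name, kmer_counts(seq, k=2))
```

The k-mer vectors can go straight into a table with one row per gene, like any other numeric data set. One-hot matrices vary in length with the sequence, so they usually need padding or fixed-size windows before a neural network can use them.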

written 17 months ago by shoujun.gu 370

Please use ADD COMMENT or ADD REPLY to ask additional questions, so that this thread remains logically structured and easy to follow. I have now moved your post, but as you can see the result is not optimal. Adding an answer should only be used to provide a solution to the question asked.

modified 17 months ago • written 17 months ago by WouterDeCoster 38k